author Martin Liska <mliska@suse.cz> 2022-11-07 13:23:41 +0100
committer Martin Liska <mliska@suse.cz> 2022-11-09 09:00:35 +0100
commit 54ca4eef58661a7d7a511e2bbbe309bde1732abf (patch)
tree 4f9067b036a4e7c08d0d483246cb5ab5a0d60d41 /libgomp/libgomp.texi
parent 564a805f9f08b4346a854ab8dca2e5b561a7a28e (diff)
sphinx: remove texinfo files
gcc/d/ChangeLog: * gdc.texi: Removed. gcc/ChangeLog: * doc/analyzer.texi: Removed. * doc/avr-mmcu.texi: Removed. * doc/bugreport.texi: Removed. * doc/cfg.texi: Removed. * doc/collect2.texi: Removed. * doc/compat.texi: Removed. * doc/configfiles.texi: Removed. * doc/configterms.texi: Removed. * doc/contrib.texi: Removed. * doc/contribute.texi: Removed. * doc/cpp.texi: Removed. * doc/cppdiropts.texi: Removed. * doc/cppenv.texi: Removed. * doc/cppinternals.texi: Removed. * doc/cppopts.texi: Removed. * doc/cppwarnopts.texi: Removed. * doc/extend.texi: Removed. * doc/fragments.texi: Removed. * doc/frontends.texi: Removed. * doc/gcc.texi: Removed. * doc/gccint.texi: Removed. * doc/gcov-dump.texi: Removed. * doc/gcov-tool.texi: Removed. * doc/gcov.texi: Removed. * doc/generic.texi: Removed. * doc/gimple.texi: Removed. * doc/gnu.texi: Removed. * doc/gty.texi: Removed. * doc/headerdirs.texi: Removed. * doc/hostconfig.texi: Removed. * doc/implement-c.texi: Removed. * doc/implement-cxx.texi: Removed. * doc/include/fdl.texi: Removed. * doc/include/funding.texi: Removed. * doc/include/gcc-common.texi: Removed. * doc/include/gpl_v3.texi: Removed. * doc/install.texi: Removed. * doc/interface.texi: Removed. * doc/invoke.texi: Removed. * doc/languages.texi: Removed. * doc/libgcc.texi: Removed. * doc/loop.texi: Removed. * doc/lto-dump.texi: Removed. * doc/lto.texi: Removed. * doc/makefile.texi: Removed. * doc/match-and-simplify.texi: Removed. * doc/md.texi: Removed. * doc/objc.texi: Removed. * doc/optinfo.texi: Removed. * doc/options.texi: Removed. * doc/passes.texi: Removed. * doc/plugins.texi: Removed. * doc/poly-int.texi: Removed. * doc/portability.texi: Removed. * doc/rtl.texi: Removed. * doc/service.texi: Removed. * doc/sourcebuild.texi: Removed. * doc/standards.texi: Removed. * doc/tm.texi: Removed. * doc/tree-ssa.texi: Removed. * doc/trouble.texi: Removed. * doc/ux.texi: Removed. * doc/tm.texi.in: Removed. gcc/fortran/ChangeLog: * gfc-internals.texi: Removed. 
* gfortran.texi: Removed. * intrinsic.texi: Removed. * invoke.texi: Removed. gcc/go/ChangeLog: * gccgo.texi: Removed. libgomp/ChangeLog: * libgomp.texi: Removed. libiberty/ChangeLog: * at-file.texi: Removed. * copying-lib.texi: Removed. * functions.texi: Removed. * libiberty.texi: Removed. * obstacks.texi: Removed. libitm/ChangeLog: * libitm.texi: Removed. libquadmath/ChangeLog: * libquadmath.texi: Removed.
Diffstat (limited to 'libgomp/libgomp.texi')
-rw-r--r-- libgomp/libgomp.texi 4884
1 files changed, 0 insertions, 4884 deletions
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
deleted file mode 100644
index 10fefa97922..00000000000
--- a/libgomp/libgomp.texi
+++ /dev/null
@@ -1,4884 +0,0 @@
-\input texinfo @c -*-texinfo-*-
-
-@c %**start of header
-@setfilename libgomp.info
-@settitle GNU libgomp
-@c %**end of header
-
-
-@copying
-Copyright @copyright{} 2006-2022 Free Software Foundation, Inc.
-
-Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.3 or
-any later version published by the Free Software Foundation; with the
-Invariant Sections being ``Funding Free Software'', the Front-Cover
-texts being (a) (see below), and with the Back-Cover Texts being (b)
-(see below). A copy of the license is included in the section entitled
-``GNU Free Documentation License''.
-
-(a) The FSF's Front-Cover Text is:
-
- A GNU Manual
-
-(b) The FSF's Back-Cover Text is:
-
- You have freedom to copy and modify this GNU Manual, like GNU
- software. Copies published by the Free Software Foundation raise
- funds for GNU development.
-@end copying
-
-@ifinfo
-@dircategory GNU Libraries
-@direntry
-* libgomp: (libgomp). GNU Offloading and Multi Processing Runtime Library.
-@end direntry
-
-This manual documents libgomp, the GNU Offloading and Multi Processing
-Runtime library. This is the GNU implementation of the OpenMP and
-OpenACC APIs for parallel and accelerator programming in C/C++ and
-Fortran.
-
-Published by the Free Software Foundation
-51 Franklin Street, Fifth Floor
-Boston, MA 02110-1301 USA
-
-@insertcopying
-@end ifinfo
-
-
-@setchapternewpage odd
-
-@titlepage
-@title GNU Offloading and Multi Processing Runtime Library
-@subtitle The GNU OpenMP and OpenACC Implementation
-@page
-@vskip 0pt plus 1filll
-@comment For the @value{version-GCC} Version*
-@sp 1
-Published by the Free Software Foundation @*
-51 Franklin Street, Fifth Floor@*
-Boston, MA 02110-1301, USA@*
-@sp 1
-@insertcopying
-@end titlepage
-
-@summarycontents
-@contents
-@page
-
-
-@node Top, Enabling OpenMP
-@top Introduction
-@cindex Introduction
-
-This manual documents the usage of libgomp, the GNU Offloading and
-Multi Processing Runtime Library. This includes the GNU
-implementation of the @uref{https://www.openmp.org, OpenMP} Application
-Programming Interface (API) for multi-platform shared-memory parallel
-programming in C/C++ and Fortran, and the GNU implementation of the
-@uref{https://www.openacc.org, OpenACC} Application Programming
-Interface (API) for offloading of code to accelerator devices in C/C++
-and Fortran.
-
-Originally, libgomp implemented the GNU OpenMP Runtime Library. Based
-on this, support for OpenACC and offloading (both OpenACC and OpenMP
-4's target construct) was added later on, and the library's name was
-changed to GNU Offloading and Multi Processing Runtime Library.
-
-
-
-@comment
-@comment When you add a new menu item, please keep the right hand
-@comment aligned to the same column. Do not use tabs. This provides
-@comment better formatting.
-@comment
-@menu
-* Enabling OpenMP:: How to enable OpenMP for your applications.
-* OpenMP Implementation Status:: List of implemented features by OpenMP version
-* OpenMP Runtime Library Routines: Runtime Library Routines.
- The OpenMP runtime application programming
- interface.
-* OpenMP Environment Variables: Environment Variables.
- Influencing OpenMP runtime behavior with
- environment variables.
-* Enabling OpenACC:: How to enable OpenACC for your
- applications.
-* OpenACC Runtime Library Routines:: The OpenACC runtime application
- programming interface.
-* OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
- environment variables.
-* CUDA Streams Usage:: Notes on the implementation of
- asynchronous operations.
-* OpenACC Library Interoperability:: OpenACC library interoperability with the
- NVIDIA CUBLAS library.
-* OpenACC Profiling Interface::
-* OpenMP-Implementation Specifics:: Notes on specifics of this OpenMP
- implementation
-* Offload-Target Specifics:: Notes on offload-target specific internals
-* The libgomp ABI:: Notes on the external ABI presented by libgomp.
-* Reporting Bugs:: How to report bugs in the GNU Offloading and
- Multi Processing Runtime Library.
-* Copying:: GNU general public license says
- how you can copy and share libgomp.
-* GNU Free Documentation License::
- How you can copy and share this manual.
-* Funding:: How to help assure continued work for free
- software.
-* Library Index:: Index of this documentation.
-@end menu
-
-
-@c ---------------------------------------------------------------------
-@c Enabling OpenMP
-@c ---------------------------------------------------------------------
-
-@node Enabling OpenMP
-@chapter Enabling OpenMP
-
-To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
-flag @command{-fopenmp} must be specified. This enables the OpenMP directive
-@code{#pragma omp} in C/C++ and @code{!$omp} directives in free form,
-@code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form,
-@code{!$} conditional compilation sentinels in free form and @code{c$},
-@code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
-arranges for automatic linking of the OpenMP runtime library
-(@ref{Runtime Library Routines}).
-
-A complete description of all OpenMP directives may be found in the
-@uref{https://www.openmp.org, OpenMP Application Program Interface} manuals.
-See also @ref{OpenMP Implementation Status}.
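As a minimal sketch of what the flag changes, the hypothetical function `parallel_sum` below (it is not part of the manual) sums the integers from 1 to `n`. Built with `gcc -fopenmp`, the loop iterations are shared among threads. Built without the flag, the pragma is ignored and the same code should run serially with the same result.

```c
/* Hypothetical example, not taken from the libgomp manual.
   Build with: gcc -fopenmp example.c  */
long parallel_sum(int n)
{
    long total = 0;

    /* With -fopenmp each thread sums a share of the iterations and the
       partial sums are combined; without it the pragma is ignored.  */
    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += i;

    return total;
}
```

Calling `parallel_sum(100)` from `main` gives the same sum in both builds; only the way the work is distributed differs.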
-
-
-@c ---------------------------------------------------------------------
-@c OpenMP Implementation Status
-@c ---------------------------------------------------------------------
-
-@node OpenMP Implementation Status
-@chapter OpenMP Implementation Status
-
-@menu
-* OpenMP 4.5:: Feature completion status to 4.5 specification
-* OpenMP 5.0:: Feature completion status to 5.0 specification
-* OpenMP 5.1:: Feature completion status to 5.1 specification
-* OpenMP 5.2:: Feature completion status to 5.2 specification
-@end menu
-
-The @code{_OPENMP} preprocessor macro and Fortran's @code{openmp_version}
-parameter, provided by @code{omp_lib.h} and the @code{omp_lib} module, have
-the value @code{201511} (i.e. OpenMP 4.5).
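The macro can also be used to detect at compile time whether OpenMP is enabled. The sketch below uses a hypothetical helper name, `openmp_spec_date`, and returns the macro's value (a year and month in `yyyymm` form) or 0 when the file was compiled without OpenMP support.

```c
/* Hypothetical helper, not part of libgomp.
   Returns the value of _OPENMP, or 0 when the translation unit was
   compiled without OpenMP support (for example, without -fopenmp).  */
long openmp_spec_date(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}
```

With the version of GCC described here and `-fopenmp`, the function would be expected to return 201511; other compilers or releases may report a different date.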
-
-@node OpenMP 4.5
-@section OpenMP 4.5
-
-The OpenMP 4.5 specification is fully supported.
-
-@node OpenMP 5.0
-@section OpenMP 5.0
-
-@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
-@c This list is sorted as in OpenMP 5.1's B.3 not as in OpenMP 5.0's B.2
-
-@multitable @columnfractions .60 .10 .25
-@headitem Description @tab Status @tab Comments
-@item Array shaping @tab N @tab
-@item Array sections with non-unit strides in C and C++ @tab N @tab
-@item Iterators @tab Y @tab
-@item @code{metadirective} directive @tab N @tab
-@item @code{declare variant} directive
- @tab P @tab @emph{simd} traits not handled correctly
-@item @emph{target-offload-var} ICV and @code{OMP_TARGET_OFFLOAD}
- env variable @tab Y @tab
-@item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
-@item @code{requires} directive @tab P
- @tab complete but no non-host device provides @code{unified_address},
- @code{unified_shared_memory} or @code{reverse_offload}
-@item @code{teams} construct outside an enclosing target region @tab Y @tab
-@item Non-rectangular loop nests @tab Y @tab
-@item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
-@item @code{nonmonotonic} as default loop schedule modifier for worksharing-loop
- constructs @tab Y @tab
-@item Collapse of associated loops that are imperfectly nested loops @tab N @tab
-@item Clauses @code{if}, @code{nontemporal} and @code{order(concurrent)} in
- @code{simd} construct @tab Y @tab
-@item @code{atomic} constructs in @code{simd} @tab Y @tab
-@item @code{loop} construct @tab Y @tab
-@item @code{order(concurrent)} clause @tab Y @tab
-@item @code{scan} directive and @code{in_scan} modifier for the
- @code{reduction} clause @tab Y @tab
-@item @code{in_reduction} clause on @code{task} constructs @tab Y @tab
-@item @code{in_reduction} clause on @code{target} constructs @tab P
- @tab @code{nowait} only stub
-@item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab
-@item @code{task} modifier to @code{reduction} clause @tab Y @tab
-@item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only
-@item @code{detach} clause to @code{task} construct @tab Y @tab
-@item @code{omp_fulfill_event} runtime routine @tab Y @tab
-@item @code{reduction} and @code{in_reduction} clauses on @code{taskloop}
- and @code{taskloop simd} constructs @tab Y @tab
-@item @code{taskloop} construct cancelable by @code{cancel} construct
- @tab Y @tab
-@item @code{mutexinoutset} @emph{dependence-type} for @code{depend} clause
- @tab Y @tab
-@item Predefined memory spaces, memory allocators, allocator traits
- @tab Y @tab Some are only stubs
-@item Memory management routines @tab Y @tab
-@item @code{allocate} directive @tab N @tab
-@item @code{allocate} clause @tab P @tab Initial support
-@item @code{use_device_addr} clause on @code{target data} @tab Y @tab
-@item @code{ancestor} modifier on @code{device} clause
- @tab Y @tab See comment for @code{requires}
-@item Implicit declare target directive @tab Y @tab
-@item Discontiguous array section with @code{target update} construct
- @tab N @tab
-@item C/C++'s lvalue expressions in @code{to}, @code{from}
- and @code{map} clauses @tab N @tab
-@item C/C++'s lvalue expressions in @code{depend} clauses @tab Y @tab
-@item Nested @code{declare target} directive @tab Y @tab
-@item Combined @code{master} constructs @tab Y @tab
-@item @code{depend} clause on @code{taskwait} @tab Y @tab
-@item Weak memory ordering clauses on @code{atomic} and @code{flush} construct
- @tab Y @tab
-@item @code{hint} clause on the @code{atomic} construct @tab Y @tab Stub only
-@item @code{depobj} construct and depend objects @tab Y @tab
-@item Lock hints were renamed to synchronization hints @tab Y @tab
-@item @code{conditional} modifier to @code{lastprivate} clause @tab Y @tab
-@item Map-order clarifications @tab P @tab
-@item @code{close} @emph{map-type-modifier} @tab Y @tab
-@item Mapping C/C++ pointer variables and to assign the address of
- device memory mapped by an array section @tab P @tab
-@item Mapping of Fortran pointer and allocatable variables, including pointer
- and allocatable components of variables
- @tab P @tab Mapping of vars with allocatable components unsupported
-@item @code{defaultmap} extensions @tab Y @tab
-@item @code{declare mapper} directive @tab N @tab
-@item @code{omp_get_supported_active_levels} routine @tab Y @tab
-@item Runtime routines and environment variables to display runtime thread
- affinity information @tab Y @tab
-@item @code{omp_pause_resource} and @code{omp_pause_resource_all} runtime
- routines @tab Y @tab
-@item @code{omp_get_device_num} runtime routine @tab Y @tab
-@item OMPT interface @tab N @tab
-@item OMPD interface @tab N @tab
-@end multitable
-
-@unnumberedsubsec Other new OpenMP 5.0 features
-
-@multitable @columnfractions .60 .10 .25
-@headitem Description @tab Status @tab Comments
-@item Supporting C++'s range-based for loop @tab Y @tab
-@end multitable
-
-
-@node OpenMP 5.1
-@section OpenMP 5.1
-
-@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
-
-@multitable @columnfractions .60 .10 .25
-@headitem Description @tab Status @tab Comments
-@item OpenMP directives as C++ attribute specifiers @tab Y @tab
-@item @code{omp_all_memory} reserved locator @tab Y @tab
-@item @emph{target_device trait} in OpenMP Context @tab N @tab
-@item @code{target_device} selector set in context selectors @tab N @tab
-@item C/C++'s @code{declare variant} directive: elision support of
- preprocessed code @tab N @tab
-@item @code{declare variant}: new clauses @code{adjust_args} and
- @code{append_args} @tab N @tab
-@item @code{dispatch} construct @tab N @tab
-@item device-specific ICV settings with environment variables @tab Y @tab
-@item @code{assume} directive @tab Y @tab
-@item @code{nothing} directive @tab Y @tab
-@item @code{error} directive @tab Y @tab
-@item @code{masked} construct @tab Y @tab
-@item @code{scope} directive @tab Y @tab
-@item Loop transformation constructs @tab N @tab
-@item @code{strict} modifier in the @code{grainsize} and @code{num_tasks}
- clauses of the @code{taskloop} construct @tab Y @tab
-@item @code{align} clause/modifier in @code{allocate} directive/clause
- and @code{allocator} directive @tab P @tab C/C++ on clause only
-@item @code{thread_limit} clause to @code{target} construct @tab Y @tab
-@item @code{has_device_addr} clause to @code{target} construct @tab Y @tab
-@item Iterators in @code{target update} motion clauses and @code{map}
- clauses @tab N @tab
-@item Indirect calls to the device version of a procedure or function in
- @code{target} regions @tab N @tab
-@item @code{interop} directive @tab N @tab
-@item @code{omp_interop_t} object support in runtime routines @tab N @tab
-@item @code{nowait} clause in @code{taskwait} directive @tab Y @tab
-@item Extensions to the @code{atomic} directive @tab Y @tab
-@item @code{seq_cst} clause on a @code{flush} construct @tab Y @tab
-@item @code{inoutset} argument to the @code{depend} clause @tab Y @tab
-@item @code{private} and @code{firstprivate} argument to @code{default}
- clause in C and C++ @tab Y @tab
-@item @code{present} argument to @code{defaultmap} clause @tab N @tab
-@item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
- @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
- routines @tab Y @tab
-@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
-@item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
- runtime routines @tab Y @tab
-@item @code{omp_get_mapped_ptr} runtime routine @tab Y @tab
-@item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
- @code{omp_aligned_calloc} runtime routines @tab Y @tab
-@item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
- @code{omp_atv_default} changed @tab Y @tab
-@item @code{omp_display_env} runtime routine @tab Y @tab
-@item @code{ompt_scope_endpoint_t} enum: @code{ompt_scope_beginend} @tab N @tab
-@item @code{ompt_sync_region_t} enum additions @tab N @tab
-@item @code{ompt_state_t} enum: @code{ompt_state_wait_barrier_implementation}
- and @code{ompt_state_wait_barrier_teams} @tab N @tab
-@item @code{ompt_callback_target_data_op_emi_t},
- @code{ompt_callback_target_emi_t}, @code{ompt_callback_target_map_emi_t}
- and @code{ompt_callback_target_submit_emi_t} @tab N @tab
-@item @code{ompt_callback_error_t} type @tab N @tab
-@item @code{OMP_PLACES} syntax extensions @tab Y @tab
-@item @code{OMP_NUM_TEAMS} and @code{OMP_TEAMS_THREAD_LIMIT} environment
- variables @tab Y @tab
-@end multitable
-
-@unnumberedsubsec Other new OpenMP 5.1 features
-
-@multitable @columnfractions .60 .10 .25
-@headitem Description @tab Status @tab Comments
-@item Support of strictly structured blocks in Fortran @tab Y @tab
-@item Support of structured block sequences in C/C++ @tab Y @tab
-@item @code{unconstrained} and @code{reproducible} modifiers on @code{order}
- clause @tab Y @tab
-@item Support @code{begin/end declare target} syntax in C/C++ @tab Y @tab
-@item Pointer predetermined firstprivate getting initialized
-to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
-@item For Fortran, diagnose placing declarative before/between @code{USE},
- @code{IMPORT}, and @code{IMPLICIT} as invalid @tab N @tab
-@end multitable
-
-
-@node OpenMP 5.2
-@section OpenMP 5.2
-
-@unnumberedsubsec New features listed in Appendix B of the OpenMP specification
-
-@multitable @columnfractions .60 .10 .25
-@headitem Description @tab Status @tab Comments
-@item @code{omp_in_explicit_task} routine and @emph{explicit-task-var} ICV
- @tab Y @tab
-@item @code{omp}/@code{ompx}/@code{omx} sentinels and @code{omp_}/@code{ompx_}
- namespaces @tab N/A
- @tab warning for @code{ompx/omx} sentinels@footnote{The @code{ompx}
- sentinel as C/C++ pragma and C++ attributes are warned for with
- @code{-Wunknown-pragmas} (implied by @code{-Wall}) and @code{-Wattributes}
- (enabled by default), respectively; for Fortran free-source code, there is
- a warning enabled by default and, for fixed-source code, the @code{omx}
- sentinel is warned for with @code{-Wsurprising} (enabled by
- @code{-Wall}). Unknown clauses are always rejected with an error.}
-@item Clauses on @code{end} directive can be on directive @tab N @tab
-@item Deprecation of no-argument @code{destroy} clause on @code{depobj}
- @tab N @tab
-@item @code{linear} clause syntax changes and @code{step} modifier @tab Y @tab
-@item Deprecation of minus operator for reductions @tab N @tab
-@item Deprecation of separating @code{map} modifiers without comma @tab N @tab
-@item @code{declare mapper} with iterator and @code{present} modifiers
- @tab N @tab
-@item If a matching mapped list item is not found in the data environment, the
- pointer retains its original value @tab N @tab
-@item New @code{enter} clause as alias for @code{to} on declare target directive
- @tab Y @tab
-@item Deprecation of @code{to} clause on declare target directive @tab N @tab
-@item Extended list of directives permitted in Fortran pure procedures
- @tab N @tab
-@item New @code{allocators} directive for Fortran @tab N @tab
-@item Deprecation of @code{allocate} directive for Fortran
- allocatables/pointers @tab N @tab
-@item Optional paired @code{end} directive with @code{dispatch} @tab N @tab
-@item New @code{memspace} and @code{traits} modifiers for @code{uses_allocators}
- @tab N @tab
-@item Deprecation of traits array following the allocator_handle expression in
- @code{uses_allocators} @tab N @tab
-@item New @code{otherwise} clause as alias for @code{default} on metadirectives
- @tab N @tab
-@item Deprecation of @code{default} clause on metadirectives @tab N @tab
-@item Deprecation of delimited form of @code{declare target} @tab N @tab
-@item Reproducible semantics changed for @code{order(concurrent)} @tab N @tab
-@item @code{allocate} and @code{firstprivate} clauses on @code{scope}
- @tab Y @tab
-@item @code{ompt_callback_work} @tab N @tab
-@item Default map-type for @code{map} clause in @code{target enter/exit data}
- @tab Y @tab
-@item New @code{doacross} clause as alias for @code{depend} with
- @code{source}/@code{sink} modifier @tab Y @tab
-@item Deprecation of @code{depend} with @code{source}/@code{sink} modifier
- @tab N @tab
-@item @code{omp_cur_iteration} keyword @tab Y @tab
-@end multitable
-
-@unnumberedsubsec Other new OpenMP 5.2 features
-
-@multitable @columnfractions .60 .10 .25
-@headitem Description @tab Status @tab Comments
-@item For Fortran, optional comma between directive and clause @tab N @tab
-@item Conforming device numbers and @code{omp_initial_device} and
- @code{omp_invalid_device} enum/PARAMETER @tab Y @tab
-@item Initial value of @emph{default-device-var} ICV with
- @code{OMP_TARGET_OFFLOAD=mandatory} @tab N @tab
-@item @emph{interop_types} in any position of the modifier list for the @code{init} clause
- of the @code{interop} construct @tab N @tab
-@end multitable
-
-
-@c ---------------------------------------------------------------------
-@c OpenMP Runtime Library Routines
-@c ---------------------------------------------------------------------
-
-@node Runtime Library Routines
-@chapter OpenMP Runtime Library Routines
-
-The runtime routines described here are defined by Section 3 of the OpenMP
-specification in version 4.5. The routines are grouped into the following
-parts:
-
-@menu
-Control threads, processors and the parallel environment. They have C
-linkage, and do not throw exceptions.
-
-* omp_get_active_level:: Number of active parallel regions
-* omp_get_ancestor_thread_num:: Ancestor thread ID
-* omp_get_cancellation:: Whether cancellation support is enabled
-* omp_get_default_device:: Get the default device for target regions
-* omp_get_device_num:: Get device that current thread is running on
-* omp_get_dynamic:: Dynamic teams setting
-* omp_get_initial_device:: Device number of host device
-* omp_get_level:: Number of parallel regions
-* omp_get_max_active_levels:: Current maximum number of active regions
-* omp_get_max_task_priority:: Maximum task priority value that can be set
-* omp_get_max_teams:: Maximum number of teams for teams region
-* omp_get_max_threads:: Maximum number of threads of parallel region
-* omp_get_nested:: Nested parallel regions
-* omp_get_num_devices:: Number of target devices
-* omp_get_num_procs:: Number of processors online
-* omp_get_num_teams:: Number of teams
-* omp_get_num_threads:: Size of the active team
-* omp_get_proc_bind:: Whether threads may be moved between CPUs
-* omp_get_schedule:: Obtain the runtime scheduling method
-* omp_get_supported_active_levels:: Maximum number of active regions supported
-* omp_get_team_num:: Get team number
-* omp_get_team_size:: Number of threads in a team
-* omp_get_teams_thread_limit:: Maximum number of threads imposed by teams
-* omp_get_thread_limit:: Maximum number of threads
-* omp_get_thread_num:: Current thread ID
-* omp_in_parallel:: Whether a parallel region is active
-* omp_in_final:: Whether in final or included task region
-* omp_is_initial_device:: Whether executing on the host device
-* omp_set_default_device:: Set the default device for target regions
-* omp_set_dynamic:: Enable/disable dynamic teams
-* omp_set_max_active_levels:: Limits the number of active parallel regions
-* omp_set_nested:: Enable/disable nested parallel regions
-* omp_set_num_teams:: Set upper teams limit for teams region
-* omp_set_num_threads:: Set upper team size limit
-* omp_set_schedule:: Set the runtime scheduling method
-* omp_set_teams_thread_limit:: Set upper thread limit for teams construct
-
-Initialize, set, test, unset and destroy simple and nested locks.
-
-* omp_init_lock:: Initialize simple lock
-* omp_set_lock:: Wait for and set simple lock
-* omp_test_lock:: Test and set simple lock if available
-* omp_unset_lock:: Unset simple lock
-* omp_destroy_lock:: Destroy simple lock
-* omp_init_nest_lock:: Initialize nested lock
-* omp_set_nest_lock:: Wait for and set nested lock
-* omp_test_nest_lock:: Test and set nested lock if available
-* omp_unset_nest_lock:: Unset nested lock
-* omp_destroy_nest_lock:: Destroy nested lock
-
-Portable, thread-based, wall clock timer.
-
-* omp_get_wtick:: Get timer precision.
-* omp_get_wtime:: Elapsed wall clock time.
-
-Support for event objects.
-
-* omp_fulfill_event:: Fulfill and destroy an OpenMP event.
-@end menu
-
-
-
-@node omp_get_active_level
-@section @code{omp_get_active_level} -- Number of active parallel regions
-@table @asis
-@item @emph{Description}:
-This function returns the nesting level of the active parallel regions
-that enclose the call.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.20.
-@end table
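The difference between the two nesting counts can be sketched with an inactive region. The example below assumes an `if(0)` clause to force a team of one thread; the function name `levels_in_inactive_region` and the serial stand-ins in the `#else` branch are inventions of this sketch, added only so that the file also builds without `-fopenmp`.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch) so that the file also
   builds without -fopenmp.  */
static int omp_get_level(void) { return 0; }
static int omp_get_active_level(void) { return 0; }
#endif

/* Stores the two nesting levels seen inside an inactive parallel region. */
void levels_in_inactive_region(int *level, int *active_level)
{
    /* if(0) forces a team of one thread, so the region is inactive. */
    #pragma omp parallel if(0)
    {
        *level = omp_get_level();
        *active_level = omp_get_active_level();
    }
}
```

When built with `-fopenmp` and called from sequential code, the region should count towards `omp_get_level` but not towards `omp_get_active_level`, because only regions executed by more than one thread are active.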
-
-
-
-@node omp_get_ancestor_thread_num
-@section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
-@table @asis
-@item @emph{Description}:
-This function returns the thread identification number for the given
-nesting level of the current thread. For values of @var{level} outside
-the range zero to @code{omp_get_level}, -1 is returned; if @var{level} is
-@code{omp_get_level}, the result is identical to @code{omp_get_thread_num}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
-@item @tab @code{integer level}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.18.
-@end table
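The rules in the description can be checked from inside a parallel region. This is a sketch only: `ancestor_rules_hold` is a hypothetical name, and the serial stand-ins in the `#else` branch are assumptions that mimic the documented behaviour for a build without `-fopenmp`.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch) for builds without
   -fopenmp: a single thread at nesting level 0.  */
static int omp_get_level(void) { return 0; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_ancestor_thread_num(int level) { return level == 0 ? 0 : -1; }
#endif

/* Returns 1 if every thread of a parallel region observes the documented
   behaviour, 0 otherwise.  */
int ancestor_rules_hold(void)
{
    int ok = 1;

    #pragma omp parallel reduction(&&:ok)
    {
        int level = omp_get_level();

        /* At the current level the ancestor is the calling thread itself. */
        ok = ok && omp_get_ancestor_thread_num(level) == omp_get_thread_num();
        /* Level 0 is the initial thread, whose number is 0. */
        ok = ok && omp_get_ancestor_thread_num(0) == 0;
        /* Levels outside 0..omp_get_level() yield -1. */
        ok = ok && omp_get_ancestor_thread_num(level + 1) == -1;
        ok = ok && omp_get_ancestor_thread_num(-1) == -1;
    }
    return ok;
}
```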
-
-
-
-@node omp_get_cancellation
-@section @code{omp_get_cancellation} -- Whether cancellation support is enabled
-@table @asis
-@item @emph{Description}:
-This function returns @code{true} if cancellation is activated, @code{false}
-otherwise. Here, @code{true} and @code{false} represent their language-specific
-counterparts. Unless @env{OMP_CANCELLATION} is set true, cancellations are
-deactivated.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_CANCELLATION}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.9.
-@end table
-
-
-
-@node omp_get_default_device
-@section @code{omp_get_default_device} -- Get the default device for target regions
-@table @asis
-@item @emph{Description}:
-Get the default device for target regions without a @code{device} clause.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.30.
-@end table
-
-
-
-@node omp_get_device_num
-@section @code{omp_get_device_num} -- Return device number of current device
-@table @asis
-@item @emph{Description}:
-This function returns a device number that represents the device that the
-current thread is executing on. For OpenMP 5.0, this must be equal to the
-value returned by the @code{omp_get_initial_device} function when called
-from the host.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_device_num(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_device_num()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_initial_device}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.37.
-@end table
-
-
-
-@node omp_get_dynamic
-@section @code{omp_get_dynamic} -- Dynamic teams setting
-@table @asis
-@item @emph{Description}:
-This function returns @code{true} if dynamic adjustment of the number of
-threads is enabled, @code{false} otherwise. Here, @code{true} and
-@code{false} represent their language-specific counterparts.
-
-The dynamic team setting may be initialized at startup by the
-@env{OMP_DYNAMIC} environment variable or at runtime using
-@code{omp_set_dynamic}. If undefined, dynamic adjustment is
-disabled by default.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.8.
-@end table
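A short sketch of how the getter pairs with `omp_set_dynamic`: the hypothetical helper `dynamic_after_set` changes the setting, reads it back, and restores the previous value. The `#else` branch holds serial stand-ins, an assumption of this sketch, for builds without `-fopenmp`.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch) for builds without
   -fopenmp.  */
static int dynamic_setting = 0;
static void omp_set_dynamic(int enable) { dynamic_setting = enable != 0; }
static int omp_get_dynamic(void) { return dynamic_setting; }
#endif

/* Sets the dynamic-adjustment flag, reads it back, then restores it. */
int dynamic_after_set(int enable)
{
    int saved = omp_get_dynamic();
    omp_set_dynamic(enable);
    int now = omp_get_dynamic();
    omp_set_dynamic(saved);
    return now != 0;
}
```

Saving and restoring the old value keeps the helper from changing the behaviour of later parallel regions in the calling code.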
-
-
-
-@node omp_get_initial_device
-@section @code{omp_get_initial_device} -- Return device number of initial device
-@table @asis
-@item @emph{Description}:
-This function returns a device number that represents the host device.
-For OpenMP 5.1, this must be equal to the value returned by the
-@code{omp_get_num_devices} function.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_initial_device(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_initial_device()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_num_devices}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.35.
-@end table
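As an illustration, the host device number can be printed next to the number of target devices. The helper name `report_host_device` is hypothetical, and the values in the `#else` branch are assumptions standing in for a build without `-fopenmp`, where only the host exists.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch): host only, no target
   devices.  */
static int omp_is_initial_device(void) { return 1; }
static int omp_get_initial_device(void) { return 0; }
static int omp_get_num_devices(void) { return 0; }
#endif

/* Prints the host device number next to the number of target devices and
   returns nonzero when the caller runs on the host.  */
int report_host_device(void)
{
    printf("initial device: %d, target devices: %d\n",
           omp_get_initial_device(), omp_get_num_devices());
    return omp_is_initial_device() != 0;
}
```

With a runtime that follows the OpenMP 5.1 rule quoted above, the two printed numbers should match; older runtimes may print a different host device number.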
-
-
-
-@node omp_get_level
-@section @code{omp_get_level} -- Obtain the current nesting level
-@table @asis
-@item @emph{Description}:
-This function returns the nesting level of the parallel regions
-that enclose the call.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_level(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_level()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_active_level}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.17.
-@end table
-
-
-
-@node omp_get_max_active_levels
-@section @code{omp_get_max_active_levels} -- Current maximum number of active regions
-@table @asis
-@item @emph{Description}:
-This function obtains the maximum allowed number of nested, active parallel regions.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_max_active_levels}, @ref{omp_get_active_level}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.16.
-@end table
-
-
-@node omp_get_max_task_priority
-@section @code{omp_get_max_task_priority} -- Maximum priority value that can be set for tasks
-@table @asis
-@item @emph{Description}:
-This function obtains the maximum allowed priority number for tasks.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_max_task_priority(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_max_task_priority()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
-@end table
-
-
-@node omp_get_max_teams
-@section @code{omp_get_max_teams} -- Maximum number of teams for a teams region
-@table @asis
-@item @emph{Description}:
-Return the maximum number of teams used for the teams region
-that does not use the clause @code{num_teams}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_max_teams(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_max_teams()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_num_teams}, @ref{omp_get_num_teams}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.4.
-@end table
-
-
-
-@node omp_get_max_threads
-@section @code{omp_get_max_threads} -- Maximum number of threads for a parallel region
-@table @asis
-@item @emph{Description}:
-Return the maximum number of threads used for the current parallel region
-that does not use the clause @code{num_threads}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.3.
-@end table
-
-
-
-@node omp_get_nested
-@section @code{omp_get_nested} -- Nested parallel regions
-@table @asis
-@item @emph{Description}:
-This function returns @code{true} if nested parallel regions are
-enabled, @code{false} otherwise. Here, @code{true} and @code{false}
-represent their language-specific counterparts.
-
-The state of nested parallel regions at startup depends on several
-environment variables. If @env{OMP_MAX_ACTIVE_LEVELS} is defined
-and set to a value greater than one, nested parallel regions are
-enabled. If it is not defined, the value of the @env{OMP_NESTED}
-environment variable is followed if defined. If neither is
-defined and either @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND}
-is defined with a list of more than one value, nested parallel
-regions are enabled. If none of these are defined, nested parallel
-regions are disabled by default.
-
-Nested parallel regions can be enabled or disabled at runtime using
-@code{omp_set_nested}, or by setting the maximum number of nested
-regions with @code{omp_set_max_active_levels} to one to disable, or
-above one to enable.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_get_nested()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_max_active_levels}, @ref{omp_set_nested},
-@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.11.
-@end table
-
-
-
-@node omp_get_num_devices
-@section @code{omp_get_num_devices} -- Number of target devices
-@table @asis
-@item @emph{Description}:
-Returns the number of target devices.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.31.
-@end table
-
-
-
-@node omp_get_num_procs
-@section @code{omp_get_num_procs} -- Number of processors online
-@table @asis
-@item @emph{Description}:
-Returns the number of processors online on the current device.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.5.
-@end table
-
-
-
-@node omp_get_num_teams
-@section @code{omp_get_num_teams} -- Number of teams
-@table @asis
-@item @emph{Description}:
-Returns the number of teams in the current teams region.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.32.
-@end table
-
-
-
-@node omp_get_num_threads
-@section @code{omp_get_num_threads} -- Size of the active team
-@table @asis
-@item @emph{Description}:
-Returns the number of threads in the current team. In a sequential section of
-the program @code{omp_get_num_threads} returns 1.
-
-The default team size may be initialized at startup by the
-@env{OMP_NUM_THREADS} environment variable. At runtime, the size
-of the current team may be set either by the @code{num_threads}
-clause or by @code{omp_set_num_threads}. If none of the above were
-used to define a specific value and @env{OMP_DYNAMIC} is disabled,
-one thread per CPU online is used.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.2.
-@end table
-
-
-
-@node omp_get_proc_bind
-@section @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
-@table @asis
-@item @emph{Description}:
-This function returns the currently active thread affinity policy, which is
-set via @env{OMP_PROC_BIND}. Possible values are @code{omp_proc_bind_false},
-@code{omp_proc_bind_true}, @code{omp_proc_bind_primary},
-@code{omp_proc_bind_master}, @code{omp_proc_bind_close} and @code{omp_proc_bind_spread},
-where @code{omp_proc_bind_master} is an alias for @code{omp_proc_bind_primary}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.22.
-@end table
-
-
-
-@node omp_get_schedule
-@section @code{omp_get_schedule} -- Obtain the runtime scheduling method
-@table @asis
-@item @emph{Description}:
-Obtain the runtime scheduling method. The @var{kind} argument will be
-set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
-@code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
-@var{chunk_size}, is set to the chunk size.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *chunk_size);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, chunk_size)}
-@item @tab @code{integer(kind=omp_sched_kind) kind}
-@item @tab @code{integer chunk_size}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.13.
-@end table
-
-
-@node omp_get_supported_active_levels
-@section @code{omp_get_supported_active_levels} -- Maximum number of active regions supported
-@table @asis
-@item @emph{Description}:
-This function returns the maximum number of nested, active parallel regions
-supported by this implementation.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_supported_active_levels(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_supported_active_levels()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.15.
-@end table
-
-
-
-@node omp_get_team_num
-@section @code{omp_get_team_num} -- Get team number
-@table @asis
-@item @emph{Description}:
-Returns the team number of the calling thread.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.33.
-@end table
-
-
-
-@node omp_get_team_size
-@section @code{omp_get_team_size} -- Number of threads in a team
-@table @asis
-@item @emph{Description}:
-This function returns the number of threads in the thread team to which
-either the current thread or its ancestor at nesting level @var{level}
-belongs. If @var{level} is outside the range zero to @code{omp_get_level},
--1 is returned; if @var{level} is zero, 1 is returned; and if @var{level}
-equals @code{omp_get_level}, the result is identical to that of
-@code{omp_get_num_threads}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
-@item @tab @code{integer level}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.19.
-@end table
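The three cases described for @code{omp_get_team_size} can be checked from inside a parallel region. The following is only a minimal sketch: the function name @code{team_sizes_consistent} is hypothetical, and the serial fallbacks in the @code{#else} branch are an assumption added so that the sketch also builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch, not part of libgomp)
   that mimic a program running with a single thread.  */
static int omp_get_level (void) { return 0; }
static int omp_get_num_threads (void) { return 1; }
static int omp_get_team_size (int level) { return level == 0 ? 1 : -1; }
#endif

/* Return 1 if omp_get_team_size behaves as documented for level zero,
   for the current level, and for a level that is out of range.  */
int
team_sizes_consistent (void)
{
  int ok = 1;
#pragma omp parallel
  {
    int level = omp_get_level ();
    if (omp_get_team_size (0) != 1
        || omp_get_team_size (level) != omp_get_num_threads ()
        || omp_get_team_size (level + 1) != -1)
      {
#pragma omp atomic write
        ok = 0;
      }
  }
  return ok;
}
```

With GCC, building with @option{-fopenmp} selects the real runtime routines; without it the pragmas are ignored and the fallbacks are used.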
-
-
-
-@node omp_get_teams_thread_limit
-@section @code{omp_get_teams_thread_limit} -- Maximum number of threads imposed by teams
-@table @asis
-@item @emph{Description}:
-Return the maximum number of threads that will be able to participate in
-each team created by a teams construct.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_teams_thread_limit(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_teams_thread_limit()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_teams_thread_limit}, @ref{OMP_TEAMS_THREAD_LIMIT}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.6.
-@end table
-
-
-
-@node omp_get_thread_limit
-@section @code{omp_get_thread_limit} -- Maximum number of threads
-@table @asis
-@item @emph{Description}:
-Return the maximum number of threads of the program.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.14.
-@end table
-
-
-
-@node omp_get_thread_num
-@section @code{omp_get_thread_num} -- Current thread ID
-@table @asis
-@item @emph{Description}:
-Returns a unique thread identification number within the current team.
-In sequential parts of the program, @code{omp_get_thread_num}
-always returns 0. In parallel regions the return value varies
-from 0 to @code{omp_get_num_threads}-1 inclusive. The return
-value of the primary thread of a team is always 0.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.4.
-@end table
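The range of values returned by @code{omp_get_thread_num} can be illustrated as follows. This is a sketch under stated assumptions: the function names @code{thread_ids_in_range} and @code{sequential_thread_id} are hypothetical, and the serial fallbacks exist only so that the code also builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch, not part of libgomp)
   that mimic a team consisting of one thread.  */
static int omp_get_thread_num (void) { return 0; }
static int omp_get_num_threads (void) { return 1; }
#endif

/* Return 1 if every thread ID observed in a parallel region lies
   between 0 and omp_get_num_threads () - 1 inclusive, 0 otherwise.  */
int
thread_ids_in_range (void)
{
  int ok = 1;
#pragma omp parallel
  {
    int id = omp_get_thread_num ();
    int n = omp_get_num_threads ();
    if (id < 0 || id >= n)
      {
#pragma omp atomic write
        ok = 0;
      }
  }
  return ok;
}

/* Return the thread ID of the caller; this is 0 when called from a
   sequential part of the program.  */
int
sequential_thread_id (void)
{
  return omp_get_thread_num ();
}
```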
-
-
-
-@node omp_in_parallel
-@section @code{omp_in_parallel} -- Whether a parallel region is active
-@table @asis
-@item @emph{Description}:
-This function returns @code{true} if currently running in parallel,
-@code{false} otherwise. Here, @code{true} and @code{false} represent
-their language-specific counterparts.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.6.
-@end table
-
-
-@node omp_in_final
-@section @code{omp_in_final} -- Whether in final or included task region
-@table @asis
-@item @emph{Description}:
-This function returns @code{true} if currently running in a final
-or included task region, @code{false} otherwise. Here, @code{true}
-and @code{false} represent their language-specific counterparts.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_in_final(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_in_final()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.21.
-@end table
-
-
-
-@node omp_is_initial_device
-@section @code{omp_is_initial_device} -- Whether executing on the host device
-@table @asis
-@item @emph{Description}:
-This function returns @code{true} if currently running on the host device,
-@code{false} otherwise. Here, @code{true} and @code{false} represent
-their language-specific counterparts.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.34.
-@end table
-
-
-
-@node omp_set_default_device
-@section @code{omp_set_default_device} -- Set the default device for target regions
-@table @asis
-@item @emph{Description}:
-Set the default device for target regions without a @code{device} clause.
-The argument shall be a nonnegative device number.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
-@item @tab @code{integer device_num}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.29.
-@end table
-
-
-
-@node omp_set_dynamic
-@section @code{omp_set_dynamic} -- Enable/disable dynamic teams
-@table @asis
-@item @emph{Description}:
-Enable or disable the dynamic adjustment of the number of threads
-within a team. The function takes the language-specific equivalent
-of @code{true} and @code{false}, where @code{true} enables dynamic
-adjustment of team sizes and @code{false} disables it.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
-@item @tab @code{logical, intent(in) :: dynamic_threads}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.7.
-@end table
-
-
-
-@node omp_set_max_active_levels
-@section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
-@table @asis
-@item @emph{Description}:
-This function limits the maximum allowed number of nested, active
-parallel regions. @var{max_levels} must be less than or equal to
-the value returned by @code{omp_get_supported_active_levels}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
-@item @tab @code{integer max_levels}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_max_active_levels}, @ref{omp_get_active_level},
-@ref{omp_get_supported_active_levels}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.15.
-@end table
-
-
-
-@node omp_set_nested
-@section @code{omp_set_nested} -- Enable/disable nested parallel regions
-@table @asis
-@item @emph{Description}:
-Enable or disable nested parallel regions, i.e., whether team members
-are allowed to create new teams. The function takes the language-specific
-equivalent of @code{true} and @code{false}, where @code{true} enables
-nested parallel regions and @code{false} disables them.
-
-Enabling nested parallel regions will also set the maximum number of
-active nested regions to the maximum supported. Disabling nested parallel
-regions will set the maximum number of active nested regions to one.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
-@item @tab @code{logical, intent(in) :: nested}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_nested}, @ref{omp_set_max_active_levels},
-@ref{OMP_MAX_ACTIVE_LEVELS}, @ref{OMP_NESTED}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.10.
-@end table
-
-
-
-@node omp_set_num_teams
-@section @code{omp_set_num_teams} -- Set upper teams limit for teams construct
-@table @asis
-@item @emph{Description}:
-Specifies the upper bound for the number of teams created by a teams
-construct that does not specify a @code{num_teams} clause. The
-argument of @code{omp_set_num_teams} shall be a positive integer.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_num_teams(int num_teams);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_num_teams(num_teams)}
-@item @tab @code{integer, intent(in) :: num_teams}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_NUM_TEAMS}, @ref{omp_get_num_teams}, @ref{omp_get_max_teams}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.3.
-@end table
-
-
-
-@node omp_set_num_threads
-@section @code{omp_set_num_threads} -- Set upper team size limit
-@table @asis
-@item @emph{Description}:
-Specifies the number of threads used by default in subsequent parallel
-regions, if those do not specify a @code{num_threads} clause. The
-argument of @code{omp_set_num_threads} shall be a positive integer.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
-@item @tab @code{integer, intent(in) :: num_threads}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.1.
-@end table
-
-
-
-@node omp_set_schedule
-@section @code{omp_set_schedule} -- Set the runtime scheduling method
-@table @asis
-@item @emph{Description}:
-Sets the runtime scheduling method. The @var{kind} argument can have the
-value @code{omp_sched_static}, @code{omp_sched_dynamic},
-@code{omp_sched_guided} or @code{omp_sched_auto}. Except for
-@code{omp_sched_auto}, the chunk size is set to the value of
-@var{chunk_size} if positive, or to the default value if zero or negative.
-For @code{omp_sched_auto} the @var{chunk_size} argument is ignored.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int chunk_size);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, chunk_size)}
-@item @tab @code{integer(kind=omp_sched_kind) kind}
-@item @tab @code{integer chunk_size}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_schedule}, @ref{OMP_SCHEDULE}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.2.12.
-@end table
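A value set with @code{omp_set_schedule} can be read back with @code{omp_get_schedule}. The sketch below illustrates that round trip. The function name @code{schedule_round_trip}, the chunk size of 4, and the serial fallbacks (including their enumerator values) are assumptions made for illustration only.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch, not part of libgomp)
   that simply remember the last schedule that was set.  */
typedef enum
{
  omp_sched_static = 1,
  omp_sched_dynamic = 2,
  omp_sched_guided = 3,
  omp_sched_auto = 4
} omp_sched_t;

static omp_sched_t stub_kind = omp_sched_static;
static int stub_chunk_size = 0;

static void
omp_set_schedule (omp_sched_t kind, int chunk_size)
{
  stub_kind = kind;
  stub_chunk_size = chunk_size;
}

static void
omp_get_schedule (omp_sched_t *kind, int *chunk_size)
{
  *kind = stub_kind;
  *chunk_size = stub_chunk_size;
}
#endif

/* Select dynamic scheduling with a chunk size of 4, then query the
   setting.  Return 1 if the queried values match what was set.  */
int
schedule_round_trip (void)
{
  omp_sched_t kind;
  int chunk_size;

  omp_set_schedule (omp_sched_dynamic, 4);
  omp_get_schedule (&kind, &chunk_size);
  return kind == omp_sched_dynamic && chunk_size == 4;
}
```

The schedule chosen this way applies to loops that use the @code{schedule(runtime)} clause.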
-
-
-
-@node omp_set_teams_thread_limit
-@section @code{omp_set_teams_thread_limit} -- Set upper thread limit for teams construct
-@table @asis
-@item @emph{Description}:
-Specifies the upper bound for the number of threads that will be available
-for each team created by a teams construct that does not specify a
-@code{thread_limit} clause. The argument of
-@code{omp_set_teams_thread_limit} shall be a positive integer.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_teams_thread_limit(int thread_limit);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_teams_thread_limit(thread_limit)}
-@item @tab @code{integer, intent(in) :: thread_limit}
-@end multitable
-
-@item @emph{See also}:
-@ref{OMP_TEAMS_THREAD_LIMIT}, @ref{omp_get_teams_thread_limit}, @ref{omp_get_thread_limit}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.4.5.
-@end table
-
-
-
-@node omp_init_lock
-@section @code{omp_init_lock} -- Initialize simple lock
-@table @asis
-@item @emph{Description}:
-Initialize a simple lock. After initialization, the lock is in
-an unlocked state.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
-@item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_destroy_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
-@end table
-
-
-
-@node omp_set_lock
-@section @code{omp_set_lock} -- Wait for and set simple lock
-@table @asis
-@item @emph{Description}:
-Before setting a simple lock, the lock variable must be initialized by
-@code{omp_init_lock}. The calling thread is blocked until the lock
-is available. If the lock is already held by the current thread,
-a deadlock occurs.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
-@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
-@end table
-
-
-
-@node omp_test_lock
-@section @code{omp_test_lock} -- Test and set simple lock if available
-@table @asis
-@item @emph{Description}:
-Before setting a simple lock, the lock variable must be initialized by
-@code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
-does not block if the lock is not available. This function returns
-@code{true} upon success, @code{false} otherwise. Here, @code{true} and
-@code{false} represent their language-specific counterparts.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
-@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
-@end table
-
-
-
-@node omp_unset_lock
-@section @code{omp_unset_lock} -- Unset simple lock
-@table @asis
-@item @emph{Description}:
-A simple lock about to be unset must have been locked by @code{omp_set_lock}
-or @code{omp_test_lock} before. In addition, the lock must be held by the
-thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
-or more threads attempted to set the lock before, one of them is chosen to
-acquire it.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
-@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_lock}, @ref{omp_test_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
-@end table
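The simple lock routines are typically used together: initialize, set and unset around a critical update, then destroy. The following sketch shows one such sequence. The function names @code{locked_sum} and @code{try_lock_once} are hypothetical, and the serial fallbacks are an assumption so that the code also builds without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch, not part of libgomp);
   the lock is modeled as a flag that is 1 while held.  */
typedef int omp_lock_t;
static void omp_init_lock (omp_lock_t *lock) { *lock = 0; }
static void omp_set_lock (omp_lock_t *lock) { *lock = 1; }
static void omp_unset_lock (omp_lock_t *lock) { *lock = 0; }
static void omp_destroy_lock (omp_lock_t *lock) { (void) lock; }
static int
omp_test_lock (omp_lock_t *lock)
{
  if (*lock)
    return 0;
  *lock = 1;
  return 1;
}
#endif

/* Add the integers 1..n to a shared sum, using a simple lock to
   serialize the updates.  */
long
locked_sum (int n)
{
  omp_lock_t lock;
  long sum = 0;

  omp_init_lock (&lock);
#pragma omp parallel for
  for (int i = 1; i <= n; i++)
    {
      omp_set_lock (&lock);     /* Blocks until the lock is available.  */
      sum += i;
      omp_unset_lock (&lock);
    }
  omp_destroy_lock (&lock);     /* The lock is unlocked at this point.  */
  return sum;
}

/* Try to acquire a freshly initialized lock without blocking.
   Return 1 if the attempt succeeded.  */
int
try_lock_once (void)
{
  omp_lock_t lock;
  int acquired;

  omp_init_lock (&lock);
  acquired = omp_test_lock (&lock);
  if (acquired)
    omp_unset_lock (&lock);
  omp_destroy_lock (&lock);
  return acquired != 0;
}
```

A reduction clause would usually be the better choice for a plain sum; the lock is used here only to show the call sequence.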
-
-
-
-@node omp_destroy_lock
-@section @code{omp_destroy_lock} -- Destroy simple lock
-@table @asis
-@item @emph{Description}:
-Destroy a simple lock. In order to be destroyed, a simple lock must be
-in the unlocked state.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
-@item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_init_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
-@end table
-
-
-
-@node omp_init_nest_lock
-@section @code{omp_init_nest_lock} -- Initialize nested lock
-@table @asis
-@item @emph{Description}:
-Initialize a nested lock. After initialization, the lock is in
-an unlocked state and the nesting count is set to zero.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
-@item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_destroy_nest_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.1.
-@end table
-
-
-@node omp_set_nest_lock
-@section @code{omp_set_nest_lock} -- Wait for and set nested lock
-@table @asis
-@item @emph{Description}:
-Before setting a nested lock, the lock variable must be initialized by
-@code{omp_init_nest_lock}. The calling thread is blocked until the lock
-is available. If the lock is already held by the current thread, the
-nesting count for the lock is incremented.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
-@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.4.
-@end table
-
-
-
-@node omp_test_nest_lock
-@section @code{omp_test_nest_lock} -- Test and set nested lock if available
-@table @asis
-@item @emph{Description}:
-Before setting a nested lock, the lock variable must be initialized by
-@code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
-@code{omp_test_nest_lock} does not block if the lock is not available.
-If the lock is successfully set, either because it was available or because
-it is already held by the current thread, the new nesting count
-is returned. Otherwise, the return value equals zero.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{logical function omp_test_nest_lock(nvar)}
-@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
-@end multitable
-
-
-@item @emph{See also}:
-@ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.6.
-@end table
-
-
-
-@node omp_unset_nest_lock
-@section @code{omp_unset_nest_lock} -- Unset nested lock
-@table @asis
-@item @emph{Description}:
-A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
-or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
-thread calling @code{omp_unset_nest_lock}. If the nesting count drops to zero, the
-lock becomes unlocked. If one or more threads attempted to set the lock before,
-one of them is chosen to acquire the lock.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
-@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_set_nest_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.5.
-@end table
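
The nesting-count rules described for the nested lock routines above can be sketched with a toy, single-threaded model. This is an illustration only, not libgomp's implementation: the type toy_nest_lock, the functions toy_init, toy_test and toy_unset, and the integer thread ids are all hypothetical, and no blocking or real synchronization is modelled.

```c
#include <assert.h>

/* Toy model of a nested lock (hypothetical, NOT libgomp's data layout). */
typedef struct {
    int owner;   /* id of the owning thread, or -1 when unlocked */
    int count;   /* nesting count; 0 means unlocked */
} toy_nest_lock;

void toy_init(toy_nest_lock *l)
{
    l->owner = -1;
    l->count = 0;
}

/* Mirrors the documented behaviour of omp_test_nest_lock: never blocks,
   returns the new nesting count on success, 0 if another thread owns it. */
int toy_test(toy_nest_lock *l, int tid)
{
    if (l->count != 0 && l->owner != tid)
        return 0;
    l->owner = tid;
    return ++l->count;
}

/* Mirrors omp_unset_nest_lock: only the owner may call it; the lock
   becomes unlocked once the count drops to zero.  Returns the new count. */
int toy_unset(toy_nest_lock *l, int tid)
{
    assert(l->count > 0 && l->owner == tid);
    if (--l->count == 0)
        l->owner = -1;
    return l->count;
}
```

In this model a thread that has acquired the lock twice must release it twice before a different thread's toy_test call can succeed.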
-
-
-
-@node omp_destroy_nest_lock
-@section @code{omp_destroy_nest_lock} -- Destroy nested lock
-@table @asis
-@item @emph{Description}:
-Destroy a nested lock. In order to be destroyed, a nested lock must be
-in the unlocked state and its nesting count must equal zero.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
-@item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_init_nest_lock}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.3.3.
-@end table
-
-
-
-@node omp_get_wtick
-@section @code{omp_get_wtick} -- Get timer precision
-@table @asis
-@item @emph{Description}:
-Gets the timer precision, i.e., the number of seconds between two
-successive clock ticks.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_wtime}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.2.
-@end table
-
-
-
-@node omp_get_wtime
-@section @code{omp_get_wtime} -- Elapsed wall clock time
-@table @asis
-@item @emph{Description}:
-Elapsed wall clock time in seconds. The time is measured per thread; no
-guarantee can be made that two distinct threads measure the same time.
-Time is measured from some "time in the past", which is an arbitrary time
-guaranteed not to change during the execution of the program.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
-@end multitable
-
-@item @emph{See also}:
-@ref{omp_get_wtick}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 3.4.1.
-@end table
-
-
-
-@node omp_fulfill_event
-@section @code{omp_fulfill_event} -- Fulfill and destroy an OpenMP event
-@table @asis
-@item @emph{Description}:
-Fulfill the event associated with the event handle argument. Currently, it
-is only used to fulfill events generated by detach clauses on task
-constructs; the effect of fulfilling the event is to allow the task to
-complete.
-
-The result of calling @code{omp_fulfill_event} with an event handle other
-than that generated by a detach clause is undefined. Calling it with an
-event handle that has already been fulfilled is also undefined.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void omp_fulfill_event(omp_event_handle_t event);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine omp_fulfill_event(event)}
-@item @tab @code{integer (kind=omp_event_handle_kind) :: event}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.5.1.
-@end table
-
-
-
-@c ---------------------------------------------------------------------
-@c OpenMP Environment Variables
-@c ---------------------------------------------------------------------
-
-@node Environment Variables
-@chapter OpenMP Environment Variables
-
-The environment variables beginning with @env{OMP_} are defined by
-section 4 of the OpenMP specification in version 4.5, while those
-beginning with @env{GOMP_} are GNU extensions.
-
-@menu
-* OMP_CANCELLATION:: Set whether cancellation is activated
-* OMP_DISPLAY_ENV:: Show OpenMP version and environment variables
-* OMP_DEFAULT_DEVICE:: Set the device used in target regions
-* OMP_DYNAMIC:: Dynamic adjustment of threads
-* OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions
-* OMP_MAX_TASK_PRIORITY:: Set the maximum task priority value
-* OMP_NESTED:: Nested parallel regions
-* OMP_NUM_TEAMS:: Specifies the number of teams to use by teams region
-* OMP_NUM_THREADS:: Specifies the number of threads to use
-* OMP_PROC_BIND:: Whether threads may be moved between CPUs
-* OMP_PLACES:: Specifies on which CPUs the threads should be placed
-* OMP_STACKSIZE:: Set default thread stack size
-* OMP_SCHEDULE:: How threads are scheduled
-* OMP_TARGET_OFFLOAD:: Controls offloading behaviour
-* OMP_TEAMS_THREAD_LIMIT:: Set the maximum number of threads imposed by teams
-* OMP_THREAD_LIMIT:: Set the maximum number of threads
-* OMP_WAIT_POLICY:: How waiting threads are handled
-* GOMP_CPU_AFFINITY:: Bind threads to specific CPUs
-* GOMP_DEBUG:: Enable debugging output
-* GOMP_STACKSIZE:: Set default thread stack size
-* GOMP_SPINCOUNT:: Set the busy-wait spin count
-* GOMP_RTEMS_THREAD_POOLS:: Set the RTEMS specific thread pools
-@end menu
-
-
-@node OMP_CANCELLATION
-@section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-If set to @code{TRUE}, the cancellation is activated. If set to @code{FALSE} or
-if unset, cancellation is disabled and the @code{cancel} construct is ignored.
-
-@item @emph{See also}:
-@ref{omp_get_cancellation}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.11
-@end table
-
-
-
-@node OMP_DISPLAY_ENV
-@section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-If set to @code{TRUE}, the OpenMP version number and the values
-associated with the OpenMP environment variables are printed to @code{stderr}.
-If set to @code{VERBOSE}, it additionally shows the value of the environment
-variables which are GNU extensions. If undefined or set to @code{FALSE},
-this information will not be shown.
-
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.12
-@end table
-
-
-
-@node OMP_DEFAULT_DEVICE
-@section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Set to choose the device which is used in a @code{target} region, unless the
-value is overridden by @code{omp_set_default_device} or by a @code{device}
-clause. The value shall be the nonnegative device number. If no device with
-the given device number exists, the code is executed on the host. If unset,
-device number 0 will be used.
-
-
-@item @emph{See also}:
-@ref{omp_get_default_device}, @ref{omp_set_default_device}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.13
-@end table
-
-
-
-@node OMP_DYNAMIC
-@section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Enable or disable the dynamic adjustment of the number of threads
-within a team. The value of this environment variable shall be
-@code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
-disabled by default.
-
-@item @emph{See also}:
-@ref{omp_set_dynamic}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.3
-@end table
-
-
-
-@node OMP_MAX_ACTIVE_LEVELS
-@section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies the initial value for the maximum number of nested parallel
-regions. The value of this variable shall be a positive integer.
-If undefined, then if @env{OMP_NESTED} is defined and set to true, or
-if @env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined and set to
-a list with more than one item, the maximum number of nested parallel
-regions will be initialized to the largest number supported, otherwise
-it will be set to one.
-
-@item @emph{See also}:
-@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.9
-@end table
-
-
-
-@node OMP_MAX_TASK_PRIORITY
-@section @env{OMP_MAX_TASK_PRIORITY} -- Set the maximum priority number that can be set for a task
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies the initial value for the maximum priority value that can be
-set for a task. The value of this variable shall be a nonnegative
-integer. If undefined, the default value is 0.
-
-@item @emph{See also}:
-@ref{omp_get_max_task_priority}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.14
-@end table
-
-
-
-@node OMP_NESTED
-@section @env{OMP_NESTED} -- Nested parallel regions
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-Enable or disable nested parallel regions, i.e., whether team members
-are allowed to create new teams. The value of this environment variable
-shall be @code{TRUE} or @code{FALSE}. If set to @code{TRUE}, the number
-of maximum active nested regions supported will by default be set to the
-maximum supported, otherwise it will be set to one. If
-@env{OMP_MAX_ACTIVE_LEVELS} is defined, its setting will override this
-setting. If both are undefined, nested parallel regions are enabled if
-@env{OMP_NUM_THREADS} or @env{OMP_PROC_BIND} are defined to a list with
-more than one item, otherwise they are disabled by default.
-
-@item @emph{See also}:
-@ref{omp_set_max_active_levels}, @ref{omp_set_nested}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.6
-@end table
-
-
-
-@node OMP_NUM_TEAMS
-@section @env{OMP_NUM_TEAMS} -- Specifies the number of teams to use by teams region
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies the upper bound for the number of teams to use in teams regions
-without an explicit @code{num_teams} clause. The value of this variable shall
-be a positive integer. If undefined, it defaults to 0, which means an
-implementation-defined upper bound.
-
-@item @emph{See also}:
-@ref{omp_set_num_teams}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.23
-@end table
-
-
-
-@node OMP_NUM_THREADS
-@section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-Specifies the default number of threads to use in parallel regions. The
-value of this variable shall be a comma-separated list of positive integers;
-the value specifies the number of threads to use for the corresponding nested
-level. Specifying more than one item in the list will automatically enable
-nesting by default. If undefined, one thread per CPU is used.
-
-@item @emph{See also}:
-@ref{omp_set_num_threads}, @ref{OMP_NESTED}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.2
-@end table
-
-
-
-@node OMP_PROC_BIND
-@section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies whether threads may be moved between processors. If set to
-@code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
-they may be moved. Alternatively, a comma-separated list with the
-values @code{PRIMARY}, @code{MASTER}, @code{CLOSE} and @code{SPREAD} can
-be used to specify the thread affinity policy for the corresponding nesting
-level. With @code{PRIMARY} and @code{MASTER} the worker threads are in the
-same place partition as the primary thread. With @code{CLOSE} those are
-kept close to the primary thread in contiguous place partitions. And
-with @code{SPREAD} a sparse distribution
-across the place partitions is used. Specifying more than one item in the
-list will automatically enable nesting by default.
-
-When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
-@env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
-
-@item @emph{See also}:
-@ref{omp_get_proc_bind}, @ref{GOMP_CPU_AFFINITY},
-@ref{OMP_NESTED}, @ref{OMP_PLACES}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.4
-@end table
-
-
-
-@node OMP_PLACES
-@section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-The thread placement can be either specified using an abstract name or by an
-explicit list of the places. The abstract names @code{threads}, @code{cores},
-@code{sockets}, @code{ll_caches} and @code{numa_domains} can be optionally
-followed by a positive number in parentheses, which denotes how many places
-shall be created. With @code{threads} each place corresponds to a single
-hardware thread; with @code{cores} to a single core with the corresponding
-number of hardware threads; with @code{sockets} to a single
-socket; with @code{ll_caches} to a set of cores that share the last level
-cache on the device; and with @code{numa_domains} to a set of cores for which
-their closest memory on the device is the same memory and at a similar distance
-from the cores. The resulting placement can be shown by setting the
-@env{OMP_DISPLAY_ENV} environment variable.
-
-Alternatively, the placement can be specified explicitly as a comma-separated
-list of places. A place is specified by a set of nonnegative numbers in curly
-braces, denoting the hardware threads. The curly braces can be omitted
-when only a single number has been specified. The hardware threads
-belonging to a place can either be specified as a comma-separated list of
-nonnegative thread numbers or using an interval. Multiple places can also be
-either specified by a comma-separated list of places or by an interval. To
-specify an interval, a colon followed by the count is placed after
-the hardware thread number or the place. Optionally, the count can be
-followed by a colon and the stride number -- otherwise a unit stride is
-assumed. Placing an exclamation mark (@code{!}) directly before a curly
-brace or numbers inside the curly braces (excluding intervals) will
-exclude those hardware threads.
-
-For instance, the following all specify the same places list:
-@code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
-@code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
-
-If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
-@env{OMP_PROC_BIND} is either unset or @code{false}, threads may be moved
-between CPUs following no placement policy.
-
-@item @emph{See also}:
-@ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
-@ref{OMP_DISPLAY_ENV}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.5
-@end table
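
The interval notation for OMP_PLACES can be illustrated with a short sketch. It is not libgomp's parser: the function expand_places and the constant MAX_LEN are hypothetical names, the caller is assumed to have already split the string into its four numbers, and negative strides and exclusions with an exclamation mark are not handled.

```c
#include <stddef.h>

#define MAX_LEN 16   /* hypothetical cap on hardware threads per place */

/* Expand an interval of places written as {start:len}:count:stride.
   Place i holds the len consecutive hardware threads beginning at
   start + i * stride.  Returns the number of places written, or 0 if
   len does not fit into a row of the output array. */
size_t expand_places(unsigned start, unsigned len, unsigned count,
                     unsigned stride, unsigned places[][MAX_LEN])
{
    if (len > MAX_LEN)
        return 0;
    for (unsigned i = 0; i < count; i++)
        for (unsigned j = 0; j < len; j++)
            places[i][j] = start + i * stride + j;  /* unit stride inside a place */
    return count;
}
```

With start 0, length 3, count 4 and stride 3, the sketch yields four places holding hardware threads 0 to 2, 3 to 5, 6 to 8 and 9 to 11.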
-
-
-
-@node OMP_STACKSIZE
-@section @env{OMP_STACKSIZE} -- Set default thread stack size
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Set the default thread stack size in kilobytes, unless the number
-is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
-case the size is, respectively, in bytes, kilobytes, megabytes
-or gigabytes. This is different from @code{pthread_attr_setstacksize}
-which gets the number of bytes as an argument. If the stack size cannot
-be set due to system constraints, an error is reported and the initial
-stack size is left unchanged. If undefined, the stack size is system
-dependent.
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.7
-@end table
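
As a rough illustration of the unit rules above, the following sketch converts an OMP_STACKSIZE-style value to bytes. The function name parse_stacksize is hypothetical and this is not the code libgomp uses; accepting lower-case suffixes and white space around the suffix is an assumption made here.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Convert a value such as "512", "2M" or "100B" to bytes.
   A bare number is taken as kilobytes.  Returns 0 on success and
   -1 on malformed input or overflow. */
int parse_stacksize(const char *s, unsigned long long *bytes)
{
    char *end;
    unsigned shift = 10;                 /* default unit: kilobytes */

    while (isspace((unsigned char)*s))
        s++;
    if (!isdigit((unsigned char)*s))     /* also rejects a leading sign */
        return -1;
    errno = 0;
    unsigned long long n = strtoull(s, &end, 10);
    if (errno == ERANGE)
        return -1;
    while (isspace((unsigned char)*end))
        end++;
    switch (toupper((unsigned char)*end)) {
    case 'B': shift = 0;  end++; break;
    case 'K': shift = 10; end++; break;
    case 'M': shift = 20; end++; break;
    case 'G': shift = 30; end++; break;
    case '\0': break;
    default: return -1;
    }
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0' || n > (~0ULL >> shift))
        return -1;                       /* trailing junk or overflow */
    *bytes = n << shift;
    return 0;
}
```

Under these assumptions, "512" gives 524288 bytes, whereas "512B" gives 512 bytes.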
-
-
-
-@node OMP_SCHEDULE
-@section @env{OMP_SCHEDULE} -- How threads are scheduled
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-Specifies the @code{schedule type} and @code{chunk size}.
-The value of the variable shall have the form: @code{type[,chunk]} where
-@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or @code{auto}.
-The optional @code{chunk} size shall be a positive integer. If undefined,
-dynamic scheduling and a chunk size of 1 are used.
-
-@item @emph{See also}:
-@ref{omp_set_schedule}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Sections 2.7.1.1 and 4.1
-@end table
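
The type[,chunk] form can be sketched as a small parser. The names sched_kind, word_is and parse_schedule are hypothetical, and the sketch is not libgomp's implementation: it accepts only the four schedule types listed above, compares them case-insensitively (an assumption), and does not handle white space or schedule modifiers.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum { KIND_STATIC, KIND_DYNAMIC, KIND_GUIDED, KIND_AUTO } sched_kind;

/* Case-insensitive check that the first n characters of s spell word. */
static int word_is(const char *s, size_t n, const char *word)
{
    if (strlen(word) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)s[i]) != word[i])
            return 0;
    return 1;
}

/* Parse "type[,chunk]".  *chunk is set to 0 when no chunk size is given.
   Returns 0 on success and -1 on an unknown type or invalid chunk. */
int parse_schedule(const char *s, sched_kind *kind, long *chunk)
{
    static const char *const names[] = { "static", "dynamic", "guided", "auto" };
    size_t n = strcspn(s, ",");          /* length of the type part */
    int k;

    for (k = 0; k < 4; k++)
        if (word_is(s, n, names[k]))
            break;
    if (k == 4)
        return -1;
    *kind = (sched_kind)k;
    *chunk = 0;
    if (s[n] == ',') {
        char *end;
        long c = strtol(s + n + 1, &end, 10);
        if (end == s + n + 1 || *end != '\0' || c <= 0)
            return -1;                   /* chunk must be a positive integer */
        *chunk = c;
    }
    return 0;
}
```

A chunk value of 0 in the result means that no chunk size was given, so the caller would apply the default for the chosen schedule type.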
-
-
-
-@node OMP_TARGET_OFFLOAD
-@section @env{OMP_TARGET_OFFLOAD} -- Controls offloading behaviour
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-Specifies the behaviour with regard to offloading code to a device. This
-variable can be set to one of three values: @code{MANDATORY}, @code{DISABLED}
-or @code{DEFAULT}.
-
-If set to @code{MANDATORY}, the program will terminate with an error if
-the offload device is not present or is not supported. If set to
-@code{DISABLED}, then offloading is disabled and all code will run on the
-host. If set to @code{DEFAULT}, the program will try offloading to the
-device first, then fall back to running code on the host if it cannot.
-
-If undefined, then the program will behave as if @code{DEFAULT} was set.
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 6.17
-@end table
-
-
-
-@node OMP_TEAMS_THREAD_LIMIT
-@section @env{OMP_TEAMS_THREAD_LIMIT} -- Set the maximum number of threads imposed by teams
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies an upper bound for the number of threads to use by each contention
-group created by a teams construct without an explicit @code{thread_limit}
-clause. The value of this variable shall be a positive integer. If undefined,
-the value of 0 is used, which stands for an implementation-defined upper
-limit.
-
-@item @emph{See also}:
-@ref{OMP_THREAD_LIMIT}, @ref{omp_set_teams_thread_limit}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 6.24
-@end table
-
-
-
-@node OMP_THREAD_LIMIT
-@section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies the number of threads to use for the whole program. The
-value of this variable shall be a positive integer. If undefined,
-the number of threads is not limited.
-
-@item @emph{See also}:
-@ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.10
-@end table
-
-
-
-@node OMP_WAIT_POLICY
-@section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Specifies whether waiting threads should be active or passive. If
-the value is @code{PASSIVE}, waiting threads should not consume CPU
-power while waiting; the value @code{ACTIVE} specifies that
-they should. If undefined, threads wait actively for a short time
-before waiting passively.
-
-@item @emph{See also}:
-@ref{GOMP_SPINCOUNT}
-
-@item @emph{Reference}:
-@uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.8
-@end table
-
-
-
-@node GOMP_CPU_AFFINITY
-@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Binds threads to specific CPUs. The variable should contain a space-separated
-or comma-separated list of CPUs. This list may contain different kinds of
-entries: either single CPU numbers in any order, a range of CPUs (M-N)
-or a range with some stride (M-N:S). CPU numbers are zero based. For example,
-@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
-to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
-CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
-and 14 respectively and then start assigning back from the beginning of
-the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
-
-There is no libgomp library routine to determine whether a CPU affinity
-specification is in effect. As a workaround, language-specific library
-functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
-Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
-environment variable. A defined CPU affinity on startup cannot be changed
-or disabled during the runtime of the application.
-
-If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
-@env{OMP_PROC_BIND} has a higher precedence. If neither is set, or when
-@env{OMP_PROC_BIND} is set to @code{FALSE}, the host system will handle
-the assignment of threads to CPUs.
-
-@item @emph{See also}:
-@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
-@end table
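
The list syntax of GOMP_CPU_AFFINITY can be illustrated by expanding it into a flat array of CPU numbers. This is a sketch under stated assumptions, not the parser libgomp uses: expand_affinity is a hypothetical name, the output buffer is supplied by the caller, and any entry that is not a number, an M-N range or an M-N:S range is rejected.

```c
#include <ctype.h>
#include <stdlib.h>

/* Expand a list such as "0 3 1-2 4-15:2" into cpus[].
   Returns the number of entries written, or -1 on malformed input
   or when more than max entries would be produced. */
int expand_affinity(const char *s, unsigned long *cpus, int max)
{
    int n = 0;

    for (;;) {
        while (*s == ' ' || *s == ',')       /* skip separators */
            s++;
        if (*s == '\0')
            return n;
        if (!isdigit((unsigned char)*s))
            return -1;

        char *end;
        unsigned long lo = strtoul(s, &end, 10), hi = lo, stride = 1;
        s = end;
        if (*s == '-') {                     /* range M-N */
            if (!isdigit((unsigned char)s[1]))
                return -1;
            hi = strtoul(s + 1, &end, 10);
            s = end;
            if (*s == ':') {                 /* optional stride :S */
                if (!isdigit((unsigned char)s[1]))
                    return -1;
                stride = strtoul(s + 1, &end, 10);
                s = end;
            }
        }
        if (hi < lo || stride == 0)
            return -1;
        for (unsigned long c = lo; c <= hi; c += stride) {
            if (n == max)
                return -1;
            cpus[n++] = c;
        }
    }
}
```

For the example string above the sketch produces the ten CPUs 0, 3, 1, 2, 4, 6, 8, 10, 12 and 14. The wrap-around described in the text then corresponds to binding thread i to entry i modulo 10, so the eleventh thread lands on CPU 0 again.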
-
-
-
-@node GOMP_DEBUG
-@section @env{GOMP_DEBUG} -- Enable debugging output
-@cindex Environment Variable
-@table @asis
-@item @emph{Description}:
-Enable debugging output. The variable should be set to @code{0}
-(disabled, also the default if not set), or @code{1} (enabled).
-
-If enabled, some debugging output will be printed during execution.
-This is currently not specified in more detail, and subject to change.
-@end table
-
-
-
-@node GOMP_STACKSIZE
-@section @env{GOMP_STACKSIZE} -- Set default thread stack size
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-Set the default thread stack size in kilobytes. This is different from
-@code{pthread_attr_setstacksize} which gets the number of bytes as an
-argument. If the stack size cannot be set due to system constraints, an
-error is reported and the initial stack size is left unchanged. If undefined,
-the stack size is system dependent.
-
-@item @emph{See also}:
-@ref{OMP_STACKSIZE}
-
-@item @emph{Reference}:
-@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
-GCC Patches Mailinglist},
-@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
-GCC Patches Mailinglist}
-@end table
-
-
-
-@node GOMP_SPINCOUNT
-@section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-Determines how long a thread waits actively, consuming CPU power,
-before waiting passively without consuming CPU power. The value may be
-either @code{INFINITE} or @code{INFINITY} to always wait actively, or an
-integer which gives the number of spins of the busy-wait loop. The
-integer may optionally be followed by the following suffixes acting
-as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
-million), @code{G} (giga, billion), or @code{T} (tera, trillion).
-If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
-300,000 is used when @env{OMP_WAIT_POLICY} is undefined and
-30 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
-If there are more OpenMP threads than available CPUs, 1000 and 100
-spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
-undefined, respectively; unless the @env{GOMP_SPINCOUNT} is lower
-or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
-
-@item @emph{See also}:
-@ref{OMP_WAIT_POLICY}
-@end table
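
The multiplier suffixes accepted by GOMP_SPINCOUNT can be sketched as follows. The names parse_spincount and SPIN_INFINITE are hypothetical and the code is not taken from libgomp. The keywords and suffixes are matched exactly as spelled in the description, which is an assumption; the real parser may be more lenient.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SPIN_INFINITE (~0ULL)   /* hypothetical marker for "always spin" */

/* Convert a value such as "300000", "30G" or "INFINITE" to a spin count.
   Returns 0 on success and -1 on malformed input or overflow. */
int parse_spincount(const char *s, unsigned long long *spins)
{
    if (strcmp(s, "INFINITE") == 0 || strcmp(s, "INFINITY") == 0) {
        *spins = SPIN_INFINITE;
        return 0;
    }
    if (*s < '0' || *s > '9')
        return -1;

    char *end;
    unsigned long long mult = 1;
    errno = 0;
    unsigned long long n = strtoull(s, &end, 10);
    if (errno == ERANGE)
        return -1;
    switch (*end) {
    case 'k': mult = 1000ULL; end++; break;                  /* thousand */
    case 'M': mult = 1000000ULL; end++; break;               /* million  */
    case 'G': mult = 1000000000ULL; end++; break;            /* billion  */
    case 'T': mult = 1000000000000ULL; end++; break;         /* trillion */
    case '\0': break;
    default: return -1;
    }
    if (*end != '\0' || n > SPIN_INFINITE / mult)
        return -1;                       /* trailing junk or overflow */
    *spins = n * mult;
    return 0;
}
```

Under these assumptions "30G" yields the 30 billion spins that the description gives as the default for an ACTIVE wait policy.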
-
-
-
-@node GOMP_RTEMS_THREAD_POOLS
-@section @env{GOMP_RTEMS_THREAD_POOLS} -- Set the RTEMS specific thread pools
-@cindex Environment Variable
-@cindex Implementation specific setting
-@table @asis
-@item @emph{Description}:
-This environment variable is only used on the RTEMS real-time operating system.
-It determines the scheduler instance specific thread pools. The format for
-@env{GOMP_RTEMS_THREAD_POOLS} is a list of optional
-@code{<thread-pool-count>[$<priority>]@@<scheduler-name>} configurations
-separated by @code{:} where:
-@itemize @bullet
-@item @code{<thread-pool-count>} is the thread pool count for this scheduler
-instance.
-@item @code{$<priority>} is an optional priority for the worker threads of a
-thread pool according to @code{pthread_setschedparam}. In case a priority
-value is omitted, then a worker thread will inherit the priority of the OpenMP
-primary thread that created it. The priority of the worker thread is not
-changed after creation, even if a new OpenMP primary thread using the worker has
-a different priority.
-@item @code{@@<scheduler-name>} is the scheduler instance name according to the
-RTEMS application configuration.
-@end itemize
-In case no thread pool configuration is specified for a scheduler instance,
-then each OpenMP primary thread of this scheduler instance will use its own
-dynamically allocated thread pool. To limit the worker thread count of the
-thread pools, each OpenMP primary thread must call @code{omp_set_num_threads}.
-@item @emph{Example}:
-Suppose we have three scheduler instances @code{IO}, @code{WRK0}, and
-@code{WRK1} with @env{GOMP_RTEMS_THREAD_POOLS} set to
-@code{"1@@WRK0:3$4@@WRK1"}. Then there are no thread pool restrictions for
-scheduler instance @code{IO}. In the scheduler instance @code{WRK0} there is
-one thread pool available. Since no priority is specified for this scheduler
-instance, the worker thread inherits the priority of the OpenMP primary thread
-that created it. In the scheduler instance @code{WRK1} there are three thread
-pools available and their worker threads run at priority four.
-@end table
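
The configuration format described above (a pool count, an optional priority introduced by a dollar sign, then an at sign and the scheduler name, with items separated by colons) can be sketched as a small parser. The type pool_config, the function parse_pool and the 32-byte name buffer are hypothetical and are not part of libgomp or RTEMS.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned long count;      /* thread pool count */
    int has_priority;         /* 0: workers inherit the primary thread's priority */
    unsigned long priority;   /* valid only when has_priority is nonzero */
    char scheduler[32];       /* scheduler instance name */
} pool_config;

/* Parse one item starting at s.  Returns a pointer to the character
   that ends the item (':' or the terminating NUL), or NULL on error. */
const char *parse_pool(const char *s, pool_config *cfg)
{
    char *end;

    if (!isdigit((unsigned char)*s))
        return NULL;
    cfg->count = strtoul(s, &end, 10);
    cfg->has_priority = 0;
    cfg->priority = 0;
    if (*end == '$') {                   /* optional priority */
        if (!isdigit((unsigned char)end[1]))
            return NULL;
        cfg->priority = strtoul(end + 1, &end, 10);
        cfg->has_priority = 1;
    }
    if (*end != '@')
        return NULL;
    end++;
    size_t n = strcspn(end, ":");        /* name runs to ':' or end of string */
    if (n == 0 || n >= sizeof cfg->scheduler)
        return NULL;
    memcpy(cfg->scheduler, end, n);
    cfg->scheduler[n] = '\0';
    return end + n;
}
```

Calling the function repeatedly, skipping the colon between calls, walks a whole value such as the one in the example: the first item gives one pool for WRK0 without a priority, the second gives three pools for WRK1 at priority four.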
-
-
-
-@c ---------------------------------------------------------------------
-@c Enabling OpenACC
-@c ---------------------------------------------------------------------
-
-@node Enabling OpenACC
-@chapter Enabling OpenACC
-
-To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
-flag @option{-fopenacc} must be specified. This enables the OpenACC directive
-@code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
-@code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
-@code{!$} conditional compilation sentinels in free form and @code{c$},
-@code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
-arranges for automatic linking of the OpenACC runtime library
-(@ref{OpenACC Runtime Library Routines}).
-
-See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
-
-A complete description of all OpenACC directives accepted may be found in
-the @uref{https://www.openacc.org, OpenACC} Application Programming
-Interface manual, version 2.6.
-
-
-
-@c ---------------------------------------------------------------------
-@c OpenACC Runtime Library Routines
-@c ---------------------------------------------------------------------
-
-@node OpenACC Runtime Library Routines
-@chapter OpenACC Runtime Library Routines
-
-The runtime routines described here are defined by section 3 of the OpenACC
-specifications in version 2.6.
-They have C linkage, and do not throw exceptions.
-Generally, they are available only for the host, with the exception of
-@code{acc_on_device}, which is available for both the host and the
-acceleration device.
-
-@menu
-* acc_get_num_devices:: Get number of devices for the given device
- type.
-* acc_set_device_type:: Set type of device accelerator to use.
-* acc_get_device_type:: Get type of device accelerator to be used.
-* acc_set_device_num:: Set device number to use.
-* acc_get_device_num:: Get device number to be used.
-* acc_get_property:: Get device property.
-* acc_async_test:: Tests for completion of a specific asynchronous
- operation.
-* acc_async_test_all:: Tests for completion of all asynchronous
- operations.
-* acc_wait:: Wait for completion of a specific asynchronous
- operation.
-* acc_wait_all:: Waits for completion of all asynchronous
- operations.
-* acc_wait_all_async:: Wait for completion of all asynchronous
- operations.
-* acc_wait_async:: Wait for completion of asynchronous operations.
-* acc_init:: Initialize runtime for a specific device type.
-* acc_shutdown:: Shuts down the runtime for a specific device
- type.
-* acc_on_device:: Whether executing on a particular device
-* acc_malloc:: Allocate device memory.
-* acc_free:: Free device memory.
-* acc_copyin:: Allocate device memory and copy host memory to
- it.
-* acc_present_or_copyin:: If the data is not present on the device,
- allocate device memory and copy from host
- memory.
-* acc_create:: Allocate device memory and map it to host
- memory.
-* acc_present_or_create:: If the data is not present on the device,
- allocate device memory and map it to host
- memory.
-* acc_copyout:: Copy device memory to host memory.
-* acc_delete:: Free device memory.
-* acc_update_device:: Update device memory from mapped host memory.
-* acc_update_self:: Update host memory from mapped device memory.
-* acc_map_data:: Map previously allocated device memory to host
- memory.
-* acc_unmap_data:: Unmap device memory from host memory.
-* acc_deviceptr:: Get device pointer associated with specific
- host address.
-* acc_hostptr:: Get host pointer associated with specific
- device address.
-* acc_is_present:: Indicate whether host variable / array is
- present on device.
-* acc_memcpy_to_device:: Copy host memory to device memory.
-* acc_memcpy_from_device:: Copy device memory to host memory.
-* acc_attach:: Let device pointer point to device-pointer target.
-* acc_detach:: Let device pointer point to host-pointer target.
-
-API routines for target platforms.
-
-* acc_get_current_cuda_device:: Get CUDA device handle.
-* acc_get_current_cuda_context::Get CUDA context handle.
-* acc_get_cuda_stream:: Get CUDA stream handle.
-* acc_set_cuda_stream:: Set CUDA stream handle.
-
-API routines for the OpenACC Profiling Interface.
-
-* acc_prof_register:: Register callbacks.
-* acc_prof_unregister:: Unregister callbacks.
-* acc_prof_lookup:: Obtain inquiry functions.
-* acc_register_library:: Library registration.
-@end menu
-
-
-
-@node acc_get_num_devices
-@section @code{acc_get_num_devices} -- Get number of devices for given device type
-@table @asis
-@item @emph{Description}
-This function returns a value indicating the number of devices available
-for the device type specified in @var{devicetype}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_get_num_devices(acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)}
-@item @tab @code{integer(kind=acc_device_kind) devicetype}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.1.
-@end table
-
-
-
-@node acc_set_device_type
-@section @code{acc_set_device_type} -- Set type of device accelerator to use.
-@table @asis
-@item @emph{Description}
-This function indicates to the runtime library which device type, specified
-in @var{devicetype}, to use when executing a parallel or kernels region.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_set_device_type(acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)}
-@item @tab @code{integer(kind=acc_device_kind) devicetype}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.2.
-@end table
-
-
-
-@node acc_get_device_type
-@section @code{acc_get_device_type} -- Get type of device accelerator to be used.
-@table @asis
-@item @emph{Description}
-This function returns what device type will be used when executing a
-parallel or kernels region.
-
-This function returns @code{acc_device_none} if
-@code{acc_get_device_type} is called from the
-@code{acc_ev_device_init_start} or @code{acc_ev_device_init_end}
-callbacks of the OpenACC Profiling Interface (@ref{OpenACC Profiling
-Interface}), that is, if the device is currently being initialized.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_device_t acc_get_device_type(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_get_device_type()}
-@item @tab @code{integer(kind=acc_device_kind) acc_get_device_type}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.3.
-@end table
-
-
-
-@node acc_set_device_num
-@section @code{acc_set_device_num} -- Set device number to use.
-@table @asis
-@item @emph{Description}
-This function indicates to the runtime which device number,
-specified by @var{devicenum}, to use for the specified device
-type @var{devicetype}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_set_device_num(int devicenum, acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_set_device_num(devicenum, devicetype)}
-@item @tab @code{integer devicenum}
-@item @tab @code{integer(kind=acc_device_kind) devicetype}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.4.
-@end table
-
-
-
-@node acc_get_device_num
-@section @code{acc_get_device_num} -- Get device number to be used.
-@table @asis
-@item @emph{Description}
-This function returns the device number, associated with the specified device
-type @var{devicetype}, that will be used when executing a parallel or kernels
-region.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_get_device_num(acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_get_device_num(devicetype)}
-@item @tab @code{integer(kind=acc_device_kind) devicetype}
-@item @tab @code{integer acc_get_device_num}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.5.
-@end table
-
-
-
-@node acc_get_property
-@section @code{acc_get_property} -- Get device property.
-@cindex acc_get_property
-@cindex acc_get_property_string
-@table @asis
-@item @emph{Description}
-These routines return the value of the specified @var{property} for the
-device being queried according to @var{devicenum} and @var{devicetype}.
-Integer-valued and string-valued properties are returned by
-@code{acc_get_property} and @code{acc_get_property_string} respectively.
-The Fortran @code{acc_get_property_string} subroutine returns the string
-retrieved in its fourth argument while the remaining entry points are
-functions, which pass the return value as their result.
-
-Note for Fortran only: the OpenACC technical committee corrected and, hence,
-modified the interface introduced in OpenACC 2.6. The kind-value parameter
-@code{acc_device_property} has been renamed to @code{acc_device_property_kind}
-for consistency and the return type of the @code{acc_get_property} function is
-now a @code{c_size_t} integer instead of an @code{acc_device_property} integer.
-The parameter @code{acc_device_property} will continue to be provided,
-but might be removed in a future version of GCC.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
-@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
-@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
-@item @tab @code{use ISO_C_Binding, only: c_size_t}
-@item @tab @code{integer devicenum}
-@item @tab @code{integer(kind=acc_device_kind) devicetype}
-@item @tab @code{integer(kind=acc_device_property_kind) property}
-@item @tab @code{integer(kind=c_size_t) acc_get_property}
-@item @tab @code{character(*) string}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.6.
-@end table
-
-
-
-@node acc_async_test
-@section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
-@table @asis
-@item @emph{Description}
-This function tests for completion of the asynchronous operation specified
-in @var{arg}. If the specified asynchronous operation has completed, C/C++
-returns a non-zero value and Fortran returns @code{true}. If it has not
-completed, C/C++ returns zero and Fortran returns @code{false}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_async_test(int arg);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_async_test(arg)}
-@item @tab @code{integer(kind=acc_handle_kind) arg}
-@item @tab @code{logical acc_async_test}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.9.
-@end table
-
-
-
-@node acc_async_test_all
-@section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations.
-@table @asis
-@item @emph{Description}
-This function tests for completion of all asynchronous operations.
-If all asynchronous operations have completed, C/C++ returns a non-zero
-value and Fortran returns @code{true}. If any asynchronous operation has
-not completed, C/C++ returns zero and Fortran returns @code{false}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_async_test_all(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_async_test_all()}
-@item @tab @code{logical acc_async_test_all}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.10.
-@end table
-
-
-
-@node acc_wait
-@section @code{acc_wait} -- Wait for completion of a specific asynchronous operation.
-@table @asis
-@item @emph{Description}
-This function waits for completion of the asynchronous operation
-specified in @var{arg}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_wait(int arg);}
-@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{acc_async_wait(int arg);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_wait(arg)}
-@item @tab @code{integer(acc_handle_kind) arg}
-@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)}
-@item @tab @code{integer(acc_handle_kind) arg}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.11.
-@end table
-
-
-
-@node acc_wait_all
-@section @code{acc_wait_all} -- Waits for completion of all asynchronous operations.
-@table @asis
-@item @emph{Description}
-This function waits for the completion of all asynchronous operations.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_wait_all(void);}
-@item @emph{Prototype (OpenACC 1.0 compatibility)}: @tab @code{acc_async_wait_all(void);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_wait_all()}
-@item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.13.
-@end table
-
-
-
-@node acc_wait_all_async
-@section @code{acc_wait_all_async} -- Wait for completion of all asynchronous operations.
-@table @asis
-@item @emph{Description}
-This function enqueues a wait operation on the queue @var{async} for any
-and all asynchronous operations that have been previously enqueued on
-any queue.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_wait_all_async(int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)}
-@item @tab @code{integer(acc_handle_kind) async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.14.
-@end table
-
-
-
-@node acc_wait_async
-@section @code{acc_wait_async} -- Wait for completion of asynchronous operations.
-@table @asis
-@item @emph{Description}
-This function enqueues a wait operation on queue @var{async} for any and all
-asynchronous operations enqueued on queue @var{arg}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_wait_async(int arg, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)}
-@item @tab @code{integer(acc_handle_kind) arg, async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.12.
-@end table
-
-
-
-@node acc_init
-@section @code{acc_init} -- Initialize runtime for a specific device type.
-@table @asis
-@item @emph{Description}
-This function initializes the runtime for the device type specified in
-@var{devicetype}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_init(acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)}
-@item @tab @code{integer(acc_device_kind) devicetype}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.7.
-@end table
-
-
-
-@node acc_shutdown
-@section @code{acc_shutdown} -- Shuts down the runtime for a specific device type.
-@table @asis
-@item @emph{Description}
-This function shuts down the runtime for the device type specified in
-@var{devicetype}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_shutdown(acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)}
-@item @tab @code{integer(acc_device_kind) devicetype}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.8.
-@end table
-
-
-
-@node acc_on_device
-@section @code{acc_on_device} -- Whether executing on a particular device
-@table @asis
-@item @emph{Description}
-This function returns whether the program is executing on a particular
-device specified in @var{devicetype}. If the program is executing on the
-specified device type, C/C++ returns a non-zero value and Fortran returns
-@code{true}. Otherwise, C/C++ returns zero and Fortran returns
-@code{false}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_on_device(acc_device_t devicetype);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_on_device(devicetype)}
-@item @tab @code{integer(acc_device_kind) devicetype}
-@item @tab @code{logical acc_on_device}
-@end multitable
-
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.17.
-@end table
-
-
-
-@node acc_malloc
-@section @code{acc_malloc} -- Allocate device memory.
-@table @asis
-@item @emph{Description}
-This function allocates @var{len} bytes of device memory. It returns
-the device address of the allocated memory.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.18.
-@end table
-
-
-
-@node acc_free
-@section @code{acc_free} -- Free device memory.
-@table @asis
-@item @emph{Description}
-This function frees previously allocated device memory at the device
-address @var{a}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_free(d_void *a);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.19.
-@end table
-
-
-
-@node acc_copyin
-@section @code{acc_copyin} -- Allocate device memory and copy host memory to it.
-@table @asis
-@item @emph{Description}
-In C/C++, this function allocates @var{len} bytes of device memory,
-maps it to the specified host address in @var{a}, and copies the host
-data to it. The device address of the newly allocated device memory is
-returned.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a
-variable or array element and @var{len} specifies the length in bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_copyin(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.20.
-@end table
-
-
-
-@node acc_present_or_copyin
-@section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory.
-@table @asis
-@item @emph{Description}
-This function tests if the host data specified by @var{a} and of length
-@var{len} is present or not. If it is not present, then device memory
-will be allocated and the host memory copied. The device address of
-the newly allocated device memory is returned.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-Note that @code{acc_present_or_copyin} and @code{acc_pcopyin} exist for
-backward compatibility with OpenACC 2.0; use @ref{acc_copyin} instead.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_present_or_copyin(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_pcopyin(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.20.
-@end table
-
-
-
-@node acc_create
-@section @code{acc_create} -- Allocate device memory and map it to host memory.
-@table @asis
-@item @emph{Description}
-This function allocates device memory and maps it to host memory specified
-by the host address @var{a} with a length of @var{len} bytes. In C/C++,
-the function returns the device address of the allocated device memory.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{void *acc_create_async(h_void *a, size_t len, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_create(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.21.
-@end table
-
-
-
-@node acc_present_or_create
-@section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory.
-@table @asis
-@item @emph{Description}
-This function tests if the host data specified by @var{a} and of length
-@var{len} is present or not. If it is not present, then device memory
-will be allocated and mapped to host memory. In C/C++, the device address
-of the newly allocated device memory is returned.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-Note that @code{acc_present_or_create} and @code{acc_pcreate} exist for
-backward compatibility with OpenACC 2.0; use @ref{acc_create} instead.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_present_or_create(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_pcreate(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.21.
-@end table
-
-
-
-@node acc_copyout
-@section @code{acc_copyout} -- Copy device memory to host memory.
-@table @asis
-@item @emph{Description}
-This function copies mapped device memory to host memory which is specified
-by host address @var{a} for a length of @var{len} bytes in C/C++.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_copyout(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{acc_copyout_async(h_void *a, size_t len, int async);}
-@item @emph{Prototype}: @tab @code{acc_copyout_finalize(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{acc_copyout_finalize_async(h_void *a, size_t len, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_copyout(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.22.
-@end table
-
-
-
-@node acc_delete
-@section @code{acc_delete} -- Free device memory.
-@table @asis
-@item @emph{Description}
-This function frees the device memory previously allocated for the host
-data specified by the host address @var{a} and the length of @var{len} bytes.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_delete(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{acc_delete_async(h_void *a, size_t len, int async);}
-@item @emph{Prototype}: @tab @code{acc_delete_finalize(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{acc_delete_finalize_async(h_void *a, size_t len, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_delete(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.23.
-@end table
-
-
-
-@node acc_update_device
-@section @code{acc_update_device} -- Update device memory from mapped host memory.
-@table @asis
-@item @emph{Description}
-This function updates the device copy from the previously mapped host memory.
-The host memory is specified with the host address @var{a} and a length of
-@var{len} bytes.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_update_device(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{acc_update_device_async(h_void *a, size_t len, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_update_device(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.24.
-@end table
-
-
-
-@node acc_update_self
-@section @code{acc_update_self} -- Update host memory from mapped device memory.
-@table @asis
-@item @emph{Description}
-This function updates the host copy from the previously mapped device memory.
-The host memory is specified with the host address @var{a} and a length of
-@var{len} bytes.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_update_self(h_void *a, size_t len);}
-@item @emph{Prototype}: @tab @code{acc_update_self_async(h_void *a, size_t len, int async);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{subroutine acc_update_self(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, len, async)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{integer(acc_handle_kind) :: async}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.25.
-@end table
-
-
-
-@node acc_map_data
-@section @code{acc_map_data} -- Map previously allocated device memory to host memory.
-@table @asis
-@item @emph{Description}
-This function maps previously allocated device and host memory. The device
-memory is specified with the device address @var{d}. The host memory is
-specified with the host address @var{h} and a length of @var{len} bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_map_data(h_void *h, d_void *d, size_t len);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.26.
-@end table
-
-
-
-@node acc_unmap_data
-@section @code{acc_unmap_data} -- Unmap device memory from host memory.
-@table @asis
-@item @emph{Description}
-This function unmaps previously mapped device and host memory. The latter
-is specified by @var{h}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_unmap_data(h_void *h);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.27.
-@end table
-
-
-
-@node acc_deviceptr
-@section @code{acc_deviceptr} -- Get device pointer associated with specific host address.
-@table @asis
-@item @emph{Description}
-This function returns the device address that has been mapped to the
-host address specified by @var{h}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.28.
-@end table
-
-
-
-@node acc_hostptr
-@section @code{acc_hostptr} -- Get host pointer associated with specific device address.
-@table @asis
-@item @emph{Description}
-This function returns the host address that has been mapped to the
-device address specified by @var{d}.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.29.
-@end table
-
-
-
-@node acc_is_present
-@section @code{acc_is_present} -- Indicate whether host variable / array is present on device.
-@table @asis
-@item @emph{Description}
-This function indicates whether the specified host address in @var{a} and a
-length of @var{len} bytes is present on the device. In C/C++, a non-zero
-value is returned to indicate the presence of the mapped memory on the
-device. A zero is returned to indicate the memory is not mapped on the
-device.
-
-In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
-a contiguous array section. In the second form, @var{a} specifies a variable or
-array element and @var{len} specifies the length in bytes. If the host
-memory is mapped to device memory, then a @code{true} is returned. Otherwise,
-a @code{false} is returned to indicate the mapped memory is not present.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_is_present(h_void *a, size_t len);}
-@end multitable
-
-@item @emph{Fortran}:
-@multitable @columnfractions .20 .80
-@item @emph{Interface}: @tab @code{function acc_is_present(a)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{logical acc_is_present}
-@item @emph{Interface}: @tab @code{function acc_is_present(a, len)}
-@item @tab @code{type, dimension(:[,:]...) :: a}
-@item @tab @code{integer len}
-@item @tab @code{logical acc_is_present}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.30.
-@end table
-
-
-
-@node acc_memcpy_to_device
-@section @code{acc_memcpy_to_device} -- Copy host memory to device memory.
-@table @asis
-@item @emph{Description}
-This function copies host memory specified by host address of @var{src} to
-device memory specified by the device address @var{dest} for a length of
-@var{bytes} bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.31.
-@end table
-
-
-
-@node acc_memcpy_from_device
-@section @code{acc_memcpy_from_device} -- Copy device memory to host memory.
-@table @asis
-@item @emph{Description}
-This function copies device memory specified by the device address @var{src} to
-host memory specified by the host address @var{dest} for a length of
-@var{bytes} bytes.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.32.
-@end table
-
-
-
-@node acc_attach
-@section @code{acc_attach} -- Let device pointer point to device-pointer target.
-@table @asis
-@item @emph{Description}
-This function updates a pointer on the device from pointing to a host-pointer
-address to pointing to the corresponding device data.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_attach(h_void **ptr);}
-@item @emph{Prototype}: @tab @code{acc_attach_async(h_void **ptr, int async);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.34.
-@end table
-
-
-
-@node acc_detach
-@section @code{acc_detach} -- Let device pointer point to host-pointer target.
-@table @asis
-@item @emph{Description}
-This function updates a pointer on the device from pointing to a device-pointer
-address to pointing to the corresponding host data.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_detach(h_void **ptr);}
-@item @emph{Prototype}: @tab @code{acc_detach_async(h_void **ptr, int async);}
-@item @emph{Prototype}: @tab @code{acc_detach_finalize(h_void **ptr);}
-@item @emph{Prototype}: @tab @code{acc_detach_finalize_async(h_void **ptr, int async);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-3.2.35.
-@end table
-
-
-
-@node acc_get_current_cuda_device
-@section @code{acc_get_current_cuda_device} -- Get CUDA device handle.
-@table @asis
-@item @emph{Description}
-This function returns the CUDA device handle. This handle is the same
-as used by the CUDA Runtime or Driver APIs.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-A.2.1.1.
-@end table
-
-
-
-@node acc_get_current_cuda_context
-@section @code{acc_get_current_cuda_context} -- Get CUDA context handle.
-@table @asis
-@item @emph{Description}
-This function returns the CUDA context handle. This handle is the same
-as used by the CUDA Runtime or Driver APIs.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_context(void);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-A.2.1.2.
-@end table
-
-
-
-@node acc_get_cuda_stream
-@section @code{acc_get_cuda_stream} -- Get CUDA stream handle.
-@table @asis
-@item @emph{Description}
-This function returns the CUDA stream handle for the queue @var{async}.
-This handle is the same as used by the CUDA Runtime or Driver APIs.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void *acc_get_cuda_stream(int async);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-A.2.1.3.
-@end table
-
-
-
-@node acc_set_cuda_stream
-@section @code{acc_set_cuda_stream} -- Set CUDA stream handle.
-@table @asis
-@item @emph{Description}
-This function associates the stream handle specified by @var{stream} with
-the queue @var{async}.
-
-This cannot be used to change the stream handle associated with
-@code{acc_async_sync}.
-
-The return value is not specified.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{int acc_set_cuda_stream(int async, void *stream);}
-@end multitable
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-A.2.1.4.
-@end table
-
-
-
-@node acc_prof_register
-@section @code{acc_prof_register} -- Register callbacks.
-@table @asis
-@item @emph{Description}:
-This function registers callbacks.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void acc_prof_register (acc_event_t, acc_prof_callback, acc_register_t);}
-@end multitable
-
-@item @emph{See also}:
-@ref{OpenACC Profiling Interface}
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-5.3.
-@end table
-
-
-
-@node acc_prof_unregister
-@section @code{acc_prof_unregister} -- Unregister callbacks.
-@table @asis
-@item @emph{Description}:
-This function unregisters callbacks.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void acc_prof_unregister (acc_event_t, acc_prof_callback, acc_register_t);}
-@end multitable
-
-@item @emph{See also}:
-@ref{OpenACC Profiling Interface}
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-5.3.
-@end table
-
-
-
-@node acc_prof_lookup
-@section @code{acc_prof_lookup} -- Obtain inquiry functions.
-@table @asis
-@item @emph{Description}:
-Function to obtain inquiry functions.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{acc_query_fn acc_prof_lookup (const char *);}
-@end multitable
-
-@item @emph{See also}:
-@ref{OpenACC Profiling Interface}
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-5.3.
-@end table
-
-
-
-@node acc_register_library
-@section @code{acc_register_library} -- Library registration.
-@table @asis
-@item @emph{Description}:
-Function for library registration.
-
-@item @emph{C/C++}:
-@multitable @columnfractions .20 .80
-@item @emph{Prototype}: @tab @code{void acc_register_library (acc_prof_reg, acc_prof_reg, acc_prof_lookup_func);}
-@end multitable
-
-@item @emph{See also}:
-@ref{OpenACC Profiling Interface}, @ref{ACC_PROFLIB}
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-5.3.
-@end table
-
-
-
-@c ---------------------------------------------------------------------
-@c OpenACC Environment Variables
-@c ---------------------------------------------------------------------
-
-@node OpenACC Environment Variables
-@chapter OpenACC Environment Variables
-
-The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
-are defined by section 4 of the OpenACC specification in version 2.0.
-The variable @env{ACC_PROFLIB}
-is defined by section 4 of the OpenACC specification in version 2.6.
-The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
-
-@menu
-* ACC_DEVICE_TYPE::
-* ACC_DEVICE_NUM::
-* ACC_PROFLIB::
-* GCC_ACC_NOTIFY::
-@end menu
-
-
-
-@node ACC_DEVICE_TYPE
-@section @code{ACC_DEVICE_TYPE}
-@table @asis
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-4.1.
-@end table
-
-
-
-@node ACC_DEVICE_NUM
-@section @code{ACC_DEVICE_NUM}
-@table @asis
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-4.2.
-@end table
-
-
-
-@node ACC_PROFLIB
-@section @code{ACC_PROFLIB}
-@table @asis
-@item @emph{See also}:
-@ref{acc_register_library}, @ref{OpenACC Profiling Interface}
-
-@item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.6}, section
-4.3.
-@end table
-
-
-
-@node GCC_ACC_NOTIFY
-@section @code{GCC_ACC_NOTIFY}
-@table @asis
-@item @emph{Description}:
-Print debug information pertaining to the accelerator.
-@end table
-
-
-
-@c ---------------------------------------------------------------------
-@c CUDA Streams Usage
-@c ---------------------------------------------------------------------
-
-@node CUDA Streams Usage
-@chapter CUDA Streams Usage
-
-This applies to the @code{nvptx} plugin only.
-
-The library provides elements that perform asynchronous movement of
-data and asynchronous operation of computing constructs. This
-asynchronous functionality is implemented by making use of CUDA
-streams@footnote{See "Stream Management" in "CUDA Driver API",
-TRM-06703-001, Version 5.5, for additional information}.
-
-The primary means by which the asynchronous functionality is accessed
-is through the use of those OpenACC directives which make use of the
-@code{async} and @code{wait} clauses. When the @code{async} clause is
-first used with a directive, it creates a CUDA stream. If an
-@code{async-argument} is used with the @code{async} clause, then the
-stream is associated with the specified @code{async-argument}.
-
-Following the creation of an association between a CUDA stream and the
-@code{async-argument} of an @code{async} clause, both the @code{wait}
-clause and the @code{wait} directive can be used. When either the
-clause or directive is used after stream creation, it creates a
-rendezvous point whereby execution waits until all operations
-associated with the @code{async-argument}, that is, stream, have
-completed.
-
-Normally, the management of the streams that are created as a result of
-using the @code{async} clause is done without any intervention by the
-caller. This implies the association between the @code{async-argument}
-and the CUDA stream will be maintained for the lifetime of the program.
-However, this association can be changed through the use of the library
-function @code{acc_set_cuda_stream}. When the function
-@code{acc_set_cuda_stream} is called, the CUDA stream that was
-originally associated with the @code{async} clause will be destroyed.
-Caution should be taken when changing the association as subsequent
-references to the @code{async-argument} refer to a different
-CUDA stream.
-
-
-
-@c ---------------------------------------------------------------------
-@c OpenACC Library Interoperability
-@c ---------------------------------------------------------------------
-
-@node OpenACC Library Interoperability
-@chapter OpenACC Library Interoperability
-
-@section Introduction
-
-The OpenACC library uses the CUDA Driver API, and may interact with
-programs that use the Runtime library directly, or another library
-based on the Runtime library, e.g., CUBLAS@footnote{See section 2.26,
-"Interactions with the CUDA Driver API" in
-"CUDA Runtime API", Version 5.5, and section 2.27, "VDPAU
-Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
-for additional information on library interoperability.}.
-This chapter describes the use cases and what changes are
-required in order to use both the OpenACC library and the CUBLAS and Runtime
-libraries within a program.
-
-@section First invocation: NVIDIA CUBLAS library API
-
-In this first use case (see below), a function in the CUBLAS library, namely
-@code{cublasCreate()}, is called prior to any of the functions in the
-OpenACC library.
-
-When invoked, the function initializes the library and allocates the
-hardware resources on the host and the device on behalf of the caller. Once
-the initialization and allocation have completed, a handle is returned to the
-caller. The OpenACC library also requires initialization and allocation of
-hardware resources. Since the CUBLAS library has already allocated the
-hardware resources for the device, all that is left to do is to initialize
-the OpenACC library and acquire the hardware resources on the host.
-
-Prior to calling the OpenACC function that initializes the library and
-allocates the host hardware resources, you need to acquire the device number
-that was allocated during the call to @code{cublasCreate()}. Calling the
-runtime library function @code{cudaGetDevice()} accomplishes this. Once
-acquired, the device number is passed along with the device type as
-parameters to the OpenACC library function @code{acc_set_device_num()}.
-
-Once the call to @code{acc_set_device_num()} has completed, the OpenACC
-library uses the context that was created during the call to
-@code{cublasCreate()}. In other words, both libraries will be sharing the
-same context.
-
-@smallexample
- /* Create the handle */
- s = cublasCreate(&h);
- if (s != CUBLAS_STATUS_SUCCESS)
- @{
- fprintf(stderr, "cublasCreate failed %d\n", s);
- exit(EXIT_FAILURE);
- @}
-
- /* Get the device number */
- e = cudaGetDevice(&dev);
- if (e != cudaSuccess)
- @{
- fprintf(stderr, "cudaGetDevice failed %d\n", e);
- exit(EXIT_FAILURE);
- @}
-
- /* Initialize OpenACC library and use device 'dev' */
- acc_set_device_num(dev, acc_device_nvidia);
-
-@end smallexample
-@center Use Case 1
-
-@section First invocation: OpenACC library API
-
-In this second use case (see below), a function in the OpenACC library, namely
-@code{acc_set_device_num()}, is called prior to any of the functions in the
-CUBLAS library.
-
-In the use case presented here, the function @code{acc_set_device_num()}
-is used to both initialize the OpenACC library and allocate the hardware
-resources on the host and the device. In the call to the function, the
-call parameters specify which device to use and what device
-type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
-is but one method to initialize the OpenACC library and allocate the
-appropriate hardware resources. Other methods are available through the
-use of environment variables and these will be discussed in the next section.
-
-Once the call to @code{acc_set_device_num()} has completed, other OpenACC
-functions can be called as seen with multiple calls being made to
-@code{acc_copyin()}. In addition, calls can be made to functions in the
-CUBLAS library. In the use case a call to @code{cublasCreate()} is made
-subsequent to the calls to @code{acc_copyin()}.
-As seen in the previous use case, a call to @code{cublasCreate()}
-initializes the CUBLAS library and allocates the hardware resources on the
-host and the device. However, since the device has already been allocated,
-@code{cublasCreate()} will only initialize the CUBLAS library and allocate
-the appropriate hardware resources on the host. The context that was created
-as part of the OpenACC initialization is shared with the CUBLAS library,
-similarly to the first use case.
-
-@smallexample
- dev = 0;
-
- acc_set_device_num(dev, acc_device_nvidia);
-
- /* Copy the first set to the device */
- d_X = acc_copyin(&h_X[0], N * sizeof (float));
- if (d_X == NULL)
- @{
- fprintf(stderr, "copyin error h_X\n");
- exit(EXIT_FAILURE);
- @}
-
- /* Copy the second set to the device */
- d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
- if (d_Y == NULL)
- @{
- fprintf(stderr, "copyin error h_Y1\n");
- exit(EXIT_FAILURE);
- @}
-
- /* Create the handle */
- s = cublasCreate(&h);
- if (s != CUBLAS_STATUS_SUCCESS)
- @{
- fprintf(stderr, "cublasCreate failed %d\n", s);
- exit(EXIT_FAILURE);
- @}
-
- /* Perform saxpy using CUBLAS library function */
- s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
- if (s != CUBLAS_STATUS_SUCCESS)
- @{
- fprintf(stderr, "cublasSaxpy failed %d\n", s);
- exit(EXIT_FAILURE);
- @}
-
- /* Copy the results from the device */
- acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
-
-@end smallexample
-@center Use Case 2
-
-@section OpenACC library and environment variables
-
-There are two environment variables associated with the OpenACC library
-that may be used to control the device type and device number:
-@env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}, respectively. These two
-environment variables can be used as an alternative to calling
-@code{acc_set_device_num()}. As seen in the second use case, the device
-type and device number were specified using @code{acc_set_device_num()}.
-If, however, the aforementioned environment variables were set, then the
-call to @code{acc_set_device_num()} would not be required.
-
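As a rough illustration of the two variables described above, the following C sketch shows how an application might inspect @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} itself, for example to log which device was requested. The helper names `requested_device_type` and `requested_device_num` are hypothetical and are not part of libgomp; the parsing rules (decimal, non-negative, fallback on malformed input) are assumptions made for this sketch and do not describe how the library parses the variables.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: return the value of ACC_DEVICE_TYPE, or
   FALLBACK when the variable is unset or empty.  */
const char *
requested_device_type (const char *fallback)
{
  const char *s = getenv ("ACC_DEVICE_TYPE");
  return (s != NULL && *s != '\0') ? s : fallback;
}

/* Hypothetical helper: return ACC_DEVICE_NUM parsed as a non-negative
   decimal integer, or FALLBACK when the variable is unset, empty,
   malformed, or out of range.  */
int
requested_device_num (int fallback)
{
  const char *s = getenv ("ACC_DEVICE_NUM");
  char *end;
  long v;

  if (s == NULL || *s == '\0')
    return fallback;

  errno = 0;
  v = strtol (s, &end, 10);
  if (errno != 0 || *end != '\0' || v < 0 || v > INT_MAX)
    return fallback;
  return (int) v;
}
```

Running a program with `ACC_DEVICE_NUM=1` set in the environment would make `requested_device_num (0)` return 1, while an unset or malformed value yields the fallback.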
-
-The use of the environment variables is only relevant when an OpenACC function
-is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
-is called prior to a call to an OpenACC function, then you must call
-@code{acc_set_device_num()}@footnote{More complete information
-about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
-sections 4.1 and 4.2 of the @uref{https://www.openacc.org, OpenACC}
-Application Programming Interface, Version 2.6.}.
-
-
-
-@c ---------------------------------------------------------------------
-@c OpenACC Profiling Interface
-@c ---------------------------------------------------------------------
-
-@node OpenACC Profiling Interface
-@chapter OpenACC Profiling Interface
-
-@section Implementation Status and Implementation-Defined Behavior
-
-We're implementing the OpenACC Profiling Interface as defined by the
-OpenACC 2.6 specification. We're clarifying some aspects here as
-@emph{implementation-defined behavior}, while they're still under
-discussion within the OpenACC Technical Committee.
-
-This implementation is tuned to keep the performance impact as low as
-possible for the (very common) case that the Profiling Interface is
-not enabled. This is relevant, as the Profiling Interface affects all
-the @emph{hot} code paths (in the target code, not in the offloaded
-code). Users of the OpenACC Profiling Interface can be expected to
-understand that performance will be impacted to some degree once the
-Profiling Interface has been enabled: for example, because of the
-@emph{runtime} (libgomp) calling into a third-party @emph{library} for
-every event that has been registered.
-
-We're not yet accounting for the fact that @cite{OpenACC events may
-occur during event processing}.
-We just handle one case specially, as required by CUDA 9.0
-@command{nvprof}, that @code{acc_get_device_type}
-(@ref{acc_get_device_type}) may be called from
-@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
-callbacks.
-
-We're not yet implementing initialization via a
-@code{acc_register_library} function that is either statically linked
-in, or dynamically via @env{LD_PRELOAD}.
-Initialization via @code{acc_register_library} functions dynamically
-loaded via the @env{ACC_PROFLIB} environment variable does work, as
-does directly calling @code{acc_prof_register},
-@code{acc_prof_unregister}, @code{acc_prof_lookup}.
-
-As currently there are no inquiry functions defined, calls to
-@code{acc_prof_lookup} will always return @code{NULL}.
-
-There aren't separate @emph{start}, @emph{stop} events defined for the
-event types @code{acc_ev_create}, @code{acc_ev_delete},
-@code{acc_ev_alloc}, @code{acc_ev_free}. It's not clear if these
-should be triggered before or after the actual device-specific call is
-made. We trigger them after.
-
-Remarks about data provided to callbacks:
-
-@table @asis
-
-@item @code{acc_prof_info.event_type}
-It's not clear if for @emph{nested} event callbacks (for example,
-@code{acc_ev_enqueue_launch_start} as part of a parent compute
-construct), this should be set for the nested event
-(@code{acc_ev_enqueue_launch_start}), or if the value of the parent
-construct should remain (@code{acc_ev_compute_construct_start}). In
-this implementation, the value will generally correspond to the
-innermost nested event type.
-
-@item @code{acc_prof_info.device_type}
-@itemize
-
-@item
-For @code{acc_ev_compute_construct_start}, and in presence of an
-@code{if} clause with @emph{false} argument, this will still refer to
-the offloading device type.
-It's not clear if that's the expected behavior.
-
-@item
-Complementary to the item before, for
-@code{acc_ev_compute_construct_end}, this is set to
-@code{acc_device_host} in presence of an @code{if} clause with
-@emph{false} argument.
-It's not clear if that's the expected behavior.
-
-@end itemize
-
-@item @code{acc_prof_info.thread_id}
-Always @code{-1}; not yet implemented.
-
-@item @code{acc_prof_info.async}
-@itemize
-
-@item
-Not yet implemented correctly for
-@code{acc_ev_compute_construct_start}.
-
-@item
-In a compute construct, for host-fallback
-execution/@code{acc_device_host} it will always be
-@code{acc_async_sync}.
-It's not clear if that's the expected behavior.
-
-@item
-For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
-it will always be @code{acc_async_sync}.
-It's not clear if that's the expected behavior.
-
-@end itemize
-
-@item @code{acc_prof_info.async_queue}
-There is no @cite{limited number of asynchronous queues} in libgomp.
-This will always have the same value as @code{acc_prof_info.async}.
-
-@item @code{acc_prof_info.src_file}
-Always @code{NULL}; not yet implemented.
-
-@item @code{acc_prof_info.func_name}
-Always @code{NULL}; not yet implemented.
-
-@item @code{acc_prof_info.line_no}
-Always @code{-1}; not yet implemented.
-
-@item @code{acc_prof_info.end_line_no}
-Always @code{-1}; not yet implemented.
-
-@item @code{acc_prof_info.func_line_no}
-Always @code{-1}; not yet implemented.
-
-@item @code{acc_prof_info.func_end_line_no}
-Always @code{-1}; not yet implemented.
-
-@item @code{acc_event_info.event_type}, @code{acc_event_info.*.event_type}
-Relating to @code{acc_prof_info.event_type} discussed above, in this
-implementation, this will always be the same value as
-@code{acc_prof_info.event_type}.
-
-@item @code{acc_event_info.*.parent_construct}
-@itemize
-
-@item
-Will be @code{acc_construct_parallel} for all OpenACC compute
-constructs as well as many OpenACC Runtime API calls; should be the
-one matching the actual construct, or
-@code{acc_construct_runtime_api}, respectively.
-
-@item
-Will be @code{acc_construct_enter_data} or
-@code{acc_construct_exit_data} when processing variable mappings
-specified in OpenACC @emph{declare} directives; should be
-@code{acc_construct_declare}.
-
-@item
-For implicit @code{acc_ev_device_init_start},
-@code{acc_ev_device_init_end}, and explicit as well as implicit
-@code{acc_ev_alloc}, @code{acc_ev_free},
-@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
-@code{acc_ev_enqueue_download_start}, and
-@code{acc_ev_enqueue_download_end}, will be
-@code{acc_construct_parallel}; should reflect the real parent
-construct.
-
-@end itemize
-
-@item @code{acc_event_info.*.implicit}
-For @code{acc_ev_alloc}, @code{acc_ev_free},
-@code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
-@code{acc_ev_enqueue_download_start}, and
-@code{acc_ev_enqueue_download_end}, this currently will be @code{1}
-also for explicit usage.
-
-@item @code{acc_event_info.data_event.var_name}
-Always @code{NULL}; not yet implemented.
-
-@item @code{acc_event_info.data_event.host_ptr}
-For @code{acc_ev_alloc}, and @code{acc_ev_free}, this is always
-@code{NULL}.
-
-@item @code{typedef union acc_api_info}
-@dots{} as printed in @cite{5.2.3. Third Argument: API-Specific
-Information}. This should obviously be @code{typedef @emph{struct}
-acc_api_info}.
-
-@item @code{acc_api_info.device_api}
-Possibly not yet implemented correctly for
-@code{acc_ev_compute_construct_start},
-@code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}:
-will always be @code{acc_device_api_none} for these event types.
-For @code{acc_ev_enter_data_start}, it will be
-@code{acc_device_api_none} in some cases.
-
-@item @code{acc_api_info.device_type}
-Always the same as @code{acc_prof_info.device_type}.
-
-@item @code{acc_api_info.vendor}
-Always @code{-1}; not yet implemented.
-
-@item @code{acc_api_info.device_handle}
-Always @code{NULL}; not yet implemented.
-
-@item @code{acc_api_info.context_handle}
-Always @code{NULL}; not yet implemented.
-
-@item @code{acc_api_info.async_handle}
-Always @code{NULL}; not yet implemented.
-
-@end table
-
-Remarks about certain event types:
-
-@table @asis
-
-@item @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
-@itemize
-
-@item
-@c See 'DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT' in
-@c 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c',
-@c 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
-When a compute construct triggers implicit
-@code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
-events, they currently aren't @emph{nested within} the corresponding
-@code{acc_ev_compute_construct_start} and
-@code{acc_ev_compute_construct_end}, but they're currently observed
-@emph{before} @code{acc_ev_compute_construct_start}.
-It's not clear what to do: the standard asks us to provide a lot of
-details to the @code{acc_ev_compute_construct_start} callback, without
-(implicitly) initializing a device before?
-
-@item
-Callbacks for these event types will not be invoked for calls to the
-@code{acc_set_device_type} and @code{acc_set_device_num} functions.
-It's not clear if they should be.
-
-@end itemize
-
-@item @code{acc_ev_enter_data_start}, @code{acc_ev_enter_data_end}, @code{acc_ev_exit_data_start}, @code{acc_ev_exit_data_end}
-@itemize
-
-@item
-Callbacks for these event types will also be invoked for OpenACC
-@emph{host_data} constructs.
-It's not clear if they should be.
-
-@item
-Callbacks for these event types will also be invoked when processing
-variable mappings specified in OpenACC @emph{declare} directives.
-It's not clear if they should be.
-
-@end itemize
-
-@end table
-
-Callbacks for the following event types will be invoked, but dispatch
-and information provided therein has not yet been thoroughly reviewed:
-
-@itemize
-@item @code{acc_ev_alloc}
-@item @code{acc_ev_free}
-@item @code{acc_ev_update_start}, @code{acc_ev_update_end}
-@item @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end}
-@item @code{acc_ev_enqueue_download_start}, @code{acc_ev_enqueue_download_end}
-@end itemize
-
-During device initialization, and finalization, respectively,
-callbacks for the following event types will not yet be invoked:
-
-@itemize
-@item @code{acc_ev_alloc}
-@item @code{acc_ev_free}
-@end itemize
-
-Callbacks for the following event types have not yet been implemented,
-so currently won't be invoked:
-
-@itemize
-@item @code{acc_ev_device_shutdown_start}, @code{acc_ev_device_shutdown_end}
-@item @code{acc_ev_runtime_shutdown}
-@item @code{acc_ev_create}, @code{acc_ev_delete}
-@item @code{acc_ev_wait_start}, @code{acc_ev_wait_end}
-@end itemize
-
-For the following runtime library functions, not all expected
-callbacks will be invoked (mostly concerning implicit device
-initialization):
-
-@itemize
-@item @code{acc_get_num_devices}
-@item @code{acc_set_device_type}
-@item @code{acc_get_device_type}
-@item @code{acc_set_device_num}
-@item @code{acc_get_device_num}
-@item @code{acc_init}
-@item @code{acc_shutdown}
-@end itemize
-
-Aside from implicit device initialization, for the following runtime
-library functions, no callbacks will be invoked for shared-memory
-offloading devices (it's not clear if they should be):
-
-@itemize
-@item @code{acc_malloc}
-@item @code{acc_free}
-@item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
-@item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
-@item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
-@item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
-@item @code{acc_update_device}, @code{acc_update_device_async}
-@item @code{acc_update_self}, @code{acc_update_self_async}
-@item @code{acc_map_data}, @code{acc_unmap_data}
-@item @code{acc_memcpy_to_device}, @code{acc_memcpy_to_device_async}
-@item @code{acc_memcpy_from_device}, @code{acc_memcpy_from_device_async}
-@end itemize
-
-@c ---------------------------------------------------------------------
-@c OpenMP-Implementation Specifics
-@c ---------------------------------------------------------------------
-
-@node OpenMP-Implementation Specifics
-@chapter OpenMP-Implementation Specifics
-
-@menu
-* OpenMP Context Selectors::
-* Memory allocation with libmemkind::
-@end menu
-
-@node OpenMP Context Selectors
-@section OpenMP Context Selectors
-
-@code{vendor} is always @code{gnu}. References are to the GCC manual.
-
-@multitable @columnfractions .60 .10 .25
-@headitem @code{arch} @tab @code{kind} @tab @code{isa}
-@item @code{x86}, @code{x86_64}, @code{i386}, @code{i486},
- @code{i586}, @code{i686}, @code{ia32}
- @tab @code{host}
- @tab See @code{-m...} flags in ``x86 Options'' (without @code{-m})
-@item @code{amdgcn}, @code{gcn}
- @tab @code{gpu}
- @tab See @code{-march=} in ``AMD GCN Options''
-@item @code{nvptx}
- @tab @code{gpu}
- @tab See @code{-march=} in ``Nvidia PTX Options''
-@end multitable
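
To illustrate how the @code{arch} selector from the table above can be used, here is a minimal C sketch of an OpenMP @code{declare variant} directive. The function names `dot` and `dot_x86_64` are hypothetical. The sketch assumes compilation with @option{-fopenmp} on an x86_64 host; without that option the pragma is ignored and the base function is called. Both versions compute the same result, so the output does not depend on which one is selected.

```c
#include <stddef.h>

/* Hypothetical variant intended for x86_64 hosts.  A real variant
   would typically contain ISA-specific code; this one is a plain
   loop so that the sketch stays portable.  */
float
dot_x86_64 (const float *x, const float *y, size_t n)
{
  float sum = 0.0f;
  for (size_t i = 0; i < n; i++)
    sum += x[i] * y[i];
  return sum;
}

/* Base version.  When compiled with OpenMP support and the device
   architecture matches x86_64, calls to dot() are redirected to
   dot_x86_64().  */
#pragma omp declare variant (dot_x86_64) match (device = {arch (x86_64)})
float
dot (const float *x, const float *y, size_t n)
{
  float sum = 0.0f;
  for (size_t i = 0; i < n; i++)
    sum += x[i] * y[i];
  return sum;
}
```

Callers always write `dot (x, y, n)`; the choice between the base function and the variant is made by the compiler from the context selector.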
-
-@node Memory allocation with libmemkind
-@section Memory allocation with libmemkind
-
-On Linux systems, where the @uref{https://github.com/memkind/memkind, memkind
-library} (@code{libmemkind.so.0}) is available at runtime, it is used when
-creating memory allocators requesting
-
-@itemize
-@item the memory space @code{omp_high_bw_mem_space}
-@item the memory space @code{omp_large_cap_mem_space}
-@item the partition trait @code{omp_atv_interleaved}
-@end itemize
-
-
-@c ---------------------------------------------------------------------
-@c Offload-Target Specifics
-@c ---------------------------------------------------------------------
-
-@node Offload-Target Specifics
-@chapter Offload-Target Specifics
-
-The following sections present notes on the offload-target specifics.
-
-@menu
-* AMD Radeon::
-* nvptx::
-@end menu
-
-@node AMD Radeon
-@section AMD Radeon (GCN)
-
-On the hardware side, there is the hierarchy (fine to coarse):
-@itemize
-@item work item (thread)
-@item wavefront
-@item work group
-@item compute unit (CU)
-@end itemize
-
-All OpenMP and OpenACC levels are used, i.e.
-@itemize
-@item OpenMP's simd and OpenACC's vector map to work items (thread)
-@item OpenMP's threads (``parallel'') and OpenACC's workers map
- to wavefronts
-@item OpenMP's teams and OpenACC's gang use a threadpool with the
- size of the number of teams or gangs, respectively.
-@end itemize
-
-The sizes used are:
-@itemize
-@item Number of teams is the specified @code{num_teams} (OpenMP) or
- @code{num_gangs} (OpenACC) or otherwise the number of CU
-@item Number of wavefronts is 4 for gfx900 and 16 otherwise;
- @code{num_threads} (OpenMP) and @code{num_workers} (OpenACC)
- override this if smaller.
-@item The wavefront has 102 scalars and 64 vectors
-@item Number of workitems is always 64
-@item The hardware permits maximally 40 workgroups/CU and
- 16 wavefronts/workgroup up to a limit of 40 wavefronts in total per CU.
-@item 80 scalar registers and 24 vector registers in non-kernel functions
- (the chosen procedure-calling API).
-@item For the kernel itself: as many as register pressure demands (number of
- teams and number of threads, scaled down if registers are exhausted)
-@end itemize
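
The defaults listed above can be sketched as a small C helper. This is only an illustration of the stated rules, not libgomp code: the struct, the function names, and the idea of passing the architecture as a string are assumptions made for the example, and a value of 0 stands for "clause not specified".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the launch sizes described above.  */
struct gcn_launch
{
  unsigned teams;      /* work groups in the team pool */
  unsigned wavefronts; /* wavefronts per work group */
  unsigned workitems;  /* work items per wavefront */
};

/* Default wavefront count: 4 for gfx900, 16 otherwise.  */
unsigned
gcn_default_wavefronts (const char *arch)
{
  return strcmp (arch, "gfx900") == 0 ? 4 : 16;
}

/* Illustrative only: apply the defaults and overrides from the list.
   A request of 0 means the corresponding clause was not given.  */
struct gcn_launch
gcn_launch_dims (const char *arch, unsigned num_cu,
                 unsigned requested_teams, unsigned requested_threads)
{
  struct gcn_launch d;

  assert (num_cu > 0);
  /* num_teams/num_gangs if specified, otherwise the number of CUs.  */
  d.teams = requested_teams ? requested_teams : num_cu;
  /* num_threads/num_workers only override the default if smaller.  */
  d.wavefronts = gcn_default_wavefronts (arch);
  if (requested_threads && requested_threads < d.wavefronts)
    d.wavefronts = requested_threads;
  /* The number of work items is always 64.  */
  d.workitems = 64;
  return d;
}
```

For instance, under these assumptions a gfx900 device with 64 CUs and no clauses would get 64 teams of 4 wavefronts each, while a request for 32 threads on another device would still be capped at 16 wavefronts.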
-
-Implementation remark:
-@itemize
-@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
- using the C library @code{printf} functions and the Fortran
- @code{print}/@code{write} statements.
-@end itemize
-
-
-
-@node nvptx
-@section nvptx
-
-On the hardware side, there is the hierarchy (fine to coarse):
-@itemize
-@item thread
-@item warp
-@item thread block
-@item streaming multiprocessor
-@end itemize
-
-All OpenMP and OpenACC levels are used, i.e.
-@itemize
-@item OpenMP's simd and OpenACC's vector map to threads
-@item OpenMP's threads (``parallel'') and OpenACC's workers map to warps
-@item OpenMP's teams and OpenACC's gang use a threadpool with the
- size of the number of teams or gangs, respectively.
-@end itemize
-
-The sizes used are:
-@itemize
-@item The @code{warp_size} is always 32
-@item CUDA kernel launched: @code{dim=@{#teams,1,1@}, blocks=@{#threads,warp_size,1@}}.
-@end itemize
-
-Additional information can be obtained by setting the environment variable
-@code{GOMP_DEBUG=1} (very verbose; grep for @code{kernel.*launch} for launch
-parameters).
-
-GCC generates generic PTX ISA code, which is just-in-time compiled by CUDA.
-CUDA caches the JIT result in the user's directory (see CUDA documentation; can
-be tuned by the environment variables @code{CUDA_CACHE_@{DISABLE,MAXSIZE,PATH@}}).
-
-Note: While PTX ISA is generic, the @code{-mptx=} and @code{-march=} command-line
-options still affect the used PTX ISA code and, thus, the requirements on
-CUDA version and hardware.
-
-Implementation remarks:
-@itemize
-@item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
- using the C library @code{printf} functions. Note that the Fortran
- @code{print}/@code{write} statements are not supported, yet.
-@item Compiling OpenMP code that contains @code{requires reverse_offload}
- requires at least @code{-march=sm_35}; compiling for @code{-march=sm_30}
- is not supported.
-@end itemize
-
-
-@c ---------------------------------------------------------------------
-@c The libgomp ABI
-@c ---------------------------------------------------------------------
-
-@node The libgomp ABI
-@chapter The libgomp ABI
-
-The following sections present notes on the external ABI as
-presented by libgomp. Only maintainers should need them.
-
-@menu
-* Implementing MASTER construct::
-* Implementing CRITICAL construct::
-* Implementing ATOMIC construct::
-* Implementing FLUSH construct::
-* Implementing BARRIER construct::
-* Implementing THREADPRIVATE construct::
-* Implementing PRIVATE clause::
-* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
-* Implementing REDUCTION clause::
-* Implementing PARALLEL construct::
-* Implementing FOR construct::
-* Implementing ORDERED construct::
-* Implementing SECTIONS construct::
-* Implementing SINGLE construct::
-* Implementing OpenACC's PARALLEL construct::
-@end menu
-
-
-@node Implementing MASTER construct
-@section Implementing MASTER construct
-
-@smallexample
-if (omp_get_thread_num () == 0)
- block
-@end smallexample
-
-Alternately, we generate two copies of the parallel subfunction
-and only include this in the version run by the primary thread.
-Surely this is not worthwhile though...
-
-
-
-@node Implementing CRITICAL construct
-@section Implementing CRITICAL construct
-
-Without a specified name,
-
-@smallexample
- void GOMP_critical_start (void);
- void GOMP_critical_end (void);
-@end smallexample
-
-so that we don't get COPY relocations from libgomp to the main
-application.
-
-With a specified name, use omp_set_lock and omp_unset_lock with
-name being transformed into a variable declared like
-
-@smallexample
- omp_lock_t gomp_critical_user_<name> __attribute__((common))
-@end smallexample
-
-Ideally the ABI would specify that all zero is a valid unlocked
-state, and so we wouldn't need to initialize this at
-startup.
-
-
-
-@node Implementing ATOMIC construct
-@section Implementing ATOMIC construct
-
-The target should implement the @code{__sync} builtins.
-
-Failing that we could add
-
-@smallexample
- void GOMP_atomic_enter (void)
- void GOMP_atomic_exit (void)
-@end smallexample
-
-which reuses the regular lock code, but with yet another lock
-object private to the library.
-
-
-
-@node Implementing FLUSH construct
-@section Implementing FLUSH construct
-
-Expands to the @code{__sync_synchronize} builtin.
-
-
-
-@node Implementing BARRIER construct
-@section Implementing BARRIER construct
-
-@smallexample
- void GOMP_barrier (void)
-@end smallexample
-
-
-@node Implementing THREADPRIVATE construct
-@section Implementing THREADPRIVATE construct
-
-In _most_ cases we can map this directly to @code{__thread}. Except
-that OMP allows constructors for C++ objects. We can either
-refuse to support this (how often is it used?) or we can
-implement something akin to .ctors.
-
-Even more ideally, this ctor feature is handled by extensions
-to the main pthreads library. Failing that, we can have a set
-of entry points to register ctor functions to be called.
-
-
-
-@node Implementing PRIVATE clause
-@section Implementing PRIVATE clause
-
-In association with a PARALLEL, or within the lexical extent
-of a PARALLEL block, the variable becomes a local variable in
-the parallel subfunction.
-
-In association with FOR or SECTIONS blocks, create a new
-automatic variable within the current function. This preserves
-the semantic of new variable creation.
-
-
-
-@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
-@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
-
-This seems simple enough for PARALLEL blocks. Create a private
-struct for communicating between the parent and subfunction.
-In the parent, copy in values for scalar and "small" structs;
-copy in addresses for other TREE_ADDRESSABLE types. In the
-subfunction, copy the value into the local variable.
-
-It is not clear what to do with bare FOR or SECTION blocks.
-The only thing I can figure is that we do something like:
-
-@smallexample
-#pragma omp for firstprivate(x) lastprivate(y)
-for (int i = 0; i < n; ++i)
- body;
-@end smallexample
-
-which becomes
-
-@smallexample
-@{
- int x = x, y;
-
- // for stuff
-
- if (i == n)
- y = y;
-@}
-@end smallexample
-
-where the "x=x" and "y=y" assignments actually have different
-uids for the two variables, i.e. not something you could write
-directly in C. Presumably this only makes sense if the "outer"
-x and y are global variables.
-
-COPYPRIVATE would work the same way, except the structure
-broadcast would have to happen via SINGLE machinery instead.
-
-
-
-@node Implementing REDUCTION clause
-@section Implementing REDUCTION clause
-
-The private struct mentioned in the previous section should have
-a pointer to an array of the type of the variable, indexed by the
-thread's @var{team_id}. The thread stores its final value into the
-array, and after the barrier, the primary thread iterates over the
-array to collect the values.
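
A minimal sketch of this scheme in C, run sequentially so that it stays self-contained: the loop in @code{reduce_sum} merely stands in for the parallel region, and all names (@code{reduction_data}, @code{thread_body}, @code{combine}) as well as the fixed team size are assumptions made for illustration rather than anything libgomp defines.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_THREADS 4 /* assumed fixed team size for the sketch */

/* Hypothetical shared struct: a pointer to one slot per thread.  */
struct reduction_data
{
  long *partial;
};

/* What each thread would do: reduce its share of the input into a
   local value, then store the final value at its own team id.  */
void
thread_body (struct reduction_data *d, unsigned team_id,
             const long *v, size_t n)
{
  long local = 0;
  for (size_t i = team_id; i < n; i += NUM_THREADS)
    local += v[i];
  d->partial[team_id] = local;
}

/* What the primary thread would do after the barrier.  */
long
combine (const struct reduction_data *d)
{
  long total = 0;
  for (unsigned t = 0; t < NUM_THREADS; t++)
    total += d->partial[t];
  return total;
}

long
reduce_sum (const long *v, size_t n)
{
  long partial[NUM_THREADS] = { 0 };
  struct reduction_data d = { partial };

  assert (v != NULL || n == 0);
  /* Sequential stand-in for the team executing thread_body.  */
  for (unsigned t = 0; t < NUM_THREADS; t++)
    thread_body (&d, t, v, n);
  /* A real implementation would place the barrier here.  */
  return combine (&d);
}
```

Because every thread writes only its own slot, no locking is needed until the primary thread reads the array after the barrier.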
-
-
-@node Implementing PARALLEL construct
-@section Implementing PARALLEL construct
-
-@smallexample
- #pragma omp parallel
- @{
- body;
- @}
-@end smallexample
-
-becomes
-
-@smallexample
- void subfunction (void *data)
- @{
- use data;
- body;
- @}
-
- setup data;
- GOMP_parallel_start (subfunction, &data, num_threads);
- subfunction (&data);
- GOMP_parallel_end ();
-@end smallexample
-
-@smallexample
- void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
-@end smallexample
-
-The @var{FN} argument is the subfunction to be run in parallel.
-
-The @var{DATA} argument is a pointer to a structure used to
-communicate data in and out of the subfunction, as discussed
-above with respect to FIRSTPRIVATE et al.
-
-The @var{NUM_THREADS} argument is 1 if an IF clause is present
-and false, or the value of the NUM_THREADS clause, if
-present, or 0.
-
-The function needs to create the appropriate number of
-threads and/or launch them from the dock. It needs to
-create the team structure and assign team ids.
-
-@smallexample
- void GOMP_parallel_end (void)
-@end smallexample
-
-Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
-
-
-
-@node Implementing FOR construct
-@section Implementing FOR construct
-
-@smallexample
- #pragma omp parallel for
- for (i = lb; i <= ub; i++)
- body;
-@end smallexample
-
-becomes
-
-@smallexample
- void subfunction (void *data)
- @{
- long _s0, _e0;
- while (GOMP_loop_static_next (&_s0, &_e0))
- @{
- long _e1 = _e0, i;
- for (i = _s0; i < _e1; i++)
- body;
- @}
- GOMP_loop_end_nowait ();
- @}
-
- GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
- subfunction (NULL);
- GOMP_parallel_end ();
-@end smallexample
-
-@smallexample
- #pragma omp for schedule(runtime)
- for (i = 0; i < n; i++)
- body;
-@end smallexample
-
-becomes
-
-@smallexample
- @{
- long i, _s0, _e0;
- if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
- do @{
- long _e1 = _e0;
- for (i = _s0; i < _e1; i++)
- body;
- @} while (GOMP_loop_runtime_next (&_s0, &_e0));
- GOMP_loop_end ();
- @}
-@end smallexample
-
-Note that while it looks like there is trickiness to propagating
-a non-constant STEP, there isn't really. We're explicitly allowed
-to evaluate it as many times as we want, and any variables involved
-should automatically be handled as PRIVATE or SHARED like any other
-variables. So the expression should remain evaluable in the
-subfunction. We can also pull it into a local variable if we like,
-but since it's supposed to remain unchanged, we can also not if we like.
-
-If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
-able to get away with no work-sharing context at all, since we can
-simply perform the arithmetic directly in each thread to divide up
-the iterations. Which would mean that we wouldn't need to call any
-of these routines.
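
The arithmetic in question could look roughly like the following. This is a sketch of one possible block distribution for a unit-step loop, not the division libgomp actually performs; the function name and parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: compute the half-open range [*start, *end)
   that thread TID of NTHREADS executes for the iteration space
   [lb, ub) with step 1.  The first (n % nthreads) threads get one
   extra iteration, so the chunks are contiguous and cover every
   iteration exactly once.  */
void
static_chunk (long lb, long ub, unsigned tid, unsigned nthreads,
              long *start, long *end)
{
  assert (nthreads > 0 && tid < nthreads);

  long nt = (long) nthreads;
  long id = (long) tid;
  long n = ub > lb ? ub - lb : 0; /* total iterations */
  long q = n / nt;                /* base chunk size */
  long r = n % nt;                /* leftover iterations */

  long first = id * q + (id < r ? id : r);
  long len = q + (id < r ? 1 : 0);

  *start = lb + first;
  *end = *start + len;
}
```

Each thread can evaluate this from its own thread number and the team size alone, which is why no shared work-sharing state would be required in this case.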
-
-There are separate routines for handling loops with an ORDERED
-clause. Bookkeeping for that is non-trivial...
-
-
-
-@node Implementing ORDERED construct
-@section Implementing ORDERED construct
-
-@smallexample
- void GOMP_ordered_start (void)
- void GOMP_ordered_end (void)
-@end smallexample
-
-
-
-@node Implementing SECTIONS construct
-@section Implementing SECTIONS construct
-
-A block like
-
-@smallexample
- #pragma omp sections
- @{
- #pragma omp section
- stmt1;
- #pragma omp section
- stmt2;
- #pragma omp section
- stmt3;
- @}
-@end smallexample
-
-becomes
-
-@smallexample
- for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
- switch (i)
- @{
- case 1:
- stmt1;
- break;
- case 2:
- stmt2;
- break;
- case 3:
- stmt3;
- break;
- @}
- GOMP_barrier ();
-@end smallexample
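
To make the control flow of that loop concrete, here is a single-threaded sketch. The @code{stub_sections_start} and @code{stub_sections_next} functions are stand-ins invented for this example; they only mimic the behavior the expansion relies on (hand out section numbers 1..count, then 0), and say nothing about how libgomp distributes sections among threads.

```c
#include <assert.h>

/* Stand-in state: with a single thread, sections are simply handed
   out in order.  */
static unsigned sections_count;
static unsigned sections_next_id;

/* Return the next unclaimed section number, or 0 when none remain.  */
unsigned
stub_sections_next (void)
{
  return sections_next_id <= sections_count ? sections_next_id++ : 0;
}

/* Begin a sections region with COUNT sections and claim the first.  */
unsigned
stub_sections_start (unsigned count)
{
  sections_count = count;
  sections_next_id = 1;
  return stub_sections_next ();
}

/* The expanded loop from above, recording which sections ran as the
   decimal digits of the return value.  */
unsigned
run_sections (void)
{
  unsigned trace = 0;
  unsigned i;

  for (i = stub_sections_start (3); i != 0; i = stub_sections_next ())
    switch (i)
      {
      case 1:
        trace = trace * 10 + 1; /* stmt1 */
        break;
      case 2:
        trace = trace * 10 + 2; /* stmt2 */
        break;
      case 3:
        trace = trace * 10 + 3; /* stmt3 */
        break;
      }
  return trace;
}
```

With several threads, each would run the same loop and receive a different subset of the section numbers, which is why the expansion ends with a barrier.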
-
-
-@node Implementing SINGLE construct
-@section Implementing SINGLE construct
-
-A block like
-
-@smallexample
- #pragma omp single
- @{
- body;
- @}
-@end smallexample
-
-becomes
-
-@smallexample
- if (GOMP_single_start ())
- body;
- GOMP_barrier ();
-@end smallexample
-
-while
-
-@smallexample
- #pragma omp single copyprivate(x)
- body;
-@end smallexample
-
-becomes
-
-@smallexample
- datap = GOMP_single_copy_start ();
- if (datap == NULL)
- @{
- body;
- data.x = x;
- GOMP_single_copy_end (&data);
- @}
- else
- x = datap->x;
- GOMP_barrier ();
-@end smallexample
-
-
-
-@node Implementing OpenACC's PARALLEL construct
-@section Implementing OpenACC's PARALLEL construct
-
-@smallexample
- void GOACC_parallel ()
-@end smallexample
-
-
-
-@c ---------------------------------------------------------------------
-@c Reporting Bugs
-@c ---------------------------------------------------------------------
-
-@node Reporting Bugs
-@chapter Reporting Bugs
-
-Bugs in the GNU Offloading and Multi Processing Runtime Library should
-be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}. Please add
-"openacc", or "openmp", or both to the keywords field in the bug
-report, as appropriate.
-
-
-
-@c ---------------------------------------------------------------------
-@c GNU General Public License
-@c ---------------------------------------------------------------------
-
-@include gpl_v3.texi
-
-
-
-@c ---------------------------------------------------------------------
-@c GNU Free Documentation License
-@c ---------------------------------------------------------------------
-
-@include fdl.texi
-
-
-
-@c ---------------------------------------------------------------------
-@c Funding Free Software
-@c ---------------------------------------------------------------------
-
-@include funding.texi
-
-@c ---------------------------------------------------------------------
-@c Index
-@c ---------------------------------------------------------------------
-
-@node Library Index
-@unnumbered Library Index
-
-@printindex cp
-
-@bye