88
99# OpenMP dialect: Fortran descriptor type mapping for offload
1010
11- The descriptor mapping for OpenMP currently works differently to the planned direction for OpenACC, however,
12- it is possible and would likely be ideal to align the method with OpenACC in the future. However, at least
13- currently the OpenMP specification is less descriptive and has less stringent rules around descriptor based
14- types so does not require as complex a set of descriptor management rules (although, in certain cases
15- for the interim adopting OpenACC's rules where it makes sense could be useful).
16-
1711The initial method for mapping Fortran types tied to descriptors for OpenMP offloading is to treat these types
1812as a special case of OpenMP record type (C/C++ structure/class, Fortran derived type etc.) mapping as far as the
1913runtime is concerned. Where the box (descriptor information) is the holding container and the underlying
2014data pointer is contained within the container, and we must generate explicit maps for both the pointer member and
21- the container. As an example, a small C++ program that is equivalent to the concept described:
15+ the container. As an example, a small C++ program that is equivalent to the concept described, with the
16+ `mock_descriptor` class being representative of the class utilised for descriptors in Clang:
2217
2318``` C++
2419struct mock_descriptor {
@@ -49,15 +44,15 @@ Currently, Flang will lower these descriptor types in the OpenMP lowering (lower
4944to all other map types, generating an omp.MapInfoOp containing relevant information required for lowering
5045the OpenMP dialect to LLVM-IR during the final stages of the MLIR lowering. However, after
5146the lowering to FIR/HLFIR has been performed an OpenMP dialect specific pass for Fortran,
52- OMPDescriptorMapInfoGenPass (Optimizer/OMPDescriptorMapInfoGen.cpp) will expand the
53- omp.MapInfoOp's containing descriptors (which currently will be a BoxType or BoxAddrOp) into multiple
47+ `OMPDescriptorMapInfoGenPass` (Optimizer/OMPDescriptorMapInfoGen.cpp) will expand the
48+ `omp.MapInfoOp`s containing descriptors (which currently will be a `BoxType` or `BoxAddrOp`) into multiple
5449mappings, with one extra per pointer member in the descriptor that is supported on top of the original
5550descriptor map operation. These pointer members are linked to the parent descriptor by adding them to
5651the member field of the original descriptor map operation, they are then inserted into the relevant map
57- owning operation's (omp.TargetOp, omp.DataOp etc.) map operand list and in cases where the owning operation
58- is IsolatedFromAbove, it also inserts them as BlockArgs to canonicalize the mappings and simplify lowering.
52+ owning operation's (`omp.TargetOp`, `omp.DataOp` etc.) map operand list and in cases where the owning operation
53+ is `IsolatedFromAbove`, it also inserts them as `BlockArgs` to canonicalize the mappings and simplify lowering.
5954
60- An example transformation by the OMPDescriptorMapInfoGenPass:
55+ An example transformation by the `OMPDescriptorMapInfoGenPass`:
6156
6257```
6358
@@ -83,13 +78,48 @@ omp.target map_entries(%13 -> %arg1, %14 -> %arg2, %15 -> %arg3 : !fir.llvm_ptr<
8378
8479In later stages of the compilation flow when the OpenMP dialect is being lowered to LLVM-IR these descriptor
8580mappings are treated as if they were structure mappings with explicit member maps on the same directive as
86- their parent was mapped.
87-
88- This method is generic in the sense that the OpenMP diaelct doesn't need to understand that it is mapping a
81+ their parent was mapped.
82+
83+ This implementation utilises the member field of the `map_info` operation to indicate that the pointer
84+ descriptor elements which are contained in their own `map_info` operation are part of their respective
85+ parent descriptor. This allows the descriptor containing the descriptor pointer member to be mapped
86+ as a composite entity during lowering, with the correct mappings being generated to tie them together,
87+ allowing the OpenMP runtime to map them correctly, attaching the pointer member to the parent
88+ structure so it can be accessed during execution. If we opt to not treat the descriptor as a single
89+ entity we have issues with the member being correctly attached to the parent and being accessible;
90+ this can cause runtime segfaults on the device when we try to access the data through the parent. It
91+ may be possible to avoid this member mapping, treating them as individual entities, but treating a
92+ composite mapping as an individual mapping could lead to problems such as the runtime taking
93+ liberties with the mapping it usually wouldn't if it knew they were linked. We would also have to
94+ be careful to maintain the correct order of mappings as we lower: if we misorder the maps, it'd be
95+ possible to overwrite already written data, e.g. if we write the descriptor data pointer first, and
96+ then the containing descriptor, we would overwrite the descriptor data pointer with the incorrect
97+ address.
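
The ordering hazard described above can be illustrated with a minimal host-only sketch. It is not how the OpenMP runtime or Flang implements mapping: the "device" is simulated with ordinary host memory, and the fields of `mock_descriptor`, the `MapOrder` enum and the `map_descriptor` function are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a descriptor: a data pointer plus some metadata.
// The field layout is an assumption made for this sketch only.
struct mock_descriptor {
  long *data;
  long size;
};

// Hypothetical names for the two orders in which the maps could be emitted.
enum class MapOrder { DescriptorThenPointer, PointerThenDescriptor };

// Simulates mapping a descriptor and its data to a "device" (plain host
// memory here). Returns true if the device-side descriptor ends up pointing
// at the device copy of the data.
bool map_descriptor(MapOrder order) {
  std::vector<long> host_data{1, 2, 3};
  std::vector<long> device_data(host_data); // simulated device copy of the data
  assert(host_data.data() != device_data.data());

  mock_descriptor host_desc{host_data.data(), 3};
  mock_descriptor device_desc{nullptr, 0};

  // Bitwise copy of the container, including its host data pointer.
  auto copy_descriptor = [&] {
    std::memcpy(&device_desc, &host_desc, sizeof(mock_descriptor));
  };
  // Pointer attachment: patch the device descriptor to refer to device data.
  auto attach_pointer = [&] { device_desc.data = device_data.data(); };

  if (order == MapOrder::DescriptorThenPointer) {
    copy_descriptor();
    attach_pointer();
  } else {
    attach_pointer();
    copy_descriptor(); // overwrites the attached pointer with the host address
  }
  return device_desc.data == device_data.data();
}
```

Calling `map_descriptor(MapOrder::DescriptorThenPointer)` leaves the simulated device descriptor pointing at device data, while `MapOrder::PointerThenDescriptor` leaves it holding the host address, which is the kind of stale pointer that would fault when dereferenced on a real device.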
98+
99+ This method is generic in the sense that the OpenMP dialect doesn't need to understand that it is mapping a
89100Fortran type containing a descriptor, it just thinks it's a record type from either Fortran or C++. However,
90101it is a little rigid in how the descriptor mappings are handled as there is no specialisation or possibility
91- to specialise the mappings for possible edge cases without poluting the dialect or lowering with further
92- knowledge of Fortran and the FIR dialect. In the case that this kind of specialisation is required or
93- desired then the methodology described by OpenACC which utilises runtime functions to handle specialised mappings
94- for dialects may be a more desirable approach to move towards. For the moment this method appears sufficient as
95- far as the OpenMP specification and current testing can show.
102+ to specialise the mappings for possible edge cases without polluting the dialect or lowering with further
103+ knowledge of Fortran and the FIR dialect.
104+
105+ # OpenMP dialect differences from OpenACC dialect
106+
107+ The descriptor mapping for OpenMP currently works differently to the planned direction for OpenACC; however,
108+ it is possible and would likely be ideal to align the method with OpenACC in the future.
109+
110+ Currently the OpenMP specification is less descriptive and has less stringent rules around descriptor based
111+ types so does not require as complex a set of descriptor management rules as OpenACC (although, in certain
112+ cases for the interim adopting OpenACC's rules where it makes sense could be useful). To handle the more
113+ complex descriptor mapping rules OpenACC has opted to utilise a more runtime oriented approach, where
114+ specialized runtime functions for handling descriptor mapping for OpenACC are created and these runtime
115+ function handles are attached to a special OpenACC dialect operation. When this operation is lowered it
116+ will lower to the attached OpenACC descriptor mapping runtime function. This sounds like it will work
117+ (no implementation yet) similarly to some of the existing HLFIR operations which optionally lower to
118+ Fortran runtime calls.
119+
120+ This methodology described by OpenACC which utilises runtime functions to handle specialised mappings allows
121+ more flexibility as a significant amount of the mapping logic can be moved into the runtime from the compiler.
122+ It also allows specialisation of the mapping for Fortran-specific types. This may be a desirable approach
123+ to take for OpenMP in the future, in particular if we find a need to specialise mapping further for
124+ descriptors or other Fortran types. However, for the moment the currently chosen implementation for OpenMP
125+ appears sufficient as far as the OpenMP specification and current testing can show.