Skip to content

Commit f4d56d6

Browse files
committed
[Flang][MLIR][OpenMP] Update PR based on recent reviewer comments
1 parent 1da9dd8 commit f4d56d6

21 files changed

+214
-327
lines changed

flang/docs/OpenMP-descriptor-management.md

Lines changed: 50 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -8,17 +8,12 @@
88

99
# OpenMP dialect: Fortran descriptor type mapping for offload
1010

11-
The descriptor mapping for OpenMP currently works differently to the planned direction for OpenACC, however,
12-
it is possible and would likely be ideal to align the method with OpenACC in the future. However, at least
13-
currently the OpenMP specification is less descriptive and has less stringent rules around descriptor based
14-
types so does not require as complex a set of descriptor management rules (although, in certain cases
15-
for the interim adopting OpenACC's rules where it makes sense could be useful).
16-
1711
The initial method for mapping Fortran types tied to descriptors for OpenMP offloading is to treat these types
1812
as a special case of OpenMP record type (C/C++ structure/class, Fortran derived type etc.) mapping as far as the
1913
runtime is concerned. Where the box (descriptor information) is the holding container and the underlying
2014
data pointer is contained within the container, and we must generate explicit maps for both the pointer member and
21-
the container. As an example, a small C++ program that is equivalent to the concept described:
15+
the container. As an example, a small C++ program that is equivalent to the concept described, with the
16+
`mock_descriptor` class being representative of the class utilised for descriptors in Clang:
2217

2318
```C++
2419
struct mock_descriptor {
@@ -49,15 +44,15 @@ Currently, Flang will lower these descriptor types in the OpenMP lowering (lower
4944
to all other map types, generating an omp.MapInfoOp containing relevant information required for lowering
5045
the OpenMP dialect to LLVM-IR during the final stages of the MLIR lowering. However, after
5146
the lowering to FIR/HLFIR has been performed an OpenMP dialect specific pass for Fortran,
52-
OMPDescriptorMapInfoGenPass (Optimizer/OMPDescriptorMapInfoGen.cpp) will expand the
53-
omp.MapInfoOp's containing descriptors (which currently will be a BoxType or BoxAddrOp) into multiple
47+
`OMPDescriptorMapInfoGenPass` (Optimizer/OMPDescriptorMapInfoGen.cpp) will expand the
48+
`omp.MapInfoOp`'s containing descriptors (which currently will be a `BoxType` or `BoxAddrOp`) into multiple
5449
mappings, with one extra per pointer member in the descriptor that is supported on top of the original
5550
descriptor map operation. These pointers members are linked to the parent descriptor by adding them to
5651
the member field of the original descriptor map operation, they are then inserted into the relevant map
57-
owning operation's (omp.TargetOp, omp.DataOp etc.) map operand list and in cases where the owning operation
58-
is IsolatedFromAbove, it also inserts them as BlockArgs to canonicalize the mappings and simplify lowering.
52+
owning operation's (`omp.TargetOp`, `omp.DataOp` etc.) map operand list and in cases where the owning operation
53+
is `IsolatedFromAbove`, it also inserts them as `BlockArgs` to canonicalize the mappings and simplify lowering.
5954
60-
An example transformation by the OMPDescriptorMapInfoGenPass:
55+
An example transformation by the `OMPDescriptorMapInfoGenPass`:
6156
6257
```
6358

@@ -83,13 +78,48 @@ omp.target map_entries(%13 -> %arg1, %14 -> %arg2, %15 -> %arg3 : !fir.llvm_ptr<
8378
8479
In later stages of the compilation flow when the OpenMP dialect is being lowered to LLVM-IR these descriptor
8580
mappings are treated as if they were structure mappings with explicit member maps on the same directive as
86-
their parent was mapped.
87-
88-
This method is generic in the sense that the OpenMP diaelct doesn't need to understand that it is mapping a
81+
their parent was mapped.
82+
83+
This implementation utilises the member field of the `map_info` operation to indicate that the pointer
84+
descriptor elements which are contained in their own `map_info` operation are part of their respective
85+
parent descriptor. This allows the descriptor containing the descriptor pointer member to be mapped
86+
as a composite entity during lowering, with the correct mappings being generated to tie them together,
87+
allowing the OpenMP runtime to map them correctly, attaching the pointer member to the parent
88+
structure so it can be accessed during execution. If we opt to not treat the descriptor as a single
89+
entity we have issues with the member being correctly attached to the parent and being accessible,
90+
this can cause runtime segfaults on the device when we try to access the data through the parent. It
91+
may be possible to avoid this member mapping, treating them as individual entities, but treating a
92+
composite mapping as an individual mapping could lead to problems such as the runtime taking
93+
liberties with the mapping it usually wouldn't if it knew they were linked, we would also have to
94+
be careful to maintian the correct order of mappings as we lower, if we misorder the maps, it'd be
95+
possible to overwrite already written data, e.g. if we write the descriptor data pointer first, and
96+
then the containing descriptor, we would overwrite the descriptor data pointer with the incorrect
97+
address.
98+
99+
This method is generic in the sense that the OpenMP dialect doesn't need to understand that it is mapping a
89100
Fortran type containing a descriptor, it just thinks it's a record type from either Fortran or C++. However,
90101
it is a little rigid in how the descriptor mappings are handled as there is no specialisation or possibility
91-
to specialise the mappings for possible edge cases without poluting the dialect or lowering with further
92-
knowledge of Fortran and the FIR dialect. In the case that this kind of specialisation is required or
93-
desired then the methodology described by OpenACC which utilises runtime functions to handle specialised mappings
94-
for dialects may be a more desirable approach to move towards. For the moment this method appears sufficient as
95-
far as the OpenMP specification and current testing can show.
102+
to specialise the mappings for possible edge cases without polluting the dialect or lowering with further
103+
knowledge of Fortran and the FIR dialect.
104+
105+
# OpenMP dialect differences from OpenACC dialect
106+
107+
The descriptor mapping for OpenMP currently works differently to the planned direction for OpenACC, however,
108+
it is possible and would likely be ideal to align the method with OpenACC in the future.
109+
110+
Currently the OpenMP specification is less descriptive and has less stringent rules around descriptor based
111+
types so does not require as complex a set of descriptor management rules as OpenACC (although, in certain
112+
cases for the interim adopting OpenACC's rules where it makes sense could be useful). To handle the more
113+
complex descriptor mapping rules OpenACC has opted to utilise a more runtime oriented approach, where
114+
specialized runtime functions for handling descriptor mapping for OpenACC are created and these runtime
115+
function handles are attatched to a special OpenACC dialect operation. When this operation is lowered it
116+
will lower to the attatched OpenACC descriptor mapping runtime function. This sounds like it will work
117+
(no implementation yet) similarly to some of the existing HLFIR operations which optionally lower to
118+
Fortran runtime calls.
119+
120+
This methodology described by OpenACC which utilises runtime functions to handle specialised mappings allows
121+
more flexibility as a significant amount of the mapping logic can be moved into the runtime from the compiler.
122+
It also allows specialisation of the mapping for fortran specific types. This may be a desireable approach
123+
to take for OpenMP in the future, in particular if we find need to specialise mapping further for
124+
descriptors or other Fortran types. However, for the moment the currently chosen implementation for OpenMP
125+
appears sufficient as far as the OpenMP specification and current testing can show.

flang/include/flang/Optimizer/CodeGen/CodeGenOpenMP.h

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,4 @@
1-
//=== Optimizer/CodeGen/CodeGenOpenMP.h - OpenMP code generation -*- C++
2-
//-*-===//
1+
//===------- Optimizer/CodeGen/CodeGenOpenMP.h - OpenMP codegen -*- C++ -*-===//
32
//
43
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
54
// See https://llvm.org/LICENSE.txt for license information.
@@ -24,4 +23,4 @@ void populateOpenMPFIRToLLVMConversionPatterns(
2423

2524
} // namespace fir
2625

27-
#endif // FORTRAN_OPTIMIZER_CODEGEN_CODEGENOPENMP_H
26+
#endif // FORTRAN_OPTIMIZER_CODEGEN_CODEGENOPENMP_H

flang/include/flang/Optimizer/Transforms/Passes.td

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -322,8 +322,9 @@ def OMPDescriptorMapInfoGenPass
322322
: Pass<"omp-descriptor-map-info-gen", "mlir::ModuleOp"> {
323323
let summary = "expands OpenMP MapInfo operations containing descriptors";
324324
let description = [{
325-
Expands MapInfo operations containing descriptor types into multiple MapInfo's for each pointer element in
326-
the descriptor that requires explicit individual mapping by the OpenMP runtime.
325+
Expands MapInfo operations containing descriptor types into multiple
326+
MapInfo's for each pointer element in the descriptor that requires
327+
explicit individual mapping by the OpenMP runtime.
327328
}];
328329
let constructor = "::fir::createOMPDescriptorMapInfoGenPass()";
329330
let dependentDialects = ["mlir::omp::OpenMPDialect"];

flang/lib/Optimizer/CodeGen/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
add_flang_library(FIRCodeGen
2-
CodeGenOpenMP.cpp
32
BoxedProcedure.cpp
43
CGOps.cpp
54
CodeGen.cpp
5+
CodeGenOpenMP.cpp
66
PreCGRewrite.cpp
77
TBAABuilder.cpp
88
Target.cpp

flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp

Lines changed: 24 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,6 @@ class OMPDescriptorMapInfoGenPass
2727
void genDescriptorMemberMaps(mlir::omp::MapInfoOp op,
2828
fir::FirOpBuilder &builder,
2929
mlir::Operation *target) {
30-
llvm::SmallVector<mlir::Value> descriptorBaseAddrMembers;
3130
mlir::Location loc = builder.getUnknownLoc();
3231
mlir::Value descriptor = op.getVarPtr();
3332

@@ -60,7 +59,8 @@ class OMPDescriptorMapInfoGenPass
6059
mlir::Value baseAddrAddr = builder.create<fir::BoxOffsetOp>(
6160
loc, descriptor, fir::BoxFieldAttr::base_addr);
6261

63-
descriptorBaseAddrMembers.push_back(builder.create<mlir::omp::MapInfoOp>(
62+
// Member of the descriptor pointing at the allocated data
63+
mlir::Value baseAddr = builder.create<mlir::omp::MapInfoOp>(
6464
loc, baseAddrAddr.getType(), baseAddrAddr,
6565
llvm::cast<mlir::omp::PointerLikeType>(
6666
fir::unwrapRefType(baseAddrAddr.getType()))
@@ -70,38 +70,36 @@ class OMPDescriptorMapInfoGenPass
7070
op.getMapType().value()),
7171
builder.getAttr<mlir::omp::VariableCaptureKindAttr>(
7272
mlir::omp::VariableCaptureKind::ByRef),
73-
builder.getStringAttr("")));
73+
builder.getStringAttr("") /*name*/);
7474

7575
// TODO: map the addendum segment of the descriptor, similarly to the
7676
// above base address/data pointer member.
7777

7878
op.getVarPtrMutable().assign(descriptor);
7979
op.setVarType(fir::unwrapRefType(descriptor.getType()));
80-
op.getMembersMutable().append(descriptorBaseAddrMembers);
80+
op.getMembersMutable().append(baseAddr);
8181
op.getBoundsMutable().assign(llvm::SmallVector<mlir::Value>{});
8282

8383
if (auto mapClauseOwner =
8484
llvm::dyn_cast<mlir::omp::MapClauseOwningOpInterface>(target)) {
8585
llvm::SmallVector<mlir::Value> newMapOps;
86-
for (size_t i = 0; i < mapClauseOwner.getMapOperands().size(); ++i) {
87-
if (mapClauseOwner.getMapOperands()[i] == op) {
86+
mlir::OperandRange mapOperandsArr = mapClauseOwner.getMapOperands();
87+
88+
for (size_t i = 0; i < mapOperandsArr.size(); ++i) {
89+
if (mapOperandsArr[i] == op) {
8890
// Push new implicit maps generated for the descriptor.
89-
newMapOps.push_back(descriptorBaseAddrMembers[0]);
91+
newMapOps.push_back(baseAddr);
9092

9193
// for TargetOp's which have IsolatedFromAbove we must align the
9294
// new additional map operand with an appropriate BlockArgument,
9395
// as the printing and later processing currently requires a 1:1
9496
// mapping of BlockArgs to MapInfoOp's at the same placement in
9597
// each array (BlockArgs and MapOperands).
96-
if (auto targetOp = llvm::dyn_cast<mlir::omp::TargetOp>(target)) {
97-
targetOp.getRegion().insertArgument(
98-
i, descriptorBaseAddrMembers[0].getType(), loc);
99-
}
100-
101-
newMapOps.push_back(mapClauseOwner.getMapOperands()[i]);
102-
} else {
103-
newMapOps.push_back(mapClauseOwner.getMapOperands()[i]);
98+
if (auto targetOp = llvm::dyn_cast<mlir::omp::TargetOp>(target))
99+
targetOp.getRegion().insertArgument(i, baseAddr.getType(), loc);
104100
}
101+
102+
newMapOps.push_back(mapOperandsArr[i]);
105103
}
106104

107105
mapClauseOwner.getMapOperandsMutable().assign(newMapOps);
@@ -121,8 +119,17 @@ class OMPDescriptorMapInfoGenPass
121119
mlir::isa_and_present<fir::BoxAddrOp>(
122120
op.getVarPtr().getDefiningOp())) {
123121
builder.setInsertionPoint(op);
124-
// Currently a MapInfoOp argument can only show up on a single target
125-
// user so we can retrieve and use the first user.
122+
// TODO: Currently only supports a single user for the MapInfoOp, this
123+
// is fine for the moment as the Fortran Frontend will generate a
124+
// new MapInfoOp per Target operation for the moment. However, when/if
125+
// we optimise/cleanup the IR, it likely isn't too difficult to
126+
// extend this function, it would require some modification to create a
127+
// single new MapInfoOp per new MapInfoOp generated and share it across
128+
// all users appropriately, making sure to only add a single member link
129+
// per new generation for the original originating descriptor MapInfoOp.
130+
assert(llvm::hasSingleElement(op->getUsers()) &&
131+
"OMPDescriptorMapInfoGen currently only supports single users "
132+
"of a MapInfoOp");
126133
genDescriptorMemberMaps(op, builder, *op->getUsers().begin());
127134
}
128135
});

flang/test/Fir/convert-to-llvm-openmp-and-fir.fir

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -893,3 +893,22 @@ func.func @omp_critical_() {
893893
}
894894
return
895895
}
896+
897+
// -----
898+
899+
// CHECK-LABEL: llvm.func @omp_map_info_descriptor_type_conversion
900+
// CHECK-SAME: %[[ARG_0:.*]]: !llvm.ptr)
901+
902+
func.func @omp_map_info_descriptor_type_conversion(%arg0 : !fir.ref<!fir.box<!fir.heap<i32>>>) {
903+
// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[ARG_0]][0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
904+
%0 = fir.box_offset %arg0 base_addr : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> !fir.llvm_ptr<!fir.ref<i32>>
905+
// CHECK: %[[MEMBER_MAP:.*]] = omp.map_info var_ptr(%[[GEP]] : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
906+
%1 = omp.map_info var_ptr(%0 : !fir.llvm_ptr<!fir.ref<i32>>, i32) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
907+
// CHECK: %[[DESC_MAP:.*]] = omp.map_info var_ptr(%[[ARG_0]] : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(always, delete) capture(ByRef) members(%[[MEMBER_MAP]] : !llvm.ptr) -> !llvm.ptr {name = ""}
908+
%2 = omp.map_info var_ptr(%arg0 : !fir.ref<!fir.box<!fir.heap<i32>>>, !fir.box<!fir.heap<i32>>) map_clauses(always, delete) capture(ByRef) members(%1 : !fir.llvm_ptr<!fir.ref<i32>>) -> !fir.ref<!fir.box<!fir.heap<i32>>> {name = ""}
909+
// CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr)
910+
omp.target_exit_data map_entries(%2 : !fir.ref<!fir.box<!fir.heap<i32>>>)
911+
return
912+
}
913+
914+
// -----

flang/test/Integration/OpenMP/map-types-and-sizes.f90

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -109,4 +109,4 @@ end subroutine mapType_char
109109
!CHECK: %[[SIZE_DIFF:.*]] = sub i64 %[[ALLOCA_GEP_INT]], %[[ALLOCA_INT]]
110110
!CHECK: %[[DIV:.*]] = sdiv exact i64 %[[SIZE_DIFF]], ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)
111111
!CHECK: %[[OFFLOAD_SIZE_ARR:.*]] = getelementptr inbounds [3 x i64], ptr %.offload_sizes, i32 0, i32 0
112-
!CHECK: store i64 %[[DIV]], ptr %[[OFFLOAD_SIZE_ARR]], align 8
112+
!CHECK: store i64 %[[DIV]], ptr %[[OFFLOAD_SIZE_ARR]], align 8

flang/test/Lower/OpenMP/allocatable-array-bounds.f90

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -114,4 +114,4 @@ subroutine call_assumed_shape_and_size_array
114114
allocate(arr_read_write(20))
115115
call assumed_size_array(arr_read_write(10:20))
116116
deallocate(arr_read_write)
117-
end subroutine call_assumed_shape_and_size_array
117+
end subroutine call_assumed_shape_and_size_array
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
// RUN: fir-opt --omp-descriptor-map-info-gen %s | FileCheck %s
2+
3+
module attributes {omp.is_target_device = false} {
4+
func.func @test_descriptor_expansion_pass() {
5+
%0 = fir.alloca !fir.box<!fir.heap<i32>> {bindc_name = "test", uniq_name = "_QFEtest"}
6+
%1 = fir.zero_bits !fir.heap<i32>
7+
%2 = fir.embox %1 : (!fir.heap<i32>) -> !fir.box<!fir.heap<i32>>
8+
fir.store %2 to %0 : !fir.ref<!fir.box<!fir.heap<i32>>>
9+
%3:2 = hlfir.declare %0 {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFEtest"} : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> (!fir.ref<!fir.box<!fir.heap<i32>>>, !fir.ref<!fir.box<!fir.heap<i32>>>)
10+
%4 = fir.allocmem i32 {fir.must_be_heap = true, uniq_name = "_QFEtest.alloc"}
11+
%5 = fir.embox %4 : (!fir.heap<i32>) -> !fir.box<!fir.heap<i32>>
12+
fir.store %5 to %3#1 : !fir.ref<!fir.box<!fir.heap<i32>>>
13+
%6 = omp.map_info var_ptr(%3#1 : !fir.ref<!fir.box<!fir.heap<i32>>>, !fir.box<!fir.heap<i32>>) map_clauses(tofrom) capture(ByRef) -> !fir.ref<!fir.box<!fir.heap<i32>>> {name = "test"}
14+
omp.target map_entries(%6 -> %arg0 : !fir.ref<!fir.box<!fir.heap<i32>>>) {
15+
^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<i32>>>):
16+
omp.terminator
17+
}
18+
return
19+
}
20+
}
21+
22+
// CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %{{.*}} {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFEtest"} : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> (!fir.ref<!fir.box<!fir.heap<i32>>>, !fir.ref<!fir.box<!fir.heap<i32>>>)
23+
// CHECK: %[[BASE_ADDR_OFF:.*]] = fir.box_offset %[[DECLARE]]#1 base_addr : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> !fir.llvm_ptr<!fir.ref<i32>>
24+
// CHECK: %[[DESC_MEMBER_MAP:.*]] = omp.map_info var_ptr(%[[BASE_ADDR_OFF]] : !fir.llvm_ptr<!fir.ref<i32>>, i32) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
25+
// CHECK: %[[DESC_PARENT_MAP:.*]] = omp.map_info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.box<!fir.heap<i32>>>, !fir.box<!fir.heap<i32>>) map_clauses(tofrom) capture(ByRef) members(%[[DESC_MEMBER_MAP]] : !fir.llvm_ptr<!fir.ref<i32>>) -> !fir.ref<!fir.box<!fir.heap<i32>>> {name = "test"}
26+
// CHECK: omp.target map_entries(%[[DESC_MEMBER_MAP]] -> %{{.*}}, %[[DESC_PARENT_MAP]] -> %{{.*}} : !fir.llvm_ptr<!fir.ref<i32>>, !fir.ref<!fir.box<!fir.heap<i32>>>) {
27+
// CHECK: ^bb0(%{{.*}}: !fir.llvm_ptr<!fir.ref<i32>>, %{{.*}}: !fir.ref<!fir.box<!fir.heap<i32>>>):

0 commit comments

Comments
 (0)