Skip to content

Commit 8f12e5f

Browse files
kripkenradekdoulik
authored andcommitted
Fix global effect computation with -O flags (WebAssembly#6211)
We tested --generate-global-effects --vacuum and such, but not --generate-global-effects -O3 or the other -O flags. Unfortunately, our targeted testing missed a bug because of that. Specifically, we have special logic for -O flags to make sure the passes they expand into run with the proper opt and shrink levels, but that logic happened to also interfere with global effect computation. It would also interfere with allowing GUFA info or other things to be stored on the side, which we've proposed. This PR fixes that + future issues. The fix is to just allow a pass runner to execute more than once. We thought to avoid that and assert against it to keep the model "hermetic" (you create a pass runner, you run the passes, and you throw it out), which feels nice in a way, but it led to the bug here, and I'm not sure it would prevent any other ones really. It is also more code. It is simpler to allow a runner to execute more than once, and add a method to clear it. With that, the logic for -O3 execution is both simpler and does not interfere with anything but the opt and shrink level flags: we create a single runner, give it the proper options, and then keep using that runner + those options as we go, normally.
1 parent a11727f commit 8f12e5f

4 files changed

Lines changed: 229 additions & 29 deletions

File tree

src/pass.h

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -235,8 +235,9 @@ struct PassOptions {
235235
// other passes later can benefit from it. It is up to the sequence of passes
236236
// to update or discard this when necessary - in particular, when new effects
237237
// are added to a function this must be changed or we may optimize
238-
// incorrectly (however, it is extremely rare for a pass to *add* effects;
239-
// passes normally only remove effects).
238+
// incorrectly. However, it is extremely rare for a pass to *add* effects;
239+
// passes normally only remove effects. Passes that do add effects must set
240+
// addsEffects() so the pass runner is aware of them.
240241
std::shared_ptr<FuncEffectsMap> funcEffectsMap;
241242

242243
// -Os is our default
@@ -318,6 +319,9 @@ struct PassRunner {
318319
// Add a pass given an instance.
319320
void add(std::unique_ptr<Pass> pass) { doAdd(std::move(pass)); }
320321

322+
// Clears away all passes that have been added.
323+
void clear();
324+
321325
// Adds the pass if there are no DWARF-related issues. There is an issue if
322326
// there is DWARF and if the pass does not support DWARF (as defined by the
323327
// pass returning true from invalidatesDWARF); otherwise, if there is no
@@ -387,9 +391,6 @@ struct PassRunner {
387391
// yet) have removed DWARF.
388392
bool addedPassesRemovedDWARF = false;
389393

390-
// Whether this pass runner has run. A pass runner should only be run once.
391-
bool ran = false;
392-
393394
void runPass(Pass* pass);
394395
void runPassOnFunction(Pass* pass, Function* func);
395396

src/passes/pass.cpp

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -739,9 +739,6 @@ static void dumpWasm(Name name, Module* wasm) {
739739
}
740740

741741
void PassRunner::run() {
742-
assert(!ran);
743-
ran = true;
744-
745742
static const int passDebug = getPassDebug();
746743
// Emit logging information when asked for. At passDebug level 1+ we log
747744
// the main passes, while in 2 we also log nested ones. Note that for
@@ -885,6 +882,8 @@ void PassRunner::doAdd(std::unique_ptr<Pass> pass) {
885882
passes.emplace_back(std::move(pass));
886883
}
887884

885+
void PassRunner::clear() { passes.clear(); }
886+
888887
// Checks that the state is valid before and after a
889888
// pass runs on a function. We run these extra checks when
890889
// pass-debug mode is enabled.

src/tools/optimization-options.h

Lines changed: 23 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -328,52 +328,54 @@ struct OptimizationOptions : public ToolOptions {
328328
bool runningPasses() { return passes.size() > 0; }
329329

330330
void runPasses(Module& wasm) {
331-
std::unique_ptr<PassRunner> passRunner;
331+
PassRunner passRunner(&wasm, passOptions);
332+
if (debug) {
333+
passRunner.setDebug(true);
334+
}
332335

333336
// Flush anything in the current pass runner, and then reset it to a fresh
334337
// state so it is ready for new things.
335-
auto flushAndReset = [&]() {
336-
if (passRunner) {
337-
passRunner->run();
338-
}
339-
passRunner = std::make_unique<PassRunner>(&wasm, passOptions);
340-
if (debug) {
341-
passRunner->setDebug(true);
342-
}
338+
auto flush = [&]() {
339+
passRunner.run();
340+
passRunner.clear();
343341
};
344342

345-
flushAndReset();
346-
347343
for (auto& pass : passes) {
348344
if (pass.name == DEFAULT_OPT_PASSES) {
349345
// This is something like -O3 or -Oz. We must run this now, in order to
350346
// set the proper opt and shrink levels. To do that, first reset the
351347
// runner so that anything already queued is run (since we can only run
352348
// after those things).
353-
flushAndReset();
349+
flush();
354350

355351
// -O3/-Oz etc. always set their own optimize/shrinkLevels.
356352
assert(pass.optimizeLevel);
357353
assert(pass.shrinkLevel);
358-
passRunner->options.optimizeLevel = *pass.optimizeLevel;
359-
passRunner->options.shrinkLevel = *pass.shrinkLevel;
360354

361-
// Run our optimizations now, and reset the runner so that the default
362-
// pass options are used later (and not the temporary optimize/
363-
// shrinkLevels we just set).
364-
passRunner->addDefaultOptimizationPasses();
365-
flushAndReset();
355+
// Temporarily override the default levels.
356+
assert(passRunner.options.optimizeLevel == passOptions.optimizeLevel);
357+
assert(passRunner.options.shrinkLevel == passOptions.shrinkLevel);
358+
passRunner.options.optimizeLevel = *pass.optimizeLevel;
359+
passRunner.options.shrinkLevel = *pass.shrinkLevel;
360+
361+
// Run our optimizations now with the custom levels.
362+
passRunner.addDefaultOptimizationPasses();
363+
flush();
364+
365+
// Restore the default optimize/shrinkLevels.
366+
passRunner.options.optimizeLevel = passOptions.optimizeLevel;
367+
passRunner.options.shrinkLevel = passOptions.shrinkLevel;
366368
} else {
367369
// This is a normal pass. Add it to the queue for execution.
368-
passRunner->add(pass.name);
370+
passRunner.add(pass.name);
369371

370372
// Normal passes do not set their own optimize/shrinkLevels.
371373
assert(!pass.optimizeLevel);
372374
assert(!pass.shrinkLevel);
373375
}
374376
}
375377

376-
flushAndReset();
378+
flush();
377379
}
378380
};
379381

Lines changed: 198 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,198 @@
1+
;; NOTE: Assertions have been generated by update_lit_checks.py --all-items and should not be edited.
2+
3+
;; RUN: foreach %s %t wasm-opt -all -S -o - --generate-global-effects | filecheck %s --check-prefix CHECK_0
4+
;; RUN: foreach %s %t wasm-opt -all -S -o - --generate-global-effects -O1 | filecheck %s --check-prefix CHECK_1
5+
;; RUN: foreach %s %t wasm-opt -all -S -o - --generate-global-effects -O3 | filecheck %s --check-prefix CHECK_3
6+
;; RUN: foreach %s %t wasm-opt -all -S -o - --generate-global-effects -Os | filecheck %s --check-prefix CHECK_s
7+
;; RUN: foreach %s %t wasm-opt -all -S -o - --generate-global-effects -O | filecheck %s --check-prefix CHECK_O
8+
9+
;; Test that global effects benefit -O1 and related modes.
10+
11+
(module
12+
;; CHECK_0: (type $0 (func))
13+
14+
;; CHECK_0: (type $1 (func (result i32)))
15+
16+
;; CHECK_0: (export "main" (func $main))
17+
;; CHECK_1: (type $0 (func))
18+
19+
;; CHECK_1: (type $1 (func (result i32)))
20+
21+
;; CHECK_1: (export "main" (func $main))
22+
;; CHECK_3: (type $0 (func))
23+
24+
;; CHECK_3: (type $1 (func (result i32)))
25+
26+
;; CHECK_3: (export "main" (func $main))
27+
;; CHECK_s: (type $0 (func))
28+
29+
;; CHECK_s: (type $1 (func (result i32)))
30+
31+
;; CHECK_s: (export "main" (func $main))
32+
;; CHECK_O: (type $0 (func))
33+
34+
;; CHECK_O: (type $1 (func (result i32)))
35+
36+
;; CHECK_O: (export "main" (func $main))
37+
(export "main" (func $main))
38+
39+
;; CHECK_0: (export "pointless-work" (func $pointless-work))
40+
;; CHECK_1: (export "pointless-work" (func $pointless-work))
41+
;; CHECK_3: (export "pointless-work" (func $pointless-work))
42+
;; CHECK_s: (export "pointless-work" (func $pointless-work))
43+
;; CHECK_O: (export "pointless-work" (func $pointless-work))
44+
(export "pointless-work" (func $pointless-work))
45+
46+
;; CHECK_0: (func $main (type $0)
47+
;; CHECK_0-NEXT: (if
48+
;; CHECK_0-NEXT: (call $pointless-work)
49+
;; CHECK_0-NEXT: (then
50+
;; CHECK_0-NEXT: (drop
51+
;; CHECK_0-NEXT: (call $pointless-work)
52+
;; CHECK_0-NEXT: )
53+
;; CHECK_0-NEXT: )
54+
;; CHECK_0-NEXT: )
55+
;; CHECK_0-NEXT: )
56+
;; CHECK_1: (func $main (type $0)
57+
;; CHECK_1-NEXT: (nop)
58+
;; CHECK_1-NEXT: )
59+
;; CHECK_3: (func $main (type $0) (; has Stack IR ;)
60+
;; CHECK_3-NEXT: (nop)
61+
;; CHECK_3-NEXT: )
62+
;; CHECK_s: (func $main (type $0) (; has Stack IR ;)
63+
;; CHECK_s-NEXT: (nop)
64+
;; CHECK_s-NEXT: )
65+
;; CHECK_O: (func $main (type $0) (; has Stack IR ;)
66+
;; CHECK_O-NEXT: (nop)
67+
;; CHECK_O-NEXT: )
68+
(func $main
69+
;; This calls a function that does pointless work. After generating global
70+
;; effects we can see that it is pointless and remove this entire if (except
71+
;; for -O0).
72+
(if
73+
(call $pointless-work)
74+
(then
75+
(drop
76+
(call $pointless-work)
77+
)
78+
)
79+
)
80+
)
81+
82+
;; CHECK_0: (func $pointless-work (type $1) (result i32)
83+
;; CHECK_0-NEXT: (local $x i32)
84+
;; CHECK_0-NEXT: (loop $loop
85+
;; CHECK_0-NEXT: (local.set $x
86+
;; CHECK_0-NEXT: (i32.add
87+
;; CHECK_0-NEXT: (local.get $x)
88+
;; CHECK_0-NEXT: (i32.const 1)
89+
;; CHECK_0-NEXT: )
90+
;; CHECK_0-NEXT: )
91+
;; CHECK_0-NEXT: (if
92+
;; CHECK_0-NEXT: (i32.ge_u
93+
;; CHECK_0-NEXT: (local.get $x)
94+
;; CHECK_0-NEXT: (i32.const 12345678)
95+
;; CHECK_0-NEXT: )
96+
;; CHECK_0-NEXT: (then
97+
;; CHECK_0-NEXT: (return
98+
;; CHECK_0-NEXT: (local.get $x)
99+
;; CHECK_0-NEXT: )
100+
;; CHECK_0-NEXT: )
101+
;; CHECK_0-NEXT: )
102+
;; CHECK_0-NEXT: (br $loop)
103+
;; CHECK_0-NEXT: )
104+
;; CHECK_0-NEXT: )
105+
;; CHECK_1: (func $pointless-work (type $1) (result i32)
106+
;; CHECK_1-NEXT: (local $0 i32)
107+
;; CHECK_1-NEXT: (loop $loop (result i32)
108+
;; CHECK_1-NEXT: (br_if $loop
109+
;; CHECK_1-NEXT: (i32.lt_u
110+
;; CHECK_1-NEXT: (local.tee $0
111+
;; CHECK_1-NEXT: (i32.add
112+
;; CHECK_1-NEXT: (local.get $0)
113+
;; CHECK_1-NEXT: (i32.const 1)
114+
;; CHECK_1-NEXT: )
115+
;; CHECK_1-NEXT: )
116+
;; CHECK_1-NEXT: (i32.const 12345678)
117+
;; CHECK_1-NEXT: )
118+
;; CHECK_1-NEXT: )
119+
;; CHECK_1-NEXT: (local.get $0)
120+
;; CHECK_1-NEXT: )
121+
;; CHECK_1-NEXT: )
122+
;; CHECK_3: (func $pointless-work (type $1) (; has Stack IR ;) (result i32)
123+
;; CHECK_3-NEXT: (local $0 i32)
124+
;; CHECK_3-NEXT: (loop $loop (result i32)
125+
;; CHECK_3-NEXT: (br_if $loop
126+
;; CHECK_3-NEXT: (i32.lt_u
127+
;; CHECK_3-NEXT: (local.tee $0
128+
;; CHECK_3-NEXT: (i32.add
129+
;; CHECK_3-NEXT: (local.get $0)
130+
;; CHECK_3-NEXT: (i32.const 1)
131+
;; CHECK_3-NEXT: )
132+
;; CHECK_3-NEXT: )
133+
;; CHECK_3-NEXT: (i32.const 12345678)
134+
;; CHECK_3-NEXT: )
135+
;; CHECK_3-NEXT: )
136+
;; CHECK_3-NEXT: (local.get $0)
137+
;; CHECK_3-NEXT: )
138+
;; CHECK_3-NEXT: )
139+
;; CHECK_s: (func $pointless-work (type $1) (; has Stack IR ;) (result i32)
140+
;; CHECK_s-NEXT: (local $0 i32)
141+
;; CHECK_s-NEXT: (loop $loop (result i32)
142+
;; CHECK_s-NEXT: (br_if $loop
143+
;; CHECK_s-NEXT: (i32.lt_u
144+
;; CHECK_s-NEXT: (local.tee $0
145+
;; CHECK_s-NEXT: (i32.add
146+
;; CHECK_s-NEXT: (local.get $0)
147+
;; CHECK_s-NEXT: (i32.const 1)
148+
;; CHECK_s-NEXT: )
149+
;; CHECK_s-NEXT: )
150+
;; CHECK_s-NEXT: (i32.const 12345678)
151+
;; CHECK_s-NEXT: )
152+
;; CHECK_s-NEXT: )
153+
;; CHECK_s-NEXT: (local.get $0)
154+
;; CHECK_s-NEXT: )
155+
;; CHECK_s-NEXT: )
156+
;; CHECK_O: (func $pointless-work (type $1) (; has Stack IR ;) (result i32)
157+
;; CHECK_O-NEXT: (local $0 i32)
158+
;; CHECK_O-NEXT: (loop $loop (result i32)
159+
;; CHECK_O-NEXT: (br_if $loop
160+
;; CHECK_O-NEXT: (i32.lt_u
161+
;; CHECK_O-NEXT: (local.tee $0
162+
;; CHECK_O-NEXT: (i32.add
163+
;; CHECK_O-NEXT: (local.get $0)
164+
;; CHECK_O-NEXT: (i32.const 1)
165+
;; CHECK_O-NEXT: )
166+
;; CHECK_O-NEXT: )
167+
;; CHECK_O-NEXT: (i32.const 12345678)
168+
;; CHECK_O-NEXT: )
169+
;; CHECK_O-NEXT: )
170+
;; CHECK_O-NEXT: (local.get $0)
171+
;; CHECK_O-NEXT: )
172+
;; CHECK_O-NEXT: )
173+
(func $pointless-work (result i32)
174+
(local $x i32)
175+
;; Some pointless work, with no side effects, that cannot be inlined. (The
176+
;; changes here are not important for this test.)
177+
(loop $loop
178+
(local.set $x
179+
(i32.add
180+
(local.get $x)
181+
(i32.const 1)
182+
)
183+
)
184+
(if
185+
(i32.ge_u
186+
(local.get $x)
187+
(i32.const 12345678)
188+
)
189+
(then
190+
(return
191+
(local.get $x)
192+
)
193+
)
194+
)
195+
(br $loop)
196+
)
197+
)
198+
)

0 commit comments

Comments
 (0)