fix: Make `nargo::ops::transform_program` idempotent by aakoshh · Pull Request #6695 · noir-lang/noir

aakoshh · 2024-12-04T11:55:11Z

Description

Problem*

Followup for #6694

Summary*

Makes nargo::ops::transform_program idempotent by adding two loops, both up to 3 passes, exiting early if the opcode hash doesn't change:

an outer loop in acvm::compiler::compile that includes optimize_internal and transform_internal
an inner loop in acvm::compiler::transformers::transform_internal

Additional Context

Here's the journey I went through investigating the source of changes across optimisation runs.

Ever increasing `current_witness_index`

On the slice_loop example the Circuit::current_witness_index field increased by 2 after each pass. This seems to be because:

It is increased by the number of intermediate variables created here even though the mutable reference is also increased, hence the double increment
The intermediate variables are (in this case entirely) eliminated later, but the current index isn't adjusted accordingly.

The PR adds a brute force step to visit each opcode and collect the remaining witnesses to set the next one correctly.

Alternatively we could implement PartialEq for Circuit in a way that it ignores current_witness_index.

After this fix we have the following tests still flagging the transformation as non-idempotent:

7_function
conditional_1
fold_fibonacci
hashmap
regression_5252
regression_6451
sha256
sha256_regression
sha256_var_size_regression
sha256_var_witness_const_regression
slices
to_be_bytes

Multiple passes

For the above I found that adding the following to the test makes all of them pass:

                for _ in 0..2 {
                    program_0 = nargo::ops::transform_program(program_0, width);
                }

So two extra full passes to stabilise the optimisation currently. An example where 1 extra pass is not enough was conditional_1. Ideally we would find which part of the transformation needs to be repeated to avoid having to do a full pass; our initial hypothesis was to look at #6668

I had to add a loop around transform_internal allowing 3 passes before the tests indicated more stability than before. The expectation was that we only need to loop around MergeExpressionOptimizer, but that doesn't seem to be true. Even after this, the following two programs fail:

7_function
slices
The difference in both cases is the removal of a range check:

assertion failed: `(left == right)`: optimization not idempotent for test program 'slices''
  left: `"func 0\ncurrent witness index : 617\nprivate parameters indices : [0]\npublic parameters indices : [1]\nreturn value indices : []\nEXPR [ (1, _1) -10 ]\nEXPR [ (-1, _0) (-1, _2) 10 ]\nBRILLIG CALL func 0: in..."` (truncated)
 right: `"func 0\ncurrent witness index : 617\nprivate parameters indices : [0]\npublic parameters indices : [1]\nreturn value indices : []\nEXPR [ (1, _1) -10 ]\nEXPR [ (-1, _0) (-1, _2) 10 ]\nBRILLIG CALL func 0: in..."` (truncated)

Differences (-left|+right):
 BRILLIG CALL func 0: inputs: [Single(Expression { mul_terms: [], linear_combinations: [(1, Witness(13))], q_c: 0 })], outputs: [Simple(Witness(14))]
 EXPR [ (1, _13, _14) (1, _15) -1 ]
 EXPR [ (1, _13, _15) 0 ]
 EXPR [ (1, _7) (-1, _16) 5 ]
-BLACKBOX::RANGE [(16)] [ ]
 EXPR [ (-1, _17) 0 ]
 EXPR [ (1, _0, _4) (-1, _489) 0 ]
 EXPR [ (-10, _4, _8) (-1, _490) 0 ]
 EXPR [ (10, _8) (-1, _18) (1, _489) (1, _490) 0 ]

For these to stabilise had to move the loop to be around the entire transformation that includes the initial backend agnostic optimize_internal as well:

/// Applies [`ProofSystemCompiler`][crate::ProofSystemCompiler] specific optimizations to a [`Circuit`].
pub fn compile<F: AcirField>(
    acir: Circuit<F>,
    expression_width: ExpressionWidth,
) -> (Circuit<F>, AcirTransformationMap) {
    let mut pass = 0;
    let mut prev_acir = acir;

    let (mut acir, acir_opcode_positions) = loop {
        let (acir, acir_opcode_positions) = optimize_internal(prev_acir);
        let (acir, acir_opcode_positions) =
            transform_internal(acir, expression_width, acir_opcode_positions);

        if pass == 2 {
            break (acir, acir_opcode_positions);
        }
        pass += 1;
        prev_acir = acir;
    };

    let transformation_map = AcirTransformationMap::new(&acir_opcode_positions);
    acir.assert_messages = transform_assert_messages(acir.assert_messages, &transformation_map);

    (acir, transformation_map)
}

The final solution has an inner and an outer loop, treating both as black boxes.

Documentation*

Check one:

No documentation needed.
Documentation included in this PR.
[For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

I have tested the changes locally.
I have formatted the changes with Prettier and/or cargo fmt on default settings.

github-actions · 2024-12-04T11:59:20Z

Peak Memory Sample

Program	Peak Memory	%
keccak256	83.22M	0%
workspace	121.82M	-1%
regression_4709	333.68M	0%
ram_blowup_regression	2.65G	0%

TomAFrench

So we need two extra passes to stabilise the optimisation currently. An example were 1 extra pass is not enough was conditional_1.

Performing the full transformation step again is a little overkill. The only non-idempotent optimization is #6668 afaik so if anything we should run multiple instances of this but avoid the rest.

acvm-repo/acvm/src/compiler/transformers/mod.rs

aakoshh · 2024-12-04T14:04:26Z

Performing the full transformation step again is a little overkill.

Agreed, I just tried that as a quick way to see if there is an upper bound now.

aakoshh · 2024-12-04T15:58:15Z

The only non-idempotent optimization is #6668 afaik so if anything we should run multiple instances of this but avoid the rest.

I pushed a commit which allows multiple passes with the MergeExpressionsOptimizer, but despite this another full cycle still changes the results 🤔

github-actions · 2024-12-04T16:01:17Z

Changes to circuit sizes

Generated at commit: f4e1f04a230d77a9a4f95cf19a8ccd56c9bcc841, compared to commit: 73e11c629cac3b1308d1f48ea2db74accf800f97

🧾 Summary (10% most significant diffs)

Program	ACIR opcodes (+/-)	%	Circuit size (+/-)	%
regression_6451	-2 ✅	-12.50%	-1 ✅	-3.57%
conditional_1	+255 ❌	+4.15%	-690 ✅	-4.12%

Full diff report 👇

Program	ACIR opcodes (+/-)	%	Circuit size (+/-)	%
sha256_regression	20,465 (+39)	+0.19%	200,368 (+21)	+0.01%
sha256	1,732 (+2)	+0.12%	24,916 (0)	0.00%
sha256_var_witness_const_regression	1,370 (+2)	+0.15%	16,810 (0)	0.00%
regression_5252	32,898 (+9)	+0.03%	44,246 (-6)	-0.01%
slices	737 (-1)	-0.14%	3,836 (-2)	-0.05%
7_function	155 (-1)	-0.64%	2,968 (-2)	-0.07%
sha256_var_size_regression	11,505 (+64)	+0.56%	70,388 (-48)	-0.07%
hashmap	24,133 (+28)	+0.12%	62,509 (-64)	-0.10%
to_be_bytes	67 (-6)	-8.22%	118 (-3)	-2.48%
regression_6451	14 (-2)	-12.50%	27 (-1)	-3.57%
conditional_1	6,405 (+255)	+4.15%	16,044 (-690)	-4.12%

aakoshh · 2024-12-04T16:04:12Z

For example on the following example it looks like the MEO only runs once, which suggests there is another source of unstable output:

cargo test -p nargo_cli --bins -- test_transform_program_is_idempotent to_be_bytes

acvm-repo/acvm/src/compiler/mod.rs

acvm-repo/acvm/src/compiler/transformers/mod.rs

asterite

Looks great!

tooling/nargo_cli/src/cli/compile_cmd.rs

acvm-repo/acvm/src/compiler/transformers/mod.rs

aakoshh mentioned this pull request Dec 4, 2024

feat(test): Check that nargo::ops::transform_program is idempotent #6694

Merged

5 tasks

aakoshh changed the base branch from master to 6670-test-transform-is-idempotent December 4, 2024 11:56

Update the current_witness_index after optimisation

b59cb34

aakoshh force-pushed the 6670-fix-transform-idempotent branch from 4b370d6 to b59cb34 Compare December 4, 2024 11:59

Diff the debug output

02c53bd

TomAFrench reviewed Dec 4, 2024

View reviewed changes

acvm-repo/acvm/src/compiler/transformers/mod.rs Outdated Show resolved Hide resolved

aakoshh added 2 commits December 4, 2024 15:22

Fold over the witnesses

09fb747

Try to allow multiple passes with the MergeExpressionsOptimizer

97f75ba

aakoshh added 4 commits December 4, 2024 16:57

Trying to figure out where to put the loop

8cc215a

Only looping in transform

7c72b29

Allow 3 full transform passes

4a06523

Loop both inside and outside

b163c20

aakoshh marked this pull request as ready for review December 4, 2024 17:32

aakoshh requested review from TomAFrench, asterite and guipublic December 4, 2024 17:33

Exit after optimisation if it hasn't changed anything

ce9d107

aakoshh force-pushed the 6670-fix-transform-idempotent branch from 371678e to ce9d107 Compare December 4, 2024 17:37

asterite reviewed Dec 4, 2024

View reviewed changes

acvm-repo/acvm/src/compiler/mod.rs Show resolved Hide resolved

asterite reviewed Dec 4, 2024

View reviewed changes

acvm-repo/acvm/src/compiler/transformers/mod.rs Outdated Show resolved Hide resolved

asterite approved these changes Dec 4, 2024

View reviewed changes

asterite force-pushed the 6670-fix-transform-idempotent branch from 051be60 to ce9d107 Compare December 4, 2024 19:31

aakoshh added 3 commits December 4, 2024 19:42

Remove unused transform function

5229857

Only make max passes

99d1bdf

Add back transform, it was pub-use

fc21983

vezenovm reviewed Dec 5, 2024

View reviewed changes

tooling/nargo_cli/src/cli/compile_cmd.rs Outdated Show resolved Hide resolved

acvm-repo/acvm/src/compiler/transformers/mod.rs Show resolved Hide resolved

vezenovm approved these changes Dec 5, 2024

View reviewed changes

Remove commented exploratory looping

61efb67

aakoshh merged commit 82b38bd into 6670-test-transform-is-idempotent Dec 5, 2024

aakoshh deleted the 6670-fix-transform-idempotent branch December 5, 2024 17:18

aakoshh mentioned this pull request Dec 5, 2024

chore: optimise older opcodes in reverse order #6476

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Make `nargo::ops::transform_program` idempotent#6695

fix: Make `nargo::ops::transform_program` idempotent#6695
aakoshh merged 13 commits into6670-test-transform-is-idempotentfrom
6670-fix-transform-idempotent

aakoshh commented Dec 4, 2024 •

edited

Loading

Uh oh!

github-actions bot commented Dec 4, 2024 •

edited

Loading

Uh oh!

TomAFrench left a comment

Uh oh!

Uh oh!

aakoshh commented Dec 4, 2024

Uh oh!

aakoshh commented Dec 4, 2024

Uh oh!

github-actions bot commented Dec 4, 2024 •

edited

Loading

Uh oh!

aakoshh commented Dec 4, 2024

Uh oh!

Uh oh!

Uh oh!

asterite left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

aakoshh commented Dec 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Problem*

Summary*

Additional Context

Ever increasing current_witness_index

Multiple passes

Documentation*

PR Checklist*

Uh oh!

github-actions bot commented Dec 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Peak Memory Sample

Uh oh!

TomAFrench left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

aakoshh commented Dec 4, 2024

Uh oh!

aakoshh commented Dec 4, 2024

Uh oh!

github-actions bot commented Dec 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes to circuit sizes

🧾 Summary (10% most significant diffs)

Uh oh!

aakoshh commented Dec 4, 2024

Uh oh!

Uh oh!

Uh oh!

asterite left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

aakoshh commented Dec 4, 2024 •

edited

Loading

Ever increasing `current_witness_index`

github-actions bot commented Dec 4, 2024 •

edited

Loading

github-actions bot commented Dec 4, 2024 •

edited

Loading