Skip to content

Conversation

@nico
Copy link
Contributor

@nico nico commented Nov 25, 2025

The big win is parallelizing the conversion (1.65s -> 0.23s), but some code optimization too (0.23s -> 0.2s).

Also make the CI script call the compiler.sh script, to make it pick up these speedups.

nico added 2 commits November 25, 2025 18:43
Now that all our jbig2 test files are json-generated,
let's keep it that way :^)
...when this is used in scripts that `set -u`.

No intended behavior change.
@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 25, 2025
Comment on lines +30 to +31
find "$DIR" -maxdepth 1 -name "*.json" -print0 | \
xargs -0 -P "$NPROC" -I {} bash -c 'run_jbig2 "$@"' _ {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guessing it has to do something with the synchronization, but does this in it current form need the bash -c?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also whats the _ doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First question: This needs bash -c. functions in bash don't exist in other processes (such as xargs). They can't be exec'd or anything. They are put in the env as a bash thing:

% bash -c 'foobar () { echo hi; }; export -f foobar; env' |grep foobar
BASH_FUNC_foobar%%=() {  echo hi

…and due to this a child bash can execute a function exported in a parent bash. But a non-bash process can't. (Nothing to do with synchronization :))

Second question: It's just a literal _, passed as $0 (since run_jbig2 doesn't look at its $0).

nico added 3 commits November 25, 2025 19:51
This uses `xargs -P` as a fairly simple parallelism mechanism.  It
relies on `echo` being atomic in practice (for small enough outputs).

(Alternatively, each child could write to a dedicated log file in
some temp dir, and we could cat all the log files at the end. For
now, this seems like it's good enough.)

Takes the time to run compile.sh from 1.65s to 0.23s on my system.
That way, it picks up the recent speed-up in compile.sh.

To make this work, let compile.sh optionally take the path to
jbig2-from-json as first argument.

Takes the time to run Meta/check-jbig2-json.sh from 1.7s to 0.29s
on my system.
If the input image is black and white, shortcircuit all the fancy
processing.

Takes Tests/LibGfx/test-inputs/jbig2/json/compile.sh from 0.23s to
0.20s on my machine. (When running the tests serially, it reduced
the time for that from 1.72s to 1.32s on my machine.)

(I also tried putting a decoding cache in jbig2-from-image.cpp, but
this approach here is more general, simpler, and just as fast.)
@nico nico merged commit 64f08ba into SerenityOS:master Nov 26, 2025
12 checks passed
@nico nico deleted the jbig2-mines branch November 26, 2025 03:29
@github-actions github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 26, 2025
break;
}
}
if (input_is_bilevel) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it change the value of the output when using dithering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, that's a good question. I thought it wouldn't, but maybe that's wrong? I thought dithering is about dealing with the error when having to convert gray values to just 0 and 255, but if the input is already 0 and 255, there's no error to deal with, right? Maybe my mental model here is off, though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose if we had an ordered dither algorithm, this would definitely be wrong for that. But for the existing ones it's maybe ok?

But maybe we should just revert this, idk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Bayer an ordered dither algorithm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I meant a "Clustered Dot Screening" ordered dither (in the language of https://engineering.purdue.edu/~bouman/ece637/notes/pdf/Halftoning.pdf). Bayer is a "Dispersed Dot Screen".

But you're right, they're mostly the same thing. They both do:

bilevel(i, j) = input(i, j) > threshold(i, j) ? 255 : 0;

and threshold returns some threshold >= 0 and < 1 (if input is normalized to >= 0 <= 1.0). If input(i, j) exactly 0, then bilevel(i, j) will be 0 too (because threshould >= 0), and if input(i, j) is exactly 1.0, then bielevel(i, j) will be 255. So I think this might be fine as-is?

It'd be good to have an actual example!

With this change reverted:

Build/lagom/bin/image --to-bilevel=bayer8x8 -o out.webp bitmap.bmp
Build/lagom/bin/imgcmp out.webp bitmap.bmp                        
different pixel at (0, 7), rgb(0, 0, 0) vs rgb(255, 255, 255)

Huh! But looking at the output, the ordered dithering description, and the code again, I think this is a bug in my bayer dithering implementation. With

@@ -314,7 +314,7 @@ ErrorOr<NonnullRefPtr<BilevelImage>> BilevelImage::create_from_bitmap(Gfx::Bitma
 
         for (int y = 0, i = 0; y < bitmap.height(); ++y) {
             for (int x = 0; x < bitmap.width(); ++x, ++i) {
-                u8 threshold = (bayer_matrix[(y & mask) * n + (x & mask)] * 255) / ((n * n) - 1);
+                u8 threshold = (bayer_matrix[(y & mask) * n + (x & mask)] * 255) / (n * n);
                 bilevel_image->set_bit(x, y, gray_bitmap[i] > threshold ? 0 : 1);
             }
         }

the difference goes away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants