@sz3 sz3 commented Sep 28, 2025

For the first pass at the web decoder (#134), I introduced a new C api for decoding. It currently works like this:

  • cimbard_scan_extract_decode -> on independent threads, input rgb image data, output 1-N fountain buffers
  • cimbard_fountain_decode -> on the (1) reassembly thread, to reassemble the fountain buffers
  • cimbard_finish_copy -> on the reassembly thread, when cimbard_fountain_decode reports success. output one contiguous buffer of compressed file data

And then a set of zstd apis, which frankly could be ditched for an off-the-shelf zstd implementation:

  • cimbarz_init_decompress -> input the contiguous buffer of compressed data. Set state -- we decompress one file at a time.
  • cimbarz_decompress_read -> read the next chunk of decompressed data

This is changing for a few reasons:

  1. it can be simpler. The api surface has some required "bloat" regardless: besides the basic workflow outlined above, we additionally have (necessary, non-negotiable) api functions for various buffer sizes. So anywhere we can simplify is good (imo)
  2. the lifetime of the compressed file data buffer is weird.
    • cimbarz_init_decompress and cimbarz_decompress_read don't copy the data, they just read from the buffer. That means if something unexpected happens to the buffer (freed, reused, overwritten) between calls, we're gonna crash. Or more to the point, it means that the caller (the user of the api) now needs to worry about the lifetime of this intermediate object.
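To make the lifetime problem concrete, here is a minimal sketch contrasting a reader that borrows the caller's buffer with one that owns its own copy. `BorrowingReader` and `OwningReader` are hypothetical names invented for illustration, not the cimbar types, and a plain byte copy stands in for the real zstd decompression.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical "borrowing" reader: stores only a pointer into caller-owned
// memory. If the caller frees or overwrites that memory between reads,
// read() sees invalid or stale data.
struct BorrowingReader {
    const std::uint8_t* src = nullptr;
    std::size_t size = 0;
    std::size_t pos = 0;

    void init(const std::uint8_t* data, std::size_t n) { src = data; size = n; pos = 0; }

    int read(std::uint8_t* out, std::size_t n) {
        assert(out != nullptr);
        std::size_t count = std::min(n, size - pos);
        if (count > 0) std::memcpy(out, src + pos, count);  // copy stands in for decompression
        pos += count;
        return static_cast<int>(count);
    }
};

// Hypothetical "owning" reader: takes the buffer by value and keeps it as
// internal state, so the caller's copy can go away at any time.
struct OwningReader {
    std::vector<std::uint8_t> buf;
    std::size_t pos = 0;

    void init(std::vector<std::uint8_t> data) { buf = std::move(data); pos = 0; }

    int read(std::uint8_t* out, std::size_t n) {
        assert(out != nullptr);
        std::size_t count = std::min(n, buf.size() - pos);
        if (count > 0) std::memcpy(out, buf.data() + pos, count);
        pos += count;
        return static_cast<int>(count);  // 0 means "no more data"
    }
};
```

With the owning variant, the caller can clobber or release its source buffer right after `init` and later reads still return the original bytes. With the borrowing variant, the same sequence would read whatever now lives at that address.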

Luckily, there's no real need to expose this intermediate compressed representation (short of for testing purposes -- but we can work around that). So we can collapse a few methods down. The new api is:

  • cimbard_scan_extract_decode -> on independent threads, input rgb image data, output 1-N fountain buffers. (same as before)
  • cimbard_fountain_decode -> on the (1) reassembly thread, to reassemble the fountain buffers (same as before)
  • cimbard_decompress_read -> on the reassembly thread, when cimbard_fountain_decode reports success. read the next chunk of decompressed data, or the first chunk (if it's the first call)

So: we've collapsed the last 3 calls into 1, and gone from 5 to 3 as far as the main flow is concerned. Also, the intermediate buffer is now internal state, so the caller doesn't have to think about it too much.
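As a rough illustration of the three-step flow, the sketch below uses a toy `MockDecoder` whose methods only mirror the names of the real calls. Everything here is an assumption for illustration: the signatures are invented, each "frame" is treated as its own payload, "reassembly" is plain concatenation, and no fountain decoding or zstd work happens. Only the shape of the read loop (call until the result is no longer positive) is taken from the diff in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy stand-in for the decoder state. Not the cimbar API.
struct MockDecoder {
    std::size_t expected;                   // number of buffers needed to finish
    std::size_t received = 0;
    std::vector<std::uint8_t> reassembled;  // internal state, never handed to the caller
    std::size_t read_pos = 0;

    explicit MockDecoder(std::size_t n) : expected(n) {}

    // Stand-in for step 1: in this toy, a frame already is its payload.
    std::vector<std::uint8_t> scan_extract_decode(const std::vector<std::uint8_t>& frame) const {
        return frame;
    }

    // Stand-in for step 2: returns true once enough buffers have arrived.
    bool fountain_decode(const std::vector<std::uint8_t>& buf) {
        reassembled.insert(reassembled.end(), buf.begin(), buf.end());
        ++received;
        return received >= expected;
    }

    // Stand-in for step 3: hands out the next chunk, 0 when exhausted.
    int decompress_read(std::uint8_t* out, std::size_t n) {
        std::size_t count = std::min(n, reassembled.size() - read_pos);
        if (count > 0) std::memcpy(out, reassembled.data() + read_pos, count);
        read_pos += count;
        return static_cast<int>(count);
    }
};

// Drives the flow: scan frames until reassembly succeeds, then drain the output.
std::vector<std::uint8_t> run_flow(MockDecoder& dec,
                                   const std::vector<std::vector<std::uint8_t>>& frames,
                                   std::size_t chunk_size) {
    bool done = false;
    for (const auto& frame : frames) {
        if (dec.fountain_decode(dec.scan_extract_decode(frame))) { done = true; break; }
    }
    std::vector<std::uint8_t> out;
    if (!done) return out;  // not enough frames yet, nothing to read

    std::vector<std::uint8_t> chunk(chunk_size);
    int res = 0;
    while ((res = dec.decompress_read(chunk.data(), chunk.size())) > 0)
        out.insert(out.end(), chunk.begin(), chunk.begin() + res);
    return out;
}
```

The point of the sketch is that the caller never holds the intermediate compressed buffer: it only feeds frames in and pulls output chunks out.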

sz3 added 6 commits September 27, 2025 19:24
The flow will now be:

* cimbard_scan_extract_decode (loop until done)
* cimbard_fountain_decode
* cimbard_decompress_read (loop until done)

This eliminates the awkward "make sure you keep the zstd compress memory
block around until the decompress finishes, or else (:" footgun, and 2
api calls to boot.

It *does* make testing slightly trickier, so we'll keep some
internal/testing methods to help test the intermediate step (recover but
don't decompress)

(changes to the javascript code forthcoming...)
Not sure yet what to do with the zstd test page -- might just delete it.
The *automated* js test, OTOH, needs updating...

decompress_on_store<std::ofstream>(outpath, true)(filename, data);
int res = 1;
while ((res = cimbard_decompress_read(fileId, data.data(), data.size())) > 0)

Actually using the full api here now.

, "_cimbard_get_filename"
, "_cimbard_finish_copy"
, "_cimbard_get_decompress_bufsize"
, "_cimbard_decompress_read"

wasm export list acts as a public api reference for the moment...

{
if (_reassembled.empty())
return nullptr;
return _reassembled.data();

Testing only, please

if (!_sink->recover(id, buffer, size))
return -2;
return 0;
int res = recover_contents(id);
@sz3 sz3 Sep 28, 2025

Notably, both decompress_read() and get_filename() call recover_contents() (the result is cached), since both need to pull data from the zstd stream. By definition, then, either one can only be called after cimbard_fountain_decode() has finished reassembling the file.
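A minimal sketch of that caching pattern, assuming a hypothetical `FileState` class: the class name, the hard-coded filename and contents, and the error codes are all invented for illustration, and no real zstd stream is involved. It only shows two accessors sharing one lazily computed, cached recovery step that fails until reassembly is complete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical per-file state. Not the actual cimbar class.
class FileState {
public:
    explicit FileState(bool reassembled) : _complete(reassembled) {}

    // Does the expensive work at most once. Later calls return the cached result.
    int recover_contents() {
        if (!_complete) return -1;  // assumed error code: file not reassembled yet
        if (_recovered) return 0;   // cached
        ++_recover_count;
        _filename = "example.txt";  // placeholder data for the sketch
        _contents = "hello world";
        _recovered = true;
        return 0;
    }

    // Both accessors funnel through recover_contents() first.
    std::string get_filename() {
        if (recover_contents() != 0) return "";
        return _filename;
    }

    int decompress_read(char* out, std::size_t n) {
        if (recover_contents() != 0) return -1;
        std::size_t count = std::min(n, _contents.size() - _pos);
        if (count > 0) std::memcpy(out, _contents.data() + _pos, count);
        _pos += count;
        return static_cast<int>(count);
    }

    int recover_count() const { return _recover_count; }

private:
    bool _complete;
    bool _recovered = false;
    int _recover_count = 0;
    std::size_t _pos = 0;
    std::string _filename;
    std::string _contents;
};
```

Calling either accessor before reassembly completes yields an error or empty result. After that, the call order does not matter, and the recovery work runs only once.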

var height = window.innerHeight - 10;
Main.scaleCanvas(canvas, width, height);
Main.alignInvisibleClick(canvas);
Main.checkNavButtonOverlap();

Completely unrelated little bonus UI fix here. Hide the nav button if it overlaps the barcode.

}, false);

window.addEventListener('resize', () => {
Main.resize();
@sz3 sz3 Sep 28, 2025

Another UI tweak: listen for the resize event

QUnit.config.reorder = false;
QUnit.config.testTimeout = 10000;

var Zstd = function () {
@sz3 sz3 Sep 28, 2025

Unexpected upside of changing the api: we're now testing more of the zstd code path in the JS test.

}
// this needs to happen after decompress() completes
// currently decompress is sync, so it's fine. But...
Module._free(dataPtr);

^

Happy to make this memory cleanup race go away.

@sz3 sz3 merged commit a10cbde into master Sep 28, 2025
11 checks passed
@sz3 sz3 deleted the simplify-decode-capi branch September 28, 2025 13:44
@sz3 sz3 mentioned this pull request Sep 28, 2025