| 1 | +# Zstandard Seekable Format |
| 2 | + |
| 3 | +The seekable format splits compressed data into a series of independent "frames", |
| 4 | +each compressed individually, |
| 5 | +so that decompression of a section in the middle of an archive |
| 6 | +only requires zstd to decompress at most a frame's worth of extra data, |
| 7 | +instead of the entire archive. |
| 8 | + |
| 9 | +The frames are appended, so that the decompression of the entire payload |
| 10 | +still regenerates the original content, using any compliant zstd decoder. |
| 11 | + |
| 12 | +On top of that, the seekable format generates a jump table, |
| 13 | +which makes it possible to jump directly to the position of the relevant frame |
| 14 | +when requesting only a segment of the data. |
| 15 | +The jump table is simply ignored by zstd decoders unaware of the seekable format. |
| 16 | + |
| 17 | +The format is delivered with an API to create seekable archives |
| 18 | +and to retrieve arbitrary segments inside the archive. |
| 19 | + |
| 20 | +### Maximum Frame Size parameter |
| 21 | + |
| 22 | +When creating a seekable archive, the main parameter is the maximum frame size. |
| 23 | + |
| 24 | +At compression time, the user can manually select the boundaries between segments,
| 25 | +but doesn't have to: long segments are automatically split
| 26 | +when they are larger than the selected maximum frame size.
| 27 | + |
| 28 | +Small frame sizes reduce decompression cost when requesting small segments,
| 29 | +because the decoder always has to decompress an entire frame,
| 30 | +even to recover just a single byte from it.
| 31 | + |
| 32 | +A good rule of thumb is to select a maximum frame size roughly equal
| 33 | +to the typical request size, when the access pattern is known.
| 34 | +For example, if the application tends to request 4 KB blocks,
| 35 | +then it's a good idea to set a maximum frame size in the vicinity of 4 KB.
| 36 | + |
| 37 | +But small frame sizes also reduce compression ratio,
| 38 | +and increase the size of the jump table, which holds one entry per frame,
| 39 | +so there is a balance to find.
| 40 | + |
| 41 | +In general, try to avoid really tiny frame sizes (<1 KB), |
| 42 | +which would have a large negative impact on compression ratio. |
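
Reviewer note: the trade-off described in the README can be made concrete with a small model of the jump table. The sketch below is not the seekable API and does not touch zstd at all; `build_offsets` and `frames_for_range` are hypothetical helper names, and the model assumes the jump table exposes each frame's decompressed size. It shows which frames a request touches and how many bytes must be decompressed to serve it.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(frame_sizes):
    """Start offset (in decompressed bytes) of every frame, plus the total size.

    `frame_sizes` stands in for the decompressed sizes a jump table would
    record; this is an illustrative model, not the on-disk layout.
    """
    return [0, *accumulate(frame_sizes)]


def frames_for_range(offsets, start, length):
    """Return (first_frame, last_frame, bytes_to_decompress) for a byte range."""
    if length <= 0 or start < 0 or start + length > offsets[-1]:
        raise ValueError("requested range is outside the archive")
    # Binary search: the frame containing byte `b` is the last one
    # whose start offset is <= b.
    first = bisect_right(offsets, start) - 1
    last = bisect_right(offsets, start + length - 1) - 1
    # Every touched frame has to be decompressed in full.
    return first, last, offsets[last + 1] - offsets[first]


if __name__ == "__main__":
    offsets = build_offsets([4096] * 4)  # four frames of 4 KB each
    print(frames_for_range(offsets, 5000, 100))  # request inside one frame
    print(frames_for_range(offsets, 4000, 200))  # request straddling two frames
```

In this model, a 100-byte request that falls inside one 4 KB frame costs 4096 bytes of decompression, while a 200-byte request that straddles a frame boundary costs 8192. Smaller frames shrink that overhead but add jump table entries, which is the balance the README describes.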