
Implement Sequence Distribution feature with Striding #2094

Open: vadlakondaswetha wants to merge 1 commit into axboe:master from vadlakondaswetha:rand_dist


vadlakondaswetha commented May 22, 2026

This PR introduces a new generic random distribution primitive: sequence.

While fio currently provides primitives for linear, pseudo-random
(randread), and strided access patterns, it lacks a mechanism to simulate
deterministic, local non-linearity within a repeating stride.

This access pattern is required to test modern workloads like:

  • LLM inference weight loading: reading .safetensors model files, which contain a list of tensors. Even though the file as a whole is read sequentially, data within a tensor is requested in a particular order that is not purely sequential.
  • Database engine log merging / LSM-trees: scenarios where specific block indices (like parity blocks or metadata headers) must be systematically read out of order within every repeating chunk or block group.

The existing option is to use read_iolog. However, a generated I/O log is
sub-optimal for multi-terabyte model benchmarks: the trace files become
massive, do not scale, and cannot adapt to varying block sizes. The
sequence distribution resolves this by calculating offsets
algorithmically on the fly.

NEW OPTIONS INTRODUCED

  • random_distribution=sequence:N1,N2,N3: Intercepts the random offset generator so that offsets follow the specified repeating sequence of block indices.
  • random_sequence_stride: Boolean option that modifies the behavior of the sequence distribution.
      • When disabled (0, default), the sequence loops over absolute block indices.
      • When enabled (1), it switches to a Strided Block Group pattern, advancing the base block index by the sequence length after each cycle to progress through the file.
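
To make the two modes concrete, here is a minimal Python sketch of the block indices each one would produce. It is only an illustration of the behavior described above, not the PR's C implementation: the function name `sequence_blocks` and its `stride` parameter are hypothetical, and wrap-around at the end of the file is not modeled.

```python
from itertools import islice


def sequence_blocks(seq, stride=False):
    """Yield block indices for a repeating sequence (hypothetical helper).

    stride=False: loop over the absolute block indices in seq.
    stride=True:  after each full cycle, advance the base block index
                  by len(seq), so the pattern walks through the file.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    base = 0
    while True:
        for idx in seq:
            yield base + idx
        if stride:
            base += len(seq)


# Sequence 2,0,1 with striding disabled and enabled, nine I/Os each.
looped = list(islice(sequence_blocks([2, 0, 1]), 9))
strided = list(islice(sequence_blocks([2, 0, 1], stride=True), 9))
print(looped)   # [2, 0, 1, 2, 0, 1, 2, 0, 1]
print(strided)  # [2, 0, 1, 5, 3, 4, 8, 6, 7]
```

Multiplying each block index by the block size gives the byte offset of the I/O.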

@sitsofe

sitsofe commented May 25, 2026

Collaborator

@vadlakondaswetha:
(I'm going purely on the description in your PR) What is this providing over something like:

./fio --debug=io --name=go --rw=randread --size=64k --zonemode=strided --ioengine=null --zonesize=16k

which performs I/O within the zone at random offsets but restricted to a given zone until it's "full" (at which point it moves on to the next zone)? Is it that the precise order of access within a zone is directly specified by the user rather than being sequential or random?

@vadlakondaswetha

vadlakondaswetha commented May 26, 2026

Author

> @vadlakondaswetha: (I'm going purely on the description in your PR) What is this providing over something like:
>
> ./fio --debug=io --name=go --rw=randread --size=64k --zonemode=strided --ioengine=null --zonesize=16k
>
> which performs I/O within the zone at random offsets but restricted to a given zone until it's "full" (at which point it moves on to the next zone)? Is it that the precise order of access within a zone is directly specified by the user rather than being sequential or random?

That's right. This provides a way to read data in a precise order controlled by the user.

Right now, repeating the same sequence across the entire file is controlled by the random_sequence_stride option. We could remove this option and achieve the same effect by combining random_distribution with the zone options, although that is a bit more complex to configure. E.g.:

--random_distribution=sequence:2,0,1 --zonemode=strided --zonesize=12k --zonerange=12k --zoneskip=0
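
A rough Python sketch of why this zone-based configuration could reproduce the strided pattern. It rests on several assumptions that are not stated in the thread: a 4k block size (so a 12k zone holds exactly three blocks), and a reading of fio's strided zone mode in which I/O stays inside a zone until zonesize bytes have been transferred and the next zone then starts zonerange + zoneskip bytes further on. The function name and parameters are hypothetical and this is not the PR's code.

```python
KIB = 1024


def zoned_sequence_offsets(seq, bs, zonesize, zonerange, zoneskip, count):
    """Return `count` byte offsets for a block sequence replayed per zone.

    Hypothetical model: each I/O lands at zone_start + seq[pos] * bs, and
    the zone start moves by zonerange + zoneskip once zonesize bytes have
    been transferred in the current zone.
    """
    offsets = []
    zone_start = 0
    zone_bytes = 0  # bytes transferred in the current zone
    pos = 0         # position within the sequence
    while len(offsets) < count:
        offsets.append(zone_start + seq[pos] * bs)
        pos = (pos + 1) % len(seq)
        zone_bytes += bs
        if zone_bytes >= zonesize:
            zone_start += zonerange + zoneskip
            zone_bytes = 0
    return offsets


offsets = zoned_sequence_offsets(
    [2, 0, 1], bs=4 * KIB, zonesize=12 * KIB,
    zonerange=12 * KIB, zoneskip=0, count=6,
)
# Expressed as 4k block indices, the offsets match the strided pattern.
print([off // (4 * KIB) for off in offsets])  # [2, 0, 1, 5, 3, 4]
```

The sketch only lines up this neatly because zonesize equals the sequence length times the block size; with other values the sequence position and the zone boundary would drift apart, which is part of what makes the zone-based form harder to configure.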

@sitsofe

sitsofe commented May 29, 2026

Collaborator

@vadlakondaswetha:
I see - you did mention just what happens in the commit message and I foolishly missed it, so that's on me. I'm not promising to re-review (I'm not saying I don't like the PR, just that these days I don't have much time and others tend to do a more timely job).

  1. For a new feature we're going to need man page (fio.1) and HOWTO (HOWTO.rst) documentation to be part of the commit
  2. An example file with comments would be helpful but possibly point 1. will be enough...
  3. I'm a bit unsure of this:
    if (td->o.random_sequence) {
    	free(td->o.random_sequence);
    	td->o.random_sequence = NULL;
    }
    
    td->o.random_sequence = malloc(td->o.random_sequence_nr * sizeof(unsigned int));
    if (!td->o.random_sequence) {
    	free(str);
    	return 1;
    }
    because it feels strange to be reallocating option values, but I'm not that familiar with fio's option parsing so perhaps this is a pre-existing pattern.

@ankit-sam ankit-sam left a comment

Contributor

Hi @vadlakondaswetha I added a couple of review comments, please check them.

Apart from what @sitsofe mentioned about the documentation and example file, I have a few other concerns:

  1. The sequence values are not checked, so a user can pass duplicate values. This breaks norandommap=0 behavior.
  2. For mixed workloads, do we need separate sequences like random_sequence_read and random_sequence_write? Otherwise both read and write offsets will be identical every time.
  3. Not urgent, but I think it would be great to support ranges like 0-5, 10-20. We already have options like bsrange, plids etc.
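
For illustration, points 1 and 3 could be handled together in the option parser. The Python sketch below shows one possible shape of that logic; `parse_sequence` is a hypothetical helper, the inclusive `lo-hi` range syntax is an assumption modeled on the example in point 3, and rejecting duplicates is only one possible policy that the thread has not settled on.

```python
def parse_sequence(spec):
    """Parse '2,0,1' or '0-5,10-20' into a list of block indices.

    Hypothetical helper: ranges are treated as inclusive, and malformed
    entries or duplicate indices raise ValueError.
    """
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError(f"descending range: {part!r}")
            indices.extend(range(lo, hi + 1))
        else:
            indices.append(int(part))

    # Reject duplicates so every block in a cycle is visited exactly once.
    seen = set()
    for idx in indices:
        if idx in seen:
            raise ValueError(f"duplicate block index: {idx}")
        seen.add(idx)
    return indices


print(parse_sequence("2,0,1"))   # [2, 0, 1]
print(parse_sequence("0-2,5"))   # [0, 1, 2, 5]
```

A call such as `parse_sequence("2,0,2")` raises ValueError under this policy.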

Comment thread io_u.c Outdated
Comment thread t/sequence.py Outdated
Comment thread t/sequence.py Outdated
Comment thread options.c
This change extends fio to support a new random distribution pattern
called `sequence`. It allows users to specify a fixed repeating
sequence of block indices for I/O operations using the syntax
`random_distribution=sequence:2,0,1`.

Additionally, it introduces the `random_sequence_stride` boolean option.
When enabled (1), the sequence progresses through the file as a Strided
Block Group pattern (e.g., 2,0,1, 5,3,4, 8,6,7...), automatically
advancing the base block index by the sequence length after each cycle.

Integration tests are added in `t/sequence.py`.

Signed-off-by: Swetha Vadlakonda <swethv@google.com>
@vadlakondaswetha

Author

> Hi @vadlakondaswetha I added a couple of review comments, please check them.
>
> Apart from what @sitsofe mentioned about the documentation and example file, I have a few other concerns:
>
>   1. The sequence values are not checked, so a user can pass duplicate values. This breaks norandommap=0 behavior.
>   2. For mixed workloads, do we need separate sequences like random_sequence_read and random_sequence_write? Otherwise both read and write offsets will be identical every time.
>   3. Not urgent, but I think it would be great to support ranges like 0-5, 10-20. We already have options like bsrange, plids etc.

Thanks for the review.

  1. In fio's core init.c (https://github.com/axboe/fio/blob/fio-3.42/init.c#L1025), any use of a non-uniform random distribution automatically overrides and turns off the random map. The random map is therefore turned off for the sequence pattern as well (FIO_RAND_DIST_SEQUENCE != FIO_RAND_DIST_RANDOM).
  2. When running mixed workloads with a standard non-uniform distribution, fio applies the same distribution formula to both reads and writes. The same behavior applies to the sequence config. In case users need different patterns for reads and writes, it's better to use separate jobs.
  3. That's a good addition, but I would like to keep it out of this PR.

@vadlakondaswetha

Author

> @vadlakondaswetha: I see - you did mention just what happens in the commit message and I foolishly missed it, so that's on me. I'm not promising to re-review (I'm not saying I don't like the PR, just that these days I don't have much time and others tend to do a more timely job).
>
>   1. For a new feature we're going to need man page (fio.1) and HOWTO (HOWTO.rst) documentation to be part of the commit
>   2. An example file with comments would be helpful but possibly point 1. will be enough...
>   3. I'm a bit unsure of this:
>
>     if (td->o.random_sequence) {
>     	free(td->o.random_sequence);
>     	td->o.random_sequence = NULL;
>     }
>
>     td->o.random_sequence = malloc(td->o.random_sequence_nr * sizeof(unsigned int));
>     if (!td->o.random_sequence) {
>     	free(str);
>     	return 1;
>     }
>
>     because it feels strange to be reallocating option values, but I'm not that familiar with fio's option parsing so perhaps this is a pre-existing pattern.

Thanks for the review :)

1 & 2. Added the required files.
3. This is necessary because fio's option parsing callbacks (.cb) are evaluated incrementally as options are read. If an option is specified multiple times (for example, overridden in a job section or passed again on the CLI), the callback is entirely responsible for storing and managing the value. Since our value is a dynamically sized array of integers (unsigned int *), we must free() any previous allocation before performing a new malloc(), to prevent a memory leak when the option is evaluated repeatedly or overridden.

@vadlakondaswetha

Author

@sitsofe and @ankit-sam - PTAL, replied to your comments. Thanks.

@vadlakondaswetha

vadlakondaswetha commented Jun 18, 2026

Author

Hi @vincentkfu / @axboe, gentle ping on this PR review. PTAL and let me know if there are any concerns or questions about adding this feature.
