@fressi-elastic commented Jul 17, 2025

This PR allows the new multipart transfer manager to be used for downloading track files.

Configuration documentation.

  • It adds the new track section.

Adapter interface.

  • It renames the adapter methods' parameter: head -> want.

Client class.

  • Cache TTL values are now configurable (60 seconds by default).
  • It uses the new CachedHead class to store cached heads.
  • The Client.head method raises CachedHeadError to wrap errors that were previously cached.
  • The Client.resolve method only logs errors that were not obtained from the cache (i.e. it does not log again when it catches a CachedHeadError).
  • Client.get has been refactored to properly handle the filtering of resolved heads.
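
The caching behaviour listed above could look roughly like the sketch below. It is only an illustration, not the code from this PR: the names CachedHead, CachedHeadError, Client.head, Client.resolve and the 60-second default come from the description, while the CachedHead fields, the fetch callable, the clock parameter and the exact logging policy are assumptions.

```python
import logging
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

LOG = logging.getLogger(__name__)


class CachedHeadError(Exception):
    """Wraps an error that was served from the cache (assumed shape)."""


@dataclass
class CachedHead:
    # Hypothetical fields: a cached entry holds either a head or the error
    # raised while fetching it, plus its expiry time.
    expires_at: float
    head: Optional[dict] = None
    error: Optional[Exception] = None


class Client:
    def __init__(self, fetch: Callable[[str], dict], cache_ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # hypothetical: performs the real HEAD request
        self._cache_ttl = cache_ttl  # seconds an entry stays valid
        self._clock = clock
        self._cache = {}             # url -> CachedHead

    def head(self, url: str) -> dict:
        cached = self._cache.get(url)
        if cached is not None and cached.expires_at > self._clock():
            if cached.error is not None:
                # The failure is replayed from the cache, so wrap it.
                raise CachedHeadError(f"cached failure for {url}") from cached.error
            return cached.head
        try:
            head = self._fetch(url)
        except Exception as ex:
            self._cache[url] = CachedHead(self._clock() + self._cache_ttl, error=ex)
            raise
        self._cache[url] = CachedHead(self._clock() + self._cache_ttl, head=head)
        return head

    def resolve(self, urls: Iterable[str]) -> Optional[str]:
        """Return the first URL whose head can be fetched, or None."""
        for url in urls:
            try:
                self.head(url)
            except CachedHeadError:
                continue  # assumed: already logged when the error first occurred
            except Exception as ex:
                LOG.warning("Failed to fetch head for %s: %s", url, ex)
                continue
            return url
        return None
```

Injecting the clock keeps the TTL logic testable without sleeping; the real client may well handle this differently.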

Config

  • The new StorageConfig class is now used to pack values parsed from the esrally.config.Config object and pass them through the objects of the storage package. This reduces the effort of keeping user-configured values distinct from actual object variables. It also makes it easier to centralize configuration parsing and validation in a single place.
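
As a rough illustration of the idea (not the actual StorageConfig from this PR), a frozen dataclass can hold the parsed values and validate them in one place. The field names, the from_mapping helper and the plain mapping used as input are all hypothetical; only the 60-second cache TTL default is taken from the description.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class StorageConfig:
    # Hypothetical fields; the real class may carry different ones.
    cache_ttl: float = 60.0
    max_workers: int = 4

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "StorageConfig":
        """Parse and validate user-supplied values in a single place."""
        cfg = cls(
            cache_ttl=float(raw.get("cache_ttl", cls.cache_ttl)),
            max_workers=int(raw.get("max_workers", cls.max_workers)),
        )
        if cfg.cache_ttl < 0:
            raise ValueError(f"cache_ttl must be >= 0, got {cfg.cache_ttl}")
        if cfg.max_workers < 1:
            raise ValueError(f"max_workers must be >= 1, got {cfg.max_workers}")
        return cfg
```

Objects that receive such a config never need to know whether a value came from the user or from a default.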

Track loader

  • It adds support for the new transfer manager as an optional download engine, enabled by the track.downloader.multipart_enabled configuration flag.
  • It lazily sets up the transfer manager the first time it is needed, so that its thread pool is created only in the actor process that requires it.
  • It uses a global logger instance for logging messages and relies on lazy parameter formatting to skip unnecessary formatting.
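
The lazy setup and lazy log formatting mentioned above can be sketched as follows. This is a simplified stand-in, not the track loader code: a plain ThreadPoolExecutor plays the role of the transfer manager, and the function name transfer_executor and the max_workers default are assumptions.

```python
import logging
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

# Module-level (global) logger instance.
LOG = logging.getLogger(__name__)

_lock = threading.Lock()
_executor: Optional[ThreadPoolExecutor] = None


def transfer_executor(max_workers: int = 4) -> ThreadPoolExecutor:
    """Create the pool on first use, then always return the same instance."""
    global _executor
    with _lock:
        if _executor is None:
            # %-style arguments are only formatted if the record is emitted,
            # so disabled log levels cost almost nothing.
            LOG.debug("Creating transfer pool with %d workers", max_workers)
            _executor = ThreadPoolExecutor(max_workers=max_workers)
        return _executor
```

Because nothing is created at import time, a process that never downloads anything never owns a pool.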

Threads utils

  • It checks the interval value when initializing the ContinuousTimer class. The value is now public.
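
A minimal sketch of such a check, assuming that "checking" means rejecting non-positive intervals and that the timer repeatedly calls a function until cancelled. The class body below is illustrative and is not the implementation from the threads utilities.

```python
import threading


class ContinuousTimer(threading.Thread):
    """Calls `function` every `interval` seconds until cancelled (illustrative)."""

    def __init__(self, interval: float, function, daemon: bool = True):
        # Validate before any thread state is created (assumed rule: > 0).
        if interval <= 0:
            raise ValueError(f"interval must be > 0, got {interval}")
        super().__init__(daemon=daemon)
        self.interval = interval  # public attribute
        self._function = function
        self._cancelled = threading.Event()

    def run(self):
        # Event.wait returns False on timeout and True once cancel() is called.
        while not self._cancelled.wait(self.interval):
            self._function()

    def cancel(self):
        self._cancelled.set()
```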

Testing

  • It updates the unit tests to reflect the changes above.

Note: this required the following PRs to be merged first:

@fressi-elastic changed the title from "It enables the new multipart tranfer manager inside track downloader." to "It enables the new multipart transfer manager inside track downloader." Jul 18, 2025
@fressi-elastic

@fressi-elastic Regarding the IT failure, boto3 became mandatory due to S3Adapter. Maybe keep it optional for BWC for now and reduce default adapters to HTTPAdapter?

S3Adapter is only required to access buckets using credentials. For public buckets we should be safe using HTTPAdapter even with S3. The difference is in the URL, as specified in the documentation. rclone should be kind enough to set up permissions for uploading files with public read access on S3 buckets, so if we want to make some mirrors publicly available then we can omit the boto3 installation for general availability of those mirrors.

BTW, when boto3 is not available, the storage client should fail to register S3Adapter and simply skip it. In that case s3:// URLs will not be supported, but the same bucket could be reached using the appropriate https:// URL instead.
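
To make the idea concrete, here is a small sketch of skipping an adapter whose optional dependency is missing. It is not the storage client's real registration code: the prefixes and requires attributes and the register_adapters and supports helpers are hypothetical, and the two adapter classes are empty placeholders.

```python
import importlib.util
import logging

LOG = logging.getLogger(__name__)


class HTTPAdapter:
    prefixes = ("http://", "https://")
    requires = ()  # no optional dependencies


class S3Adapter:
    prefixes = ("s3://",)
    requires = ("boto3",)  # optional dependency


def register_adapters(candidates):
    """Keep only the adapters whose required modules can be found."""
    registered = []
    for adapter in candidates:
        # find_spec returns None for a top-level module that is not installed.
        missing = [m for m in adapter.requires if importlib.util.find_spec(m) is None]
        if missing:
            LOG.info("Skipping %s: missing %s", adapter.__name__, ", ".join(missing))
            continue
        registered.append(adapter)
    return registered


def supports(registered, url):
    """True if any registered adapter handles the URL scheme."""
    return any(url.startswith(a.prefixes) for a in registered)
```

With boto3 absent, register_adapters([HTTPAdapter, S3Adapter]) would return only HTTPAdapter, so s3:// URLs are rejected while https:// URLs keep working.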

@fressi-elastic

After the last changes in the logging setup I got these results:

logging.json
rally.log

(esrally) esrally race --track=geonames --challenge=append-no-conflicts-index-only --distribution-version=8.15.5 --enable-assertions --kill-running-processes --on-error="abort" --test-mode

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

[INFO] Race id is [90d4b8ef-f620-4434-82c8-e2865a7352dd]
[INFO] Preparing for race ...
[INFO] Downloading Elasticsearch 8.15.5 (426.4MB total size)                        [100%]
[INFO] Decompressing track data from [/Users/fressi/.rally/benchmarks/data/geonames/documents-2-1k.json.bz2] to [/Users/fressi/.rally/benchmarks/data/geonames/documents-2-1k.json] ... [OK]
[INFO] Preparing file offset table for [/Users/fressi/.rally/benchmarks/data/geonames/documents-2-1k.json] ... [OK]
[INFO] Racing on track [geonames], challenge [append-no-conflicts-index-only] and car ['defaults'] with version [8.15.5].

Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running check-cluster-health                                                   [100% done]
Running index-append                                                           [100% done]
Running force-merge                                                            [100% done]
Running wait-until-merges-finish                                               [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------
            
|                                                         Metric |         Task |          Value |   Unit |
|---------------------------------------------------------------:|-------------:|---------------:|-------:|
|                     Cumulative indexing time of primary shards |              |    0.00961667  |    min |
|             Min cumulative indexing time across primary shards |              |    0.0012      |    min |
|          Median cumulative indexing time across primary shards |              |    0.00166667  |    min |
|             Max cumulative indexing time across primary shards |              |    0.00298333  |    min |
|            Cumulative indexing throttle time of primary shards |              |    0           |    min |
|    Min cumulative indexing throttle time across primary shards |              |    0           |    min |
| Median cumulative indexing throttle time across primary shards |              |    0           |    min |
|    Max cumulative indexing throttle time across primary shards |              |    0           |    min |
|                        Cumulative merge time of primary shards |              |    0           |    min |
|                       Cumulative merge count of primary shards |              |    0           |        |
|                Min cumulative merge time across primary shards |              |    0           |    min |
|             Median cumulative merge time across primary shards |              |    0           |    min |
|                Max cumulative merge time across primary shards |              |    0           |    min |
|               Cumulative merge throttle time of primary shards |              |    0           |    min |
|       Min cumulative merge throttle time across primary shards |              |    0           |    min |
|    Median cumulative merge throttle time across primary shards |              |    0           |    min |
|       Max cumulative merge throttle time across primary shards |              |    0           |    min |
|                      Cumulative refresh time of primary shards |              |    0.00585     |    min |
|                     Cumulative refresh count of primary shards |              |   25           |        |
|              Min cumulative refresh time across primary shards |              |    0.00075     |    min |
|           Median cumulative refresh time across primary shards |              |    0.00126667  |    min |
|              Max cumulative refresh time across primary shards |              |    0.0015      |    min |
|                        Cumulative flush time of primary shards |              |    0           |    min |
|                       Cumulative flush count of primary shards |              |    0           |        |
|                Min cumulative flush time across primary shards |              |    0           |    min |
|             Median cumulative flush time across primary shards |              |    0           |    min |
|                Max cumulative flush time across primary shards |              |    0           |    min |
|                                        Total Young Gen GC time |              |    0.009       |      s |
|                                       Total Young Gen GC count |              |    1           |        |
|                                          Total Old Gen GC time |              |    0           |      s |
|                                         Total Old Gen GC count |              |    0           |        |
|                                                   Dataset size |              |    7.59894e-05 |     GB |
|                                                     Store size |              |    7.59894e-05 |     GB |
|                                                  Translog size |              |    2.56114e-07 |     GB |
|                                         Heap used for segments |              |    0           |     MB |
|                                       Heap used for doc values |              |    0           |     MB |
|                                            Heap used for terms |              |    0           |     MB |
|                                            Heap used for norms |              |    0           |     MB |
|                                           Heap used for points |              |    0           |     MB |
|                                    Heap used for stored fields |              |    0           |     MB |
|                                                  Segment count |              |   15           |        |
|                                    Total Ingest Pipeline count |              |    0           |        |
|                                     Total Ingest Pipeline time |              |    0           |      s |
|                                   Total Ingest Pipeline failed |              |    0           |        |
|                                                 Min Throughput | index-append | 9425.44        | docs/s |
|                                                Mean Throughput | index-append | 9425.44        | docs/s |
|                                              Median Throughput | index-append | 9425.44        | docs/s |
|                                                 Max Throughput | index-append | 9425.44        | docs/s |
|                                        50th percentile latency | index-append |  108.261       |     ms |
|                                       100th percentile latency | index-append |  128.645       |     ms |
|                                   50th percentile service time | index-append |  108.261       |     ms |
|                                  100th percentile service time | index-append |  128.645       |     ms |
|                                                     error rate | index-append |    0           |      % |


--------------------------------
[INFO] SUCCESS (took 67 seconds)
--------------------------------


@gbanasiak left a comment

LGTM. Thank you for your patience.

@fressi-elastic enabled auto-merge (squash) October 21, 2025 13:27
@fressi-elastic merged commit 8c94094 into elastic:master Oct 21, 2025
15 checks passed
@fressi-elastic deleted the track.loader branch October 21, 2025 21:05