Setting object_storage_remote_initiator #756
}
else
{
    LOG_TEST(
Is there a case where we request the whole object?
Yes, Iceberg metadata, for example:
:) CREATE DATABASE datalake ENGINE = Iceberg('http://rest:8181/v1', 'minio', 'minio123') SETTINGS catalog_type = 'rest', storage_endpoint = 'http://minio:9000/warehouse', warehouse = 'iceberg'
:) SELECT * FROM datalake.`iceberg.bids`
Query id: d1bb9862-c077-403f-9843-94fd28173760
┌───────────────────datetime─┬─symbol─┬────bid─┬────ask─┐
1. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.23 │ 195.45 │
2. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.25 │ 198.5 │
3. │ 2019-08-07 08:35:00.000000 │ AAPL │ 195.23 │ 195.28 │
4. │ 2019-08-07 08:35:00.000000 │ AAPL │ 195.22 │ 195.28 │
5. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.23 │ 195.45 │
6. │ 2019-08-09 08:35:00.000000 │ AAPL │ 198.25 │ 198.5 │
└────────────────────────────┴────────┴────────┴────────┘
:) select ProfileEvents['S3GetObject'] from system.query_log where type='QueryFinish' and query_id='d1bb9862-c077-403f-9843-94fd28173760'
┌─arrayElement⋯GetObject')─┐
1. │ 8 │
└──────────────────────────┘
...
grep "Read S3 object" /var/log/clickhouse-server/clickhouse-server.log
2025.05.07 22:38:10.414791 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/00003-ad725ef4-c28e-4ed4-aa4b-2e2aae0716d4.metadata.json, Version: Latest
2025.05.07 22:38:10.416600 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/snap-182060351258856937-0-ff436521-29e9-4437-be5b-eb60f209baa9.avro, Version: Latest
2025.05.07 22:38:10.418360 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/ff436521-29e9-4437-be5b-eb60f209baa9-m0.avro, Version: Latest
2025.05.07 22:38:10.420138 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/6f3e6993-47c9-4556-b70c-c6c48d2ced6f-m0.avro, Version: Latest
2025.05.07 22:38:10.421658 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/f0de1c43-e367-4e3d-8c9d-4076d8fb0cbd-m0.avro, Version: Latest
2025.05.07 22:38:10.426911 [ 767 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-09/00000-0-ff436521-29e9-4437-be5b-eb60f209baa9.parquet, Version: Latest, Range: 0-1643
2025.05.07 22:38:10.427003 [ 762 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-09/00000-0-6f3e6993-47c9-4556-b70c-c6c48d2ced6f.parquet, Version: Latest, Range: 0-1643
2025.05.07 22:38:10.427050 [ 770 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-07/00000-0-f0de1c43-e367-4e3d-8c9d-4076d8fb0cbd.parquet, Version: Latest, Range: 0-1635
I added this for consistency: every request is counted in ProfileEvents['S3GetObject'], but previously only some of the requests (the ranged ones) appeared in the logs.
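To check that the counter and the log agree, the "Read S3 object" lines can be tallied per query. Below is a minimal sketch, assuming the log line format shown in the excerpt above; the function name `count_s3_reads` and the regular expression are illustrative and not part of ClickHouse. In the excerpt, five lines have no Range (whole-object reads of metadata) and three have one, which adds up to the 8 reported by ProfileEvents['S3GetObject'].

```python
import re
from collections import Counter

# Assumed format, copied from the log excerpt in this thread:
#   {<query_id>} <Test> ReadBufferFromS3: Read S3 object. Bucket: ..., Key: ..., Version: ...[, Range: a-b]
# The "range" group is absent when the whole object was requested.
LINE_RE = re.compile(
    r"\{(?P<query_id>[0-9a-f-]+)\} <Test> ReadBufferFromS3: Read S3 object\. "
    r"Bucket: (?P<bucket>[^,]+), Key: (?P<key>[^,]+), Version: (?P<version>[^,\n]+)"
    r"(?:, Range: (?P<range>\d+-\d+))?"
)


def count_s3_reads(lines, query_id):
    """Count whole-object and ranged reads logged for one query (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None or m.group("query_id") != query_id:
            continue
        counts["ranged" if m.group("range") else "whole"] += 1
    return counts


# Two lines taken from the excerpt above: one metadata read, one data read.
QUERY_ID = "d1bb9862-c077-403f-9843-94fd28173760"
SAMPLE = [
    "2025.05.07 22:38:10.414791 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> "
    "ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: "
    "data/metadata/00003-ad725ef4-c28e-4ed4-aa4b-2e2aae0716d4.metadata.json, Version: Latest",
    "2025.05.07 22:38:10.426911 [ 767 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> "
    "ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: "
    "data/data/datetime_day=2019-08-09/00000-0-ff436521-29e9-4437-be5b-eb60f209baa9.parquet, "
    "Version: Latest, Range: 0-1643",
]

counts = count_s3_reads(SAMPLE, QUERY_ID)
print(dict(counts))
```

To run it against the full server log, pass an open file handle instead of `SAMPLE` and compare `sum(counts.values())` with the counter value from system.query_log.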
…nitiator Setting object_storage_remote_initiator
…rage_remote_initiator 25.3 Antalya port of #756 - object storage cluster function
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add the object_storage_remote_initiator setting, which executes an object storage cluster function as a remote call.
Documentation entry for user-facing changes
Execute query
as
where swarm_node is a random node from the swarm cluster.

Requirements: the nodes of the swarm cluster must know about the cluster named swarm. In the 'classic' old way, only the local initiator must know about swarm.

Also, the method getDataFiles is restored (it was removed as unused in ClickHouse#78775).

And a small optimization: reusing sample_path in StorageObjectStorage (get it once in StorageObjectStorageCluster), and getting sample_path from metadata in resolveSchemaAndFormat.

The optimization was removed because of strange side effects: inconsistent column type detection (LowCardinality instead of Nullable in some cases).