@ianton-ru ianton-ru commented Apr 28, 2025

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Allow object storage cluster functions to be called through a remote initiator.

Documentation entry for user-facing changes

With this setting, the query

SELECT * FROM s3Cluster('swarm', ....) SETTINGS object_storage_remote_initiator=true

is executed as

SELECT * FROM remote('swarm_node', s3Cluster('swarm', ....))

where swarm_node is a randomly chosen node from the swarm cluster.
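
The rewrite described above can be sketched in a few lines of Python. This only illustrates the transformation on the query text; it is not the C++ code from this PR. The function name `wrap_in_remote` and the way nodes are passed in are assumptions made for the example.

```python
import random


def wrap_in_remote(table_function_sql, swarm_nodes, rng=random):
    """Wrap a cluster table function in remote(...) on one random swarm node.

    Hypothetical helper for illustration; the real logic lives in the server.
    """
    if not swarm_nodes:
        raise ValueError("the swarm cluster has no nodes")
    # Pick the node that will act as the remote initiator.
    node = rng.choice(swarm_nodes)
    # Escape backslashes and single quotes for a SQL string literal.
    escaped = node.replace("\\", "\\\\").replace("'", "\\'")
    return f"remote('{escaped}', {table_function_sql})"


if __name__ == "__main__":
    # The elided arguments ("....") are kept exactly as in the description.
    print(wrap_in_remote("s3Cluster('swarm', ....)", ["swarm_node"]))
```

With a single-node list the choice is deterministic, so the call reproduces the second query from the description.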

Requirement: the nodes of the swarm cluster must themselves know a cluster named swarm. In the 'classic' old way, only the local initiator has to know about swarm.

This also restores the getDataFiles method (it was removed as unused in ClickHouse#78775).

It also included a small optimization: reusing sample_path in StorageObjectStorage (fetched once in StorageObjectStorageCluster) and taking sample_path from metadata in resolveSchemaAndFormat.
The optimization was removed because of strange side effects: inconsistent column type detection (LowCardinality instead of Nullable in some cases).

@svb-alt svb-alt added the antalya-25.2.2 Planned for 25.2.2 release label May 6, 2025
}
else
{
LOG_TEST(
Member

Is this the case where we request the whole object?

Author

@ianton-ru ianton-ru May 7, 2025

Iceberg metadata, for example:

:) CREATE DATABASE datalake ENGINE = Iceberg('http://rest:8181/v1', 'minio', 'minio123') SETTINGS catalog_type = 'rest', storage_endpoint = 'http://minio:9000/warehouse', warehouse = 'iceberg'
:) SELECT * FROM datalake.`iceberg.bids`
Query id: d1bb9862-c077-403f-9843-94fd28173760

   ┌───────────────────datetime─┬─symbol─┬────bid─┬────ask─┐
1. │ 2019-08-09 08:35:00.000000 │ AAPL   │ 198.23 │ 195.45 │
2. │ 2019-08-09 08:35:00.000000 │ AAPL   │ 198.25 │  198.5 │
3. │ 2019-08-07 08:35:00.000000 │ AAPL   │ 195.23 │ 195.28 │
4. │ 2019-08-07 08:35:00.000000 │ AAPL   │ 195.22 │ 195.28 │
5. │ 2019-08-09 08:35:00.000000 │ AAPL   │ 198.23 │ 195.45 │
6. │ 2019-08-09 08:35:00.000000 │ AAPL   │ 198.25 │  198.5 │
   └────────────────────────────┴────────┴────────┴────────┘

:) select ProfileEvents['S3GetObject'] from system.query_log where type='QueryFinish' and query_id='d1bb9862-c077-403f-9843-94fd28173760'

   ┌─arrayElement⋯GetObject')─┐
1. │                        8 │
   └──────────────────────────┘
...

grep "Read S3 object" /var/log/clickhouse-server/clickhouse-server.log

2025.05.07 22:38:10.414791 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/00003-ad725ef4-c28e-4ed4-aa4b-2e2aae0716d4.metadata.json, Version: Latest
2025.05.07 22:38:10.416600 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/snap-182060351258856937-0-ff436521-29e9-4437-be5b-eb60f209baa9.avro, Version: Latest
2025.05.07 22:38:10.418360 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/ff436521-29e9-4437-be5b-eb60f209baa9-m0.avro, Version: Latest
2025.05.07 22:38:10.420138 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/6f3e6993-47c9-4556-b70c-c6c48d2ced6f-m0.avro, Version: Latest
2025.05.07 22:38:10.421658 [ 80 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/metadata/f0de1c43-e367-4e3d-8c9d-4076d8fb0cbd-m0.avro, Version: Latest
2025.05.07 22:38:10.426911 [ 767 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-09/00000-0-ff436521-29e9-4437-be5b-eb60f209baa9.parquet, Version: Latest, Range: 0-1643
2025.05.07 22:38:10.427003 [ 762 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-09/00000-0-6f3e6993-47c9-4556-b70c-c6c48d2ced6f.parquet, Version: Latest, Range: 0-1643
2025.05.07 22:38:10.427050 [ 770 ] {d1bb9862-c077-403f-9843-94fd28173760} <Test> ReadBufferFromS3: Read S3 object. Bucket: warehouse, Key: data/data/datetime_day=2019-08-07/00000-0-f0de1c43-e367-4e3d-8c9d-4076d8fb0cbd.parquet, Version: Latest, Range: 0-1635

I added this for consistency: all requests are counted in ProfileEvents['S3GetObject'], but only some of them showed up in the logs.
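
The comparison above can also be scripted. The sketch below counts "Read S3 object" lines for one query id and splits them into whole-object reads (no Range field) and ranged reads, so the total can be compared with ProfileEvents['S3GetObject']. The regular expression is inferred from the log lines quoted in this comment, and the function name `count_s3_reads` is hypothetical.

```python
import re

# Pattern inferred from the log lines quoted above; other versions may differ.
LINE_RE = re.compile(
    r"\{(?P<query_id>[^}]*)\} <\w+> ReadBufferFromS3: Read S3 object\. "
    r"Bucket: (?P<bucket>[^,]+), Key: (?P<key>[^,]+), Version: (?P<version>[^,]+)"
    r"(?:, Range: (?P<range>\d+-\d+))?\s*$"
)


def count_s3_reads(log_text, query_id):
    """Count whole-object and ranged S3 reads logged for one query id."""
    counts = {"whole_object": 0, "ranged": 0}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None or match.group("query_id") != query_id:
            continue
        # A line without a Range field is a request for the whole object.
        kind = "ranged" if match.group("range") else "whole_object"
        counts[kind] += 1
    counts["total"] = counts["whole_object"] + counts["ranged"]
    return counts


if __name__ == "__main__":
    with open("/var/log/clickhouse-server/clickhouse-server.log") as log_file:
        print(count_s3_reads(log_file.read(), "d1bb9862-c077-403f-9843-94fd28173760"))
```

If the reported total matches the ProfileEvents['S3GetObject'] value for the same query, the log and the counter agree.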

@svb-alt svb-alt removed the antalya-25.2.2 Planned for 25.2.2 release label May 12, 2025
@Enmk Enmk merged commit 1b8b4a9 into antalya May 14, 2025
327 of 346 checks passed
ianton-ru pushed a commit that referenced this pull request Jun 3, 2025
…nitiator

Setting object_storage_remote_initiator
ianton-ru pushed a commit that referenced this pull request Jun 3, 2025
…nitiator

Setting object_storage_remote_initiator
ianton-ru pushed a commit that referenced this pull request Jun 4, 2025
…nitiator

Setting object_storage_remote_initiator
Enmk added a commit that referenced this pull request Jun 4, 2025
…rage_remote_initiator

25.3 Antalya port of #756 - object storage cluster function
@svb-alt svb-alt added antalya-25.6 port-antalya PRs to be ported to all new Antalya releases and removed antalya-25.6 labels Jul 14, 2025
ianton-ru pushed a commit that referenced this pull request Sep 9, 2025
…nitiator

Setting object_storage_remote_initiator
Enmk added a commit that referenced this pull request Sep 9, 2025
ianton-ru pushed a commit that referenced this pull request Oct 13, 2025
ianton-ru pushed a commit that referenced this pull request Oct 13, 2025
Enmk added a commit that referenced this pull request Oct 14, 2025