@0ax1 commented Jun 2, 2025

For queries 2, 3, 10, 18 and 21 the TPC-H spec defines a row count limit.

2.1.2.9 Queries 2, 3, 10, 18 and 21 require that a given number of rows are to be returned (e.g., “Return the first 10 selected
rows”). If N is the number of rows to be returned, the query must return exactly the first N rows unless fewer than N
rows qualify, in which case all rows must be returned. There are three permissible ways of satisfying this
requirement. A test sponsor must select any one of them and use it consistently for all the queries that require that a
specified number of rows be returned.

https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v2.17.1.pdf
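As an illustration of the quoted requirement, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its three rows, and the `first_n_rows` helper are hypothetical and are not part of this PR; the sketch only shows that `ORDER BY ... LIMIT N` returns exactly the first N rows when at least N qualify, and all rows otherwise.

```python
import sqlite3


def first_n_rows(conn, n):
    """Return at most n rows in a deterministic order via ORDER BY + LIMIT."""
    cur = conn.execute(
        "SELECT o_orderkey, o_totalprice FROM orders "
        "ORDER BY o_totalprice DESC, o_orderkey LIMIT ?",
        (n,),
    )
    return cur.fetchall()


# Hypothetical toy data, only to exercise the LIMIT semantics.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (o_orderkey INTEGER, o_totalprice REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 50.0), (2, 75.0), (3, 20.0)],
)

top_two = first_n_rows(conn, 2)    # more rows qualify than N: exactly N are returned
all_rows = first_n_rows(conn, 10)  # fewer rows qualify than N: all are returned
```

The `ORDER BY` matters here: without a defined ordering, "the first N rows" is not well defined, so the limit is only meaningful on top of the ordering each query already specifies.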

Which issue does this PR close?

Rationale for this change

Returned row counts should match the TPC-H spec.

What changes are included in this PR?

A `LIMIT` clause was added to TPC-H queries 2, 3, 10, 18, and 21.

Are these changes tested?

I re-ran all of the affected queries locally to double-check their row counts.
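A check along these lines could be scripted. The sketch below is an assumption-laden illustration, not the procedure used for this PR: `SPEC_ROW_LIMITS` and `row_count_ok` are hypothetical names, and the per-query limits are the values from the TPC-H query definitions as best recalled, so they should be verified against the spec before being relied on.

```python
# Row limits per query, as recalled from the TPC-H query definitions.
# Assumption: verify these against the spec before relying on them.
SPEC_ROW_LIMITS = {2: 100, 3: 10, 10: 20, 18: 100, 21: 100}


def row_count_ok(query_id, row_count):
    """Return True if row_count does not exceed the limit for query_id.

    Queries without a limit in SPEC_ROW_LIMITS always pass.
    """
    limit = SPEC_ROW_LIMITS.get(query_id)
    return limit is None or row_count <= limit
```

For example, `row_count_ok(3, 10)` passes, while `row_count_ok(3, 11)` flags a query 3 result that returned more rows than the limit allows.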

Are there any user-facing changes?

Users will now see the correct row counts when running TPC-H benchmarks.

@jonathanc-n left a comment

Thanks @0ax1 this looks good to me!

@2010YOUY01 left a comment

Thanks for the catch! I checked https://github.com/duckdb/duckdb/tree/main/extension/tpch/dbgen/queries and those limits are the same.

@xudong963 xudong963 merged commit 8b9b2fc into apache:main Jun 3, 2025
27 checks passed

Development

Successfully merging this pull request may close these issues.

TPC-H queries used in DataFusion are missing limit clause

4 participants