Commit 3ba7350
feat: Support recursive queries with a distinct 'UNION' (#18254)
Rely on the aggregate `GroupValues` abstraction to build a hash table of
the emitted rows, which is used to deduplicate them.
We might make things a bit more efficient by writing a hash table
wrapper just for deduplication, but this implementation should give a
fair baseline.
## Which issue does this PR close?
- Closes #18140.
## Rationale for this change
Implements deduplicating recursive CTEs (i.e. `UNION` inside of `WITH
RECURSIVE`) using a hash table. I reuse the one from aggregates to avoid
rebuilding a full wrapper and its per-type specializations. Each time a
batch is returned by the static or the recursive term of the CTE, the
hash table is used to remove already-seen rows; the remaining rows are
then emitted and kept in memory for the next recursion step.
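
The per-batch filtering described above can be pictured with a minimal sketch. It is not the code from this PR: it assumes rows are plain `Vec<i64>` values and uses a standard-library `HashSet` in place of the `GroupValues` hash table, and the names `Row` and `dedup_batch` are hypothetical.

```rust
use std::collections::HashSet;

/// A row is modelled as a vector of i64 column values. This is an assumption
/// made for the sketch; the real operator works on Arrow record batches.
type Row = Vec<i64>;

/// Keeps only the rows of `batch` that have not been seen before and records
/// every kept row in `seen`, so that later batches skip it as well.
/// (Hypothetical helper, standing in for the `GroupValues`-based lookup.)
fn dedup_batch(seen: &mut HashSet<Row>, batch: Vec<Row>) -> Vec<Row> {
    batch
        .into_iter()
        // `insert` returns true only when the row was not already in the set,
        // which also removes duplicates inside a single batch.
        .filter(|row| seen.insert(row.clone()))
        .collect()
}
```

The same `seen` set would be shared by the static term and by every recursion step, so a row emitted once is never emitted or fed back into the recursion again.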
## What changes are included in this PR?
Reusing `GroupValues` trait implementations inside of
`RecursiveQueryExec` to get deduplication working.
## Are these changes tested?
Yes, some sqllogictests have been added, including ones that would lead
to infinite recursion if deduplication were disabled.
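
To illustrate why deduplication matters for termination, here is a small sketch of a reachability query over a cyclic graph, written as a plain loop rather than with DataFusion. The function `reachable` and the edge-list representation are assumptions made for illustration only; they are not taken from the added tests.

```rust
use std::collections::HashSet;

/// Computes the nodes reachable from `start`, mimicking the shape of a
/// recursive CTE with `UNION`: a static term seeds the working set and the
/// recursive term follows edges from the rows produced by the previous step.
fn reachable(edges: &[(u32, u32)], start: u32) -> Vec<u32> {
    let mut seen: HashSet<u32> = HashSet::new();
    let mut output = Vec::new();

    // Static term: the starting node.
    seen.insert(start);
    output.push(start);
    let mut working = vec![start];

    // Recursive term: runs until a step produces no new rows.
    while !working.is_empty() {
        let mut next = Vec::new();
        for &node in &working {
            for &(src, dst) in edges {
                // Only rows never seen before are emitted and fed back in.
                // Without this check a cycle would keep the loop running.
                if src == node && seen.insert(dst) {
                    next.push(dst);
                    output.push(dst);
                }
            }
        }
        working = next;
    }
    output
}
```

With the cyclic edges `(1, 2)`, `(2, 3)`, `(3, 1)` and start node `1`, the loop stops once the edge back to node 1 yields no new row.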
## Are there any user-facing changes?
No
---------
Co-authored-by: Andrew Lamb <[email protected]>
1 parent 10706ae commit 3ba7350
File tree
4 files changed, +171 -33 lines changed
- datafusion
  - core/tests/data/recursive_cte
  - expr/src/logical_plan
  - physical-plan/src
  - sqllogictest/test_files