-
Notifications
You must be signed in to change notification settings - Fork 607
Description
Describe the feature
Introduce a rolling_window macro that allows users to calculate trailing-window aggregates (e.g., 7-day rolling sums) by duplicating fact table rows into the preceding N days and then grouping.
Describe alternatives you've considered
The problem with WINDOW functions is that the table may be missing rows for certain days, then some data will be also missing from the output (even though it shouldn't).
Example: Daily Rainfall Averages
We have a rainfall_daily with the average rainfall per city per day, but only when it rained.
| city | day_dt | mm_rain |
|---|---|---|
| Oslo | 2023-08-25 | 4.2 |
| Oslo | 2023-08-27 | 3.5 |
| Oslo | 2023-08-30 | 6.1 |
Now our task is to calculate the 7-day rolling rainfall average. If we use a window function (SUM(mm_rain) OVER (PARTITION BY city ORDER BY day_dt RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)),
we only get rows for the dates in the source table (25, 27, 30). But we actually want results for every date (25 → 31), so that a dashboard chart shows a continuous line. On August 31, for example, Oslo’s rainfall average should include rain on the 25th, 27th, and 30th (even though Aug 31 has no row in the table).
| city | end_date | start_date | sum_mm_rain | avg_mm_rain |
|---|---|---|---|---|
| Oslo | 2023-08-25 | 2023-08-19 | 4.2 | 0.6 |
| Oslo | 2023-08-26 | 2023-08-20 | 4.2 | 0.6 |
| Oslo | 2023-08-27 | 2023-08-21 | 7.7 | 1.1 |
| Oslo | 2023-08-28 | 2023-08-22 | 7.7 | 1.1 |
| Oslo | 2023-08-29 | 2023-08-23 | 7.7 | 1.1 |
| Oslo | 2023-08-30 | 2023-08-24 | 13.8 | 2.0 |
| Oslo | 2023-08-31 | 2023-08-25 | 13.8 | 2.0 |
Additional context
To my knowledge, this isn't database-specific.
Who will this benefit?
Any place where a rolling window is needed, but the source table is sparse (missing records for certain combination of unique keys).
Are you interested in contributing this feature?
Yes