Enhancement/validations on update by janrth · Pull Request #541 · Nixtla/mlforecast

janrth · 2025-12-14T19:02:54Z

Tries to solve #358

Validates that the update df has the expected shape: each unique_id must start right after the last ds seen in the previously stored df and must contain the expected number of ds values.

For each unique_id, the number of ds points in the update df is counted and compared with the expected number of points, which is calculated from the estimated start and end dates. The estimated start date is the stored last date of the series + offset(freq).
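A minimal sketch of the count comparison described above, written against plain pandas rather than the mlforecast internals. The function name `count_report`, the `last_dates` Series (id -> last stored timestamp), and the use of the largest ds in the update as the end date are assumptions made for illustration, not the code in this PR.

```python
import pandas as pd


def count_report(update_df, last_dates, freq, id_col="unique_id", time_col="ds"):
    """Compare observed vs expected number of timestamps per known series.

    last_dates is a hypothetical pd.Series mapping id -> last stored timestamp.
    """
    offset = pd.tseries.frequencies.to_offset(freq)
    # Observed number of rows and last timestamp per series in the update.
    obs = update_df.groupby(id_col)[time_col].agg(observed="count", end="max")
    # Inner join keeps only series that were already stored.
    obs = obs.join(last_dates.rename("last"), how="inner")
    # Expected points: every step from (last + offset) up to the update's end.
    obs["expected"] = [
        len(pd.date_range(last + offset, end, freq=offset))
        for last, end in zip(obs["last"], obs["end"])
    ]
    return obs[["observed", "expected"]]
```

Rows where `observed` differs from `expected` point to a gap or a wrong start. A pure count comparison cannot catch every malformed update (for example, a duplicated timestamp can mask a missing one), so it is best read as a cheap first check.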

The validate_input step can be turned off. The check is fast overall, but it might become a nuisance with hundreds of millions of rows.

Initially I only checked whether the first ds of the update is in the future for each unique_id, but that did not check much, so I implemented stronger logic. The issue itself is a bit vague and I am open to any changes, since I implemented this based on my interpretation of the task.

Description

Tries to implement more checks on update df

Checklist:

This PR has a meaningful title and a clear description.
The tests pass.
All linting tasks pass.
The notebooks are clean.

janrth · 2025-12-14T19:04:22Z

@jmoralez I tried to tackle #358

codspeed-hq · 2025-12-28T20:17:37Z

Merging this PR will not alter performance

✅ 12 untouched benchmarks

Comparing janrth:enhancement/validations_on_update (f4cc87e) with main (e1f281a)

nasaul

This is a good start for this issue; however, it needs more thought about which validation is actually being done in the function.

Should we validate at the aggregate level or per individual series? If we validate per series, we should use something like:

for uid in df[self.id_col].unique():
    if uid in self.uids:
        expected_start = self.last_dates[uid] + offset(self.freq)
        actual_start = df[df[self.id_col] == uid][self.time_col].min()
        if actual_start != expected_start:
            raise ValueError(f"Series {uid} starts at {actual_start}, expected {expected_start}")
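The per-series check suggested above can also be sketched as a standalone function, using a groupby instead of filtering the frame once per id. This is an illustration only: `check_update_starts` and the `last_dates` Series (id -> last stored timestamp) are hypothetical names, and the offset comes from pandas' `to_offset` rather than from the library's own helpers.

```python
import pandas as pd


def check_update_starts(update_df, last_dates, freq,
                        id_col="unique_id", time_col="ds"):
    """Raise ValueError if a known series does not start one step after its last date."""
    offset = pd.tseries.frequencies.to_offset(freq)
    # First timestamp of every series in the update.
    actual = update_df.groupby(id_col)[time_col].min()
    # Only series that were already stored can be compared; new ones are skipped.
    known = actual.index.intersection(last_dates.index)
    actual = actual.loc[known]
    expected = last_dates.loc[known] + offset
    bad = actual[actual != expected]
    if not bad.empty:
        uid = bad.index[0]
        raise ValueError(
            f"Series {uid} starts at {bad.loc[uid]}, expected {expected.loc[uid]}"
        )
```

Series that appear only in the update are treated as new and pass through unchecked, which is one possible policy for the "New series" case.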

I think we should focus on the tests first and build up the functionality from there. The tests should include:

Valid continuous updates
Invalid gaps in data
Invalid starting dates
New series
Different frequencies
Both pandas and polars DataFrames

Feel free to discuss whether the proposed tests are good enough or whether we should also focus on other validations.

nasaul

Overall this looks good; however, before merging you should address the following:

Polars categorical encoding mismatch (core.py): all 5 Polars tests fail. The join operation fails because the categorical columns have different encodings. It needs a string cast before the join, as the pandas branch does.
Type hint is wrong (core.py): it says pd.DataFrame but should be DataFrame, since the function handles both pandas and polars.
Add a docstring for the validate_input parameter in both forecast.py and core.py.

nasaul

Hey Jan, I've updated the tests to cover different frequencies, and the solution doesn't hold. We have to use the offset from utilsforecast (ufp.offset_times) to make it work.
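The comment does not say which frequencies broke the earlier approach. One general reason frequency handling is delicate is that a step is not always a fixed amount of time, which the pandas-only sketch below illustrates. It does not call ufp.offset_times; the helper `expected_start` is hypothetical.

```python
import pandas as pd


def expected_start(last_date, freq):
    """Return the timestamp one calendar-aware step after last_date."""
    return last_date + pd.tseries.frequencies.to_offset(freq)


jan = pd.Timestamp("2024-01-01")
feb = expected_start(jan, "MS")  # month-start frequency
mar = expected_start(feb, "MS")

# The same "one step" covers a different number of days each time.
print((feb - jan).days, (mar - feb).days)
```

With a month-start frequency the first step spans 31 days and the second 29, so any logic that assumes a constant timedelta per step will miscount the expected dates.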

nasaul

LGTM

janrth and others added 3 commits December 12, 2025 00:13

validates format of update df

2a989a2

validates format of update df

e4ebeca

Merge branch 'Nixtla:main' into enhancement/validations_on_update

ea15e44

rebase

8304450

janrth and others added 3 commits January 3, 2026 00:11

Merge branch 'main' into enhancement/validations_on_update

770f934

fix type annotations for bool

e2c13f0

fixing type annotation in forecast.py

cd774d8

nasaul requested changes Jan 5, 2026

Comment thread mlforecast/core.py Outdated

Comment thread mlforecast/core.py Outdated

Comment thread mlforecast/core.py Outdated

Comment thread mlforecast/core.py

Comment thread mlforecast/core.py

janrth and others added 2 commits January 13, 2026 21:58

add full validations for update function plus tests

cd6cd48

Merge branch 'main' into enhancement/validations_on_update

f80651d

nasaul reviewed Jan 14, 2026

janrth and others added 3 commits January 15, 2026 15:41

fix type annotations and pl categorical encoding bug

376ef17

Merge branch 'enhancement/validations_on_update' of https://github.com/janrth/mlforecast into enhancement/validations_on_update t pull

42f88ca

Adds tests for more frequencies

8a44db1

nasaul reviewed Jan 16, 2026

using offset_times from utilsforecast in core functionality for validate update function

f4cc87e

nasaul approved these changes Jan 19, 2026

nasaul merged commit ea35835 into Nixtla:main Jan 19, 2026
21 checks passed

nasaul merged 13 commits into Nixtla:main from janrth:enhancement/validations_on_update

janrth commented Dec 14, 2025 •

edited

Loading

Uh oh!

janrth commented Dec 14, 2025

Uh oh!

codspeed-hq Bot commented Dec 28, 2025 •

edited

Loading

Uh oh!

nasaul left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nasaul left a comment

Uh oh!

nasaul left a comment

Uh oh!

nasaul left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

janrth commented Dec 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

janrth commented Dec 14, 2025

Uh oh!

codspeed-hq Bot commented Dec 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Uh oh!

nasaul left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nasaul left a comment

Choose a reason for hiding this comment

Uh oh!

nasaul left a comment

Choose a reason for hiding this comment

Uh oh!

nasaul left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

janrth commented Dec 14, 2025 •

edited

Loading

codspeed-hq Bot commented Dec 28, 2025 •

edited

Loading