refactor(rust): Adjust size of `dt.*` methods to Int64 by dylanfair · Pull Request #27411 · pola-rs/polars

dylanfair · 2026-04-25T23:42:20Z

Addresses #27162

No AI was used with this PR.

Local tests:

Wasn't sure what Angular convention to use so I landed on "refactor" since this isn't exactly a new feature nor an explicit bug fix, but happy to adjust the title accordingly.

Changes made

This PR adjusts the following dt.* methods to return an i64 value in order to minimize easy overflows when using them in expressions:

dt.hour()
dt.minute()
dt.second()
dt.millisecond()
dt.microsecond()
dt.nanosecond()
dt.year()
dt.iso_year()
dt.month()
dt.day()
dt.days_in_month()
dt.ordinal_day()
dt.week()
dt.weekday()
dt.millennium()
dt.century()

As it currently stands the following MRE which showcases dt.hour() used within an expression will overflow like so:

import datetime
import polars as pl

df = pl.DataFrame({"timestamp": [datetime.datetime(2026, 1, 1, 4, 35)]})
df = df.with_columns(
    [
        (pl.col("timestamp").dt.hour().mul(60) + pl.col("timestamp").dt.hour()).alias(
            "calculate"
        ),
    ]
)

print(df)

shape: (1, 2)
┌─────────────────────┬───────────┐
│ timestamp           ┆ calculate │
│ ---                 ┆ ---       │
│ datetime[μs]        ┆ i8        │
╞═════════════════════╪═══════════╡
│ 2026-01-01 04:35:00 ┆ -12       │
└─────────────────────┴───────────┘

Now we get the following result:

shape: (1, 2)
┌─────────────────────┬───────────┐
│ timestamp           ┆ calculate │
│ ---                 ┆ ---       │
│ datetime[μs]        ┆ i64       │
╞═════════════════════╪═══════════╡
│ 2026-01-01 04:35:00 ┆ 244       │
└─────────────────────┴───────────┘

Alternatives

I'll admit a more nuanced approach such as checking for an overflow and upcasting accordingly might be more appropriate. I'm not nearly familiar enough with the codebase yet to know how to implement that, so I'm happy to defer to others if that's the better direction. I'm sure there are performance implications there as well that I'm fully unaware of even if I were to attempt it.

Another alternative might be to instead raise some kind of error if the overflow occurs, and prompt the user to upcast themselves like:

df = df.with_columns(
    [
        (
            pl.col("timestamp").dt.hour().cast(pl.Int64).mul(60)
            + pl.col("timestamp").dt.hour().cast(pl.Int64)
        )
        .cast(pl.Int64)
        .alias("calculate"),
    ]
)
print(df)

As it currently stands I think it's too easy for someone to not even realize the overflow has occurred if they are running an expression over a dataset with thousands of rows. This is a valid sharp edge someone can land on as mentioned in the issue. The lack of knowledge that the overflow has happened can be fairly problematic as the data at that point is no longer accurate, so any analysis using that data can draw incorrect conclusions.

codecov · 2026-04-26T04:08:12Z

Codecov Report

❌ Patch coverage is 96.77419% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.19%. Comparing base (693c520) to head (a50afaf).

Files with missing lines	Patch %	Lines
crates/polars-time/src/chunkedarray/kernels.rs	50.00%	4 Missing ⚠️
crates/polars-time/src/series/mod.rs	95.23%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #27411      +/-   ##
==========================================
- Coverage   81.19%   81.19%   -0.01%     
==========================================
  Files        1832     1832              
  Lines      253247   253245       -2     
  Branches     3176     3176              
==========================================
- Hits       205626   205613      -13     
- Misses      46798    46809      +11     
  Partials      823      823

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

dylanfair · 2026-04-26T04:39:09Z

After looking more at the get_field() method in crates/polars-plan/src/plans/aexpr/function_expr/datetime.rs, I'm thinking it might not be necessary to alter the dtypes of the dt.* methods themselves. Just making changes there and nowhere else seems to cast expressions that use the dt.* methods into i64, leaving something like this with our original example:

shape: (1, 3)
┌─────────────────────┬──────┬───────────┐
│ timestamp           ┆ hour ┆ calculate │
│ ---                 ┆ ---  ┆ ---       │
│ datetime[μs]        ┆ i8   ┆ i64       │
╞═════════════════════╪══════╪═══════════╡
│ 2026-01-01 04:35:00 ┆ 4    ┆ 244       │
└─────────────────────┴──────┴───────────┘

However I'm seeing that without adjusting the dtypes of the dt.* methods to match (like they are in this PR), something like this then starts to error:

df = df.with_columns(
    [
        (pl.col("timestamp").dt.hour() // 10 * 10).alias("errors"),
    ]
)

Which displays the following:

The application panicked (crashed).
Message:  implementation error, cannot get ref Int8 from Int64
Location: crates/polars-core/src/series/mod.rs:1125

Backtrace omitted.

Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
Traceback (most recent call last):
  File "/home/dylanfair/code_projects/polars/sandbox.py", line 16, in <module>
    df = df.with_columns(
         ^^^^^^^^^^^^^^^^
  File "/home/dylanfair/code_projects/polars/py-polars/src/polars/dataframe/frame.py", line 10515, in with_columns
    .collect(optimizations=QueryOptFlags._eager())
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dylanfair/code_projects/polars/py-polars/src/polars/_utils/deprecation.py", line 97, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dylanfair/code_projects/polars/py-polars/src/polars/lazyframe/opt_flags.py", line 343, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dylanfair/code_projects/polars/py-polars/src/polars/lazyframe/frame.py", line 2510, in collect
    return wrap_df(ldf.collect(engine, callback))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: implementation error, cannot get ref Int8 from Int64

This seems to be the more correct string to start pulling so as to try and keep the dt.* methods with their current dtypes while relaxing the dtype that comes out of their use in expressions - I might start looking into that instead.

dylanfair · 2026-04-26T20:57:13Z

Closing this pre-emptively as I think this is a heavy-handed approach to the problem and could produce backwards compatible issues for those with existing data pipelines (i.e. unexpected memory spike from i8 -> i64 jump).

dylanfair added 7 commits April 4, 2026 17:08

dt.hour() returns i64 instead of i8

f9e7e33

merge in latest from main

2de15e3

changed dtypes for several, working through test fails

494a573

fixed tests

4d15dd1

updated docstrings in datetime

82a246b

clippy fixes

fcff1f3

changed traits to i64

dd0172e

dylanfair requested review from MarcoGorelli, alexander-beedie, c-peters, orlp and ritchie46 as code owners April 25, 2026 23:42

github-actions Bot added internal An internal refactor or improvement rust Related to Rust Polars first-contribution First contribution by user labels Apr 25, 2026

dylanfair added 2 commits April 25, 2026 20:19

resolve missed python doc tests

bc13738

fixed issue with group_by plans

a50afaf

dylanfair closed this Apr 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(rust): Adjust size of `dt.*` methods to Int64#27411

refactor(rust): Adjust size of `dt.*` methods to Int64#27411
dylanfair wants to merge 9 commits intopola-rs:mainfrom
dylanfair:datetime-i64

dylanfair commented Apr 25, 2026

Uh oh!

codecov Bot commented Apr 26, 2026

Uh oh!

dylanfair commented Apr 26, 2026

Uh oh!

dylanfair commented Apr 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dylanfair commented Apr 25, 2026

Changes made

Alternatives

Uh oh!

codecov Bot commented Apr 26, 2026

Codecov Report

Uh oh!

dylanfair commented Apr 26, 2026

Uh oh!

dylanfair commented Apr 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant