Skip to content

Conversation

@Light2Dark
Copy link
Contributor

@Light2Dark Light2Dark commented Oct 23, 2025

📝 Summary

Some of these changes are opinionated so feel free to disagree.

  • increase chart width from 70 -> 80
  • removes feature flag. This is now the default. Remove legacy vega spec and code.
  • string charts <20k rows now show charts
  • stats for escaped column names were missing
  • very thin bars for temporal columns. changed to even-width bars
  • better hover visuals

before:
image

after:

CleanShot.2025-10-24.at.11.37.56.mp4

When using temporal type, the bar widths are proportioned to the values. this can make some bars have very small widths. The fix is to use ordinal. Additionally, there is better hovering behaviour.

🔍 Description of Changes

📋 Checklist

  • I have read the contributor guidelines.
  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • I have added tests for the changes made.
  • I have run the code and verified that it works as expected.

@vercel
Copy link

vercel bot commented Oct 23, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
marimo-docs Ready Ready Preview Comment Oct 29, 2025 2:14am

@Light2Dark Light2Dark marked this pull request as draft October 23, 2025 08:52
mscolnick
mscolnick previously approved these changes Oct 23, 2025
Copy link
Contributor

@akshayka akshayka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The desired behavior for a histogram is that each bin represents the same range. Is that what is meant by

When using temporal type, the bar widths are proportioned to the values.

and if so, then we shouldn't make this change. But I agree that the plots in the "before" image look ugly.

Can you explain in more detail what the original problem is and how this change fixes it? I don't quite understand.

@akshayka
Copy link
Contributor

@Light2Dark , additionally can you provide a sample notebook (in smoke_tests) where I can look at the before and after behavior?

@Light2Dark
Copy link
Contributor Author

Light2Dark commented Oct 23, 2025

The desired behavior for a histogram is that each bin represents the same range. Is that what is meant by

When using temporal type, the bar widths are proportioned to the values.

and if so, then we shouldn't make this change. But I agree that the plots in the "before" image look ugly.

Can you explain in more detail what the original problem is and how this change fixes it? I don't quite understand.

I should preface by saying this is a bug when turning on performant_table_charts feature flag. We use temporal type for the chart which causes the bar width to be proportioned to the range. If using ordinal type, the bar widths are the same regardless of the duration they occupy.

I like the proportioned bars idea, but it looks ugly when one bar takes up a lot of space. This alternative, I think simplest and matches non feature flag behaviour. I'm not sure if narwhals v2 has changed anything to offer better alternatives for temporal charting.

will create a sample notebook!

@akshayka
Copy link
Contributor

Thanks so much for the additional context!

I should preface by saying this is a bug when turning on performant_table_charts feature flag.

Got it.

We use temporal type for the chart which causes the bar width to be proportioned to the range. If using ordinal type, the bar widths are the same regardless of the duration they occupy.

I like the proportioned bars idea, but it looks ugly when one bar takes up a lot of space.

For histograms (and anything in our product conveying data / data distributions), we should prioritize correctness over aesthetics; that said I'm happy to look at the sample notebook, that'll make this a lot more concrete for me.

will create a sample notebook!

Thank you!

@Light2Dark Light2Dark changed the title fix time charts having too small bars, and better opacity hover improve performant table charts, remove feat flag arguments Oct 24, 2025
@Light2Dark Light2Dark added the bash-focus Area to focus on during release bug bash label Oct 24, 2025
@Light2Dark Light2Dark changed the title improve performant table charts, remove feat flag arguments improve performant table charts, remove feat flag Oct 24, 2025
@Light2Dark
Copy link
Contributor Author

I created a notebook here: marimo/_smoke_tests/tables/column-header-chart.py Made a decision to increase the chart width but feel free to disagree.

I think we can get this in for the following release (not today)

@akshayka
Copy link
Contributor

I created a notebook here: marimo/_smoke_tests/tables/column-header-chart.py Made a decision to increase the chart width but feel free to disagree.

I think we can get this in for the following release (not today)

Taking a look, thank you for putting this together

@akshayka
Copy link
Contributor

@Light2Dark

I compared 0.17.2, with performant table charts disabled, against this PR. Can you help me understand how performant table charts treat numerical columns when drawing histograms? It seems like the bin width can be variable?

For example, consider the following data:

data = [1, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5,  8, 9, 10] 
df = pl.DataFrame({"data": data})
df

(A) 0.17.2, without performant table charts.

image

(B) This PR/performant table charts.

image

It looks like (B) doesn't actually represent the correct data distribution? The 1-2 and 2-3 bars don't like right? When I mouse over the bins, it seems to be because bin widths/whether to include the boundary of a bin is inconsistent in (B), but I'm not sure why?

Before removing the feature flag we should make sure that we don't distort data distributions. In my opinion histograms should always have a single fixed bin width. If I am misunderstanding something please let me know!

@Light2Dark
Copy link
Contributor Author

Light2Dark commented Oct 27, 2025

Hmm, it shouldn't be inconsistent 🤔. This is the raw bin calculation

[
  "BinValue(bin_start=1, bin_end=2.0, count=6)",
  "BinValue(bin_start=2.0, bin_end=3.0, count=5)",
  "BinValue(bin_start=3.0, bin_end=4.0, count=6)",
  "BinValue(bin_start=4.0, bin_end=5.0, count=8)",
  "BinValue(bin_start=5.0, bin_end=6.0, count=0)",
  "BinValue(bin_start=6.0, bin_end=7.0, count=0)",
  "BinValue(bin_start=7.0, bin_end=8.0, count=1)",
  "BinValue(bin_start=8.0, bin_end=9.0, count=1)",
  "BinValue(bin_start=9.0, bin_end=10.0, count=1)"
]

Looks like narwhals uses left-inclusive, right-inclusive for the first bin (to capture the minimum), and then left-inclusive, right exclusive for the rest. So, 1-2 has six values. But 2-3 doesn't include the last 3 so five values.

python code:

data = [1, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5,  8, 9, 10] 
df = pl.DataFrame({"data": data})

from marimo._plugins.ui._impl.tables.narwhals_table import NarwhalsTableManager

DEFAULT_BIN_SIZE = 9
manager = NarwhalsTableManager.from_dataframe(df)
manager.get_bin_values("data", DEFAULT_BIN_SIZE)

marking as draft because I realize the narwhals hist API is marked unstable

@Light2Dark Light2Dark marked this pull request as draft October 27, 2025 17:18
@akshayka
Copy link
Contributor

Looks like narwhals uses left-inclusive, right-inclusive for the first bin (to capture the minimum), and then left-inclusive, right exclusive for the rest. So, 1-2 has six values. But 2-3 doesn't include the last 3 so five values.

Got it, thank you for explaining. That should be okay then, really appreciate the explanation.

marking as draft because I realize the narwhals hist API is marked unstable

@Light2Dark, want to make an issue on the Narwhals repo, explain our use case, and ask Marco to what extent we can rely on the API, or if he can provide us a semi-private stable API? Hopefully we can find a solution

@Light2Dark
Copy link
Contributor Author

Light2Dark commented Oct 27, 2025

Looks like narwhals uses left-inclusive, right-inclusive for the first bin (to capture the minimum), and then left-inclusive, right exclusive for the rest. So, 1-2 has six values. But 2-3 doesn't include the last 3 so five values.

Got it, thank you for explaining. That should be okay then, really appreciate the explanation.

marking as draft because I realize the narwhals hist API is marked unstable

@Light2Dark, want to make an issue on the Narwhals repo, explain our use case, and ask Marco to what extent we can rely on the API, or if he can provide us a semi-private stable API? Hopefully we can find a solution

I could do that, we can also fallback to our legacy spec and handling. I removed it because it was cleaner. I think fine to fallback.

edit, actually we lose some perf by fallback. let me check with narwhals. narwhals-dev/narwhals#3249

Light2Dark and others added 8 commits October 29, 2025 09:58
…6895)

<!--
Provide a concise summary of what this pull request is addressing.

If this PR fixes any issues, list them here by number (e.g., Fixes
-->
Removes parameters used for performant_table_charts. We can use it by
default. Removes old data param.
Also enables string charts for <20k rows which imo is reasonable.

![CleanShot 2025-10-23 at 16 26
25](https://github.com/user-attachments/assets/662b01b6-1396-4f96-9097-fa0053c8bee4)

<!--
Detail the specific changes made in this pull request. Explain the
problem addressed and how it was resolved. If applicable, provide before
and after comparisons, screenshots, or any relevant details to help
reviewers understand the changes easily.
-->

- [x] I have read the [contributor
guidelines](https://github.com/marimo-team/marimo/blob/main/CONTRIBUTING.md).
- [ ] For large changes, or changes that affect the public API: this
change was discussed or approved through an issue, on
[Discord](https://marimo.io/discord?ref=pr), or the community
[discussions](https://github.com/marimo-team/marimo/discussions) (Please
provide a link if applicable).
- [ ] I have added tests for the changes made.
- [x] I have run the code and verified that it works as expected.
<!--
Provide a concise summary of what this pull request is addressing.

If this PR fixes any issues, list them here by number (e.g., Fixes
-->
For narwhals, get_bin_values is marked as unstable. So, we should
fallback to legacy data and spec handling.

https://narwhals-dev.github.io/narwhals/api-reference/series/#narwhals.series.Series.hist

<!--
Detail the specific changes made in this pull request. Explain the
problem addressed and how it was resolved. If applicable, provide before
and after comparisons, screenshots, or any relevant details to help
reviewers understand the changes easily.
-->

- [x] I have read the [contributor
guidelines](https://github.com/marimo-team/marimo/blob/main/CONTRIBUTING.md).
- [ ] For large changes, or changes that affect the public API: this
change was discussed or approved through an issue, on
[Discord](https://marimo.io/discord?ref=pr), or the community
[discussions](https://github.com/marimo-team/marimo/discussions) (Please
provide a link if applicable).
- [x] I have added tests for the changes made.
- [x] I have run the code and verified that it works as expected.
@dataclass
class ColumnSummaries:
# If precomputed aggregations fail, we fallback to chart data
data: Union[JSONType, str]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be optional? or would the json type be empty?

Copy link
Contributor Author

@Light2Dark Light2Dark Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, don't need to, JSONType has None:

    JSONType: TypeAlias = Union[
        Mapping[str, "JSONType"],
        Sequence["JSONType"],
        str,
        int,
        float,
        bool,
        object,
        MIME,  # MIME is a JSONType since we have a custom JSONEncoder for it
        None,
    ]

@Light2Dark Light2Dark marked this pull request as ready for review October 29, 2025 02:40
Copy link
Contributor

@mscolnick mscolnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work!

@Light2Dark
Copy link
Contributor Author

merging this in, can test during bug bash

@Light2Dark Light2Dark merged commit 15b793a into main Oct 29, 2025
34 of 48 checks passed
@Light2Dark Light2Dark deleted the sham/fix-time-chart-width-and-opacity branch October 29, 2025 13:01
@Light2Dark Light2Dark restored the sham/fix-time-chart-width-and-opacity branch October 30, 2025 17:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bash-focus Area to focus on during release bug bash enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants