Skip to content

Pandas dataframes created with pd.read_sql using SQLAlchemy select can't be displayed #6861

@zaneselvans

Description

@zaneselvans

Describe the bug

I am sometimes encountering a behavior where when I try to display a pandas dataframe by entering the variable name and nothing else in a cell, I get no output and also no errors. If I run len(df) I can see that it has a few thousand rows.

If I run df.columns() I can see its columns:

Index(['metrics_date', 'path', 'title', 'unique_views'], dtype='object')

I thought that the fact that the column index was made of objects might be the problem, but other dataframes which don't have the display problem also have a column index composed of objects, so I guess not.

df.info() will show me a summary:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4040 entries, 0 to 4039
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   metrics_date  4040 non-null   datetime64[ns]
 1   path          4040 non-null   object        
 2   title         4040 non-null   object        
 3   unique_views  4040 non-null   int64         
dtypes: datetime64[ns](1), int64(1), object(2)
memory usage: 126.4+ KB

print(df) will show me a text rendition of the dataframe

     metrics_date                                               path  \
0      2023-08-22                         /catalyst-cooperative/pudl   
1      2023-08-22                   /catalyst-cooperative/pudl/pulls   
2      2023-08-22                  /catalyst-cooperative/pudl/issues   
3      2023-08-22             /catalyst-cooperative/pudl/issues/2755   
4      2023-08-22               /catalyst-cooperative/pudl/pull/2789   
...           ...                                                ...   
4035   2025-09-10  /catalyst-cooperative/pudl/blob/main/src/pudl/...   
4036   2025-09-10             /catalyst-cooperative/pudl/issues/1269   
4037   2025-09-10             /catalyst-cooperative/pudl/issues/1859   
4038   2025-09-10             /catalyst-cooperative/pudl/issues/2875   
4039   2025-09-10             /catalyst-cooperative/pudl/issues/3794   

                                                  title  unique_views  
0     GitHub - catalyst-cooperative/pudl: The Public...            80  
1     Pull requests · catalyst-cooperative/pudl · Gi...            11  
2           Issues · catalyst-cooperative/pudl · GitHub            18  
3     FERC Form 1 Statement of Income Contains Holes...             8  
4     Apply new naming convention to raw and core in...             2  
...                                                 ...           ...  
4035  pudl/src/pudl/transform/ferc714.py at main · c...             1  
4036  Consolidate EIA fuel type codes in metadata · ...             1  
4037  Phase II XBRL Extract Infrastructure · Issue #...             1  
4038  ValueError: cannot set a frame with no defined...             1  
4039  Connect local superset deployment to Cloud SQL...             1  

but marimo seems unable to display it natively.

If I wrap the dataframe in mo.ui.dataframe() then I do see some errors:

Image

The dataframe having this issue is being created with pd.read_sql() using SQLAlchemy column selectors. Here gh_paths is a SQLAlchemy Table read from an SQLAlchemy MetaData object.

gh_paths = metadata.tables["core_github_popular_paths"]
paths_full = pd.read_sql(
    sa.select(
        gh_paths.c.metrics_date,
        gh_paths.c.path,
        gh_paths.c.title,
        gh_paths.c.unique_views,
    ),
    engine,
)

At first I thought this might be related to the switch in marimo v0.17.0 to using narwahls for mo.ui.dataframe but I tested it in both v0.16.5 and v0.15.5 and am still getting the same behavior.

Will you submit a PR?

  • Yes

Environment

{
  "marimo": "0.17.0",
  "editable": false,
  "location": "/app/.venv/lib/python3.13/site-packages/marimo",
  "OS": "Linux",
  "OS Version": "6.15.11-orbstack-00542-g4f455d264886",
  "Processor": "",
  "Python Version": "3.13.9",
  "Locale": "en_US",
  "Binaries": {
    "Browser": "--",
    "Node": "--"
  },
  "Dependencies": {
    "click": "8.3.0",
    "docutils": "0.22.2",
    "itsdangerous": "2.2.0",
    "jedi": "0.19.2",
    "markdown": "3.9",
    "narwhals": "2.9.0",
    "packaging": "25.0",
    "psutil": "7.1.1",
    "pygments": "2.19.2",
    "pymdown-extensions": "10.16.1",
    "pyyaml": "6.0.3",
    "starlette": "0.48.0",
    "tomlkit": "0.13.3",
    "typing-extensions": "4.15.0",
    "uvicorn": "0.38.0",
    "websockets": "15.0.1"
  },
  "Optional Dependencies": {
    "pandas": "2.3.3",
    "polars": "1.34.0",
    "pyarrow": "21.0.0",
    "loro": "1.8.1",
    "python-lsp-server": "1.13.1",
    "ruff": "0.14.1"
  },
  "Experimental Flags": {}
}

Code to reproduce

This uses SQLite + SQLAlchemy. In my use case I am actually reading from CloudSQL, but the SQLite behavior is very similar. The only difference that I see is that in my real notebook, when I attempt to display the dataframe I get nothing being displayed at all, whereas in the example below it shows an empty dataframe of the same dimensions as the real dataframe.

# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "pandas==2.3.3",
#     "sqlalchemy==2.0.44",
# ]
# ///

import marimo

__generated_with = "0.17.0"
app = marimo.App(width="full")


@app.cell
def _():
    import marimo as mo
    import pandas as pd
    import sqlalchemy as sa
    from datetime import datetime, timedelta
    from pathlib import Path

    # Create SQLite database with test table
    engine = sa.create_engine("sqlite:///:memory:")
    metadata = sa.MetaData()

    # Define test table similar to core_github_popular_paths
    test_table = sa.Table("test_github_paths", metadata, sa.Column("path", sa.String(100)))

    # Create the table
    metadata.create_all(engine)

    # Insert test data
    test_data = [{"path": "/pudl/issues/101"}]

    # Insert data using SQLAlchemy
    with engine.connect() as conn:
        conn.execute(sa.insert(test_table), test_data)
        conn.commit()

    # Read the dataframe from the test table
    test_df = pd.read_sql("test_github_paths", engine)
    return mo, pd, test_data, test_df


@app.cell
def _(test_df):
    # dataframe should be displayed, but no output appears.
    test_df
    return


@app.cell
def _(mo, test_df):
    # Explicitly using the dataframe UI element results in errors.
    mo.ui.dataframe(test_df)
    return


@app.cell
def _(test_df):
    test_df.info()
    return


@app.cell
def _(mo, test_df):
    # There's data in there, I swear!
    mo.Html(test_df.to_html())
    return


@app.cell
def _(pd, test_data):
    # The same test data used to create a Pandas dataframe directly displays fine.
    good_df = pd.DataFrame(test_data)
    good_df
    return (good_df,)


@app.cell
def _(good_df, pd, test_df):
    # Pandas seems to think the two dataframes are identical.
    pd.testing.assert_frame_equal(test_df, good_df)
    return


if __name__ == "__main__":
    app.run()

Environment for repoducible example

{
  "marimo": "0.17.0",
  "editable": false,
  "location": "/Users/zane/code/catalyst/metrics-dashboard/.venv/lib/python3.13/site-packages/marimo",
  "OS": "Darwin",
  "OS Version": "25.0.0",
  "Processor": "arm",
  "Python Version": "3.13.9",
  "Locale": "en_US",
  "Binaries": {
    "Browser": "141.0.7390.108",
    "Node": "v24.4.1"
  },
  "Dependencies": {
    "click": "8.3.0",
    "docutils": "0.22.2",
    "itsdangerous": "2.2.0",
    "jedi": "0.19.2",
    "markdown": "3.9",
    "narwhals": "2.9.0",
    "packaging": "25.0",
    "psutil": "7.1.1",
    "pygments": "2.19.2",
    "pymdown-extensions": "10.16.1",
    "pyyaml": "6.0.3",
    "starlette": "0.48.0",
    "tomlkit": "0.13.3",
    "typing-extensions": "4.15.0",
    "uvicorn": "0.38.0",
    "websockets": "15.0.1"
  },
  "Optional Dependencies": {
    "pandas": "2.3.3",
    "polars": "1.34.0",
    "pyarrow": "21.0.0",
    "loro": "1.8.1",
    "python-lsp-server": "1.13.1",
    "ruff": "0.14.1"
  },
  "Experimental Flags": {
    "sql_linter": true
  }
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions