What happened:
When executing `SELECT * FROM table LIMIT 10` against a table created from a parquet file, the full dataset is read and persisted on the cluster at query execution time.
What you expected to happen:
Nothing to be read at query execution; when the user does decide to persist/compute the result, only the relevant subset of the data should be read in.
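For comparison, plain dask_cudf already behaves lazily in this situation (a minimal sketch, assuming the same test_data.parquet file as in the example below):

import dask_cudf

# Building the graph reads no data; read_parquet is lazy
ddf = dask_cudf.read_parquet("test_data.parquet")

# head() only loads from the partition(s) it actually needs,
# not the full dataset
subset = ddf.head(10)

I would expect `c.sql("SELECT * from test LIMIT 10")` to behave similarly.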
Minimal Complete Verifiable Example:
from dask_cuda import LocalCUDACluster
from distributed import Client
import dask
import dask_cudf
from dask_sql import Context

write_data = False

if __name__ == "__main__":
    cluster = LocalCUDACluster()
    client = Client(cluster)
    c = Context()

    # Only needed once, to generate the test dataset
    if write_data:
        dask.datasets.timeseries(start="2022-01-01", end="2024-01-01").to_parquet("test_data.parquet")

    ddf = dask_cudf.read_parquet("test_data.parquet")
    c.create_table("test", ddf, persist=False)

    # This persists the whole dataset in memory: even though the result
    # has only 10 rows, all of the underlying data ends up on the workers
    res = c.sql("SELECT * from test LIMIT 10")
Anything else we need to know?:
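A rough way to observe the over-eager persist (a sketch using the standard distributed client API; the variable names refer to the example above):

# Run after res = c.sql("SELECT * from test LIMIT 10")

# has_what() maps each worker address to the keys it currently holds;
# when the bug triggers, keys for every input partition show up here
persisted = client.has_what()
print("keys held on the cluster:", sum(len(k) for k in persisted.values()))

# Total process memory across workers, from the scheduler's metrics
info = client.scheduler_info()
total = sum(w["metrics"]["memory"] for w in info["workers"].values())
print(f"worker memory in use: {total / 1e9:.2f} GB")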
Environment:
- dask-sql version: 2022.1.0
- Python version: 3.8
- Operating System: Ubuntu 18.04
- Install method (conda, pip, source): conda