Commit 3aa43e7

Docs
1 parent 0e241dd commit 3aa43e7

2 files changed, with 17 additions and 0 deletions

docs/source/loading.mdx

Lines changed: 16 additions & 0 deletions
@@ -220,6 +220,22 @@ Load a list of Python dictionaries with [`~Dataset.from_list`]:
 >>> dataset = Dataset.from_list(my_list)
 ```
 
+### Python generator
+
+Create a dataset from a Python generator with [`~Dataset.from_generator`]:
+
+```py
+>>> from datasets import Dataset
+>>> def my_gen():
+...     yield {"a": 1}
+...     yield {"a": 2}
+...     yield {"a": 3}
+...
+>>> dataset = Dataset.from_generator(my_gen)
+```
+
+This approach supports loading data larger than available memory, because the generator yields examples one at a time and the dataset is written to disk progressively.
+
 ### Pandas DataFrame
 
 Load Pandas DataFrames with [`~Dataset.from_pandas`]:
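
The larger-than-memory use case mentioned in the added section can be sketched with a generator that streams records from a file instead of yielding hard-coded dictionaries. This is only an illustration, not part of the commit: `read_jsonl`, `build_dataset`, and the sample file are hypothetical names, and the sketch assumes the `gen_kwargs` parameter of `Dataset.from_generator` is used to pass the file path to the generator.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Yield one example dict per non-empty line of a JSON Lines file.

    Only one line is decoded at a time, so the file is never fully
    loaded into memory.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def build_dataset(path):
    """Hypothetical helper: build a Dataset from the streaming generator.

    Assumes the `datasets` package is installed; it is imported lazily so
    that the generator above can be used and tested without it.
    """
    from datasets import Dataset

    # gen_kwargs is forwarded to read_jsonl as keyword arguments.
    return Dataset.from_generator(read_jsonl, gen_kwargs={"path": str(path)})


# Small sample file, written here only to keep the sketch self-contained.
sample_path = Path(tempfile.mkdtemp()) / "sample.jsonl"
sample_path.write_text(
    "\n".join(json.dumps({"a": i}) for i in range(1, 4)),
    encoding="utf-8",
)

# Consuming the generator directly shows what from_generator would receive.
rows = list(read_jsonl(sample_path))
```

With `datasets` installed, calling `build_dataset(sample_path)` should produce a dataset with a single column `a`. Passing the generator function itself (not the result of calling it) matters, since the library needs to invoke it on its own.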

docs/source/package_reference/main_classes.mdx

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ The base class [`Dataset`] implements a Dataset backed by an Apache Arrow table.
 - from_buffer
 - from_pandas
 - from_dict
+- from_generator
 - data
 - cache_files
 - num_columns
