More elegant way to implement todense function. #80

Merged
kayibal merged 1 commit into getitem-fixes from getitem-fixes-ah
Jun 11, 2019
Conversation

@kayibal kayibal commented Jun 8, 2019

This uses the `dask.delayed` object API to achieve the same result,
which was previously a hack combining `map_partitions` with initializing
`dd.DataFrame` directly.

@kayibal kayibal requested a review from michcio1234 June 8, 2019 22:22
@kayibal kayibal merged commit 4c63a0c into getitem-fixes Jun 11, 2019
@kayibal kayibal deleted the getitem-fixes-ah branch June 11, 2019 12:08
kayibal pushed a commit that referenced this pull request Jun 14, 2019
* Check for type of meta in `apply_and_enforce`

It was possible that although computed type is SparseFrame, other type
is returned (if meta was not a SparseFrame).

Imports are not changed, just reorganized.

* Simple `__getitem__` for dask SparseFrames.

Support for dsp[index] syntax. Doesn't aim to work the same as in
pandas, just maps __getitem__ onto partitions.

* Add getitem test with empty frame

* todense() returns Series when there is one empty column

Previously it returned DataFrame, even though in case of 1-column
non-empty SparseFrame, it returned Series.

Imports are only re-organized.

* Add .todense() method to Dask SparseFrame

It works by mapping SparseFrame.todense onto partitions.
It was necessary to allow `map_partitions` to return other types
than SparseFrame, so kwarg `cls` was added. It implies that one cannot
use `cls` kwarg as an argument to mapped function (because it will be
consumed by `map_partitions` and not passed to a mapped function).

* Support reindex in case of empty frame

* More elegant way to implement todense function. (#80)

This uses the `dask.delayed` object API to achieve the same result,
which was previously a hack combining `map_partitions` with initializing
`dd.DataFrame` directly.
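
The `cls` caveat noted for `map_partitions` in the commit list above can be shown with a small pure-Python sketch. The `map_partitions` below is a simplified, hypothetical stand-in, not the library's implementation. It only demonstrates how a keyword consumed by the wrapper never reaches the mapped function, and one possible workaround using `functools.partial`.

```python
from functools import partial


def map_partitions(func, partitions, cls=list, **kwargs):
    # Hypothetical stand-in: `cls` selects the container type of the
    # result and is consumed here, so it is NOT forwarded to `func`.
    return cls(func(p, **kwargs) for p in partitions)


def describe(part, cls="default"):
    # A mapped function that happens to have its own `cls` argument.
    return f"{part}:{cls}"


# `cls=tuple` is swallowed by map_partitions; describe() sees its default.
swallowed = map_partitions(describe, [1, 2], cls=tuple)

# Workaround: bind the argument beforehand so the wrapper never sees it.
bound = map_partitions(partial(describe, cls="custom"), [1, 2])
```

Here `swallowed` is a tuple whose entries still carry `describe`'s default `cls`, while `bound` carries the value fixed through `partial`.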