More elegant way to implement todense function. #80
Merged
kayibal merged 1 commit into getitem-fixes on Jun 11, 2019
This leverages the `dask.delayed` object API to achieve the same result, which was previously a hack combining `map_partitions` with initializing `dd.DataFrame` directly.
kayibal pushed a commit that referenced this pull request on Jun 14, 2019
* Check for type of meta in `apply_and_enforce`
  It was possible that although the computed type is SparseFrame, another type is returned (if meta was not a SparseFrame). Imports are not changed, just reorganized.
* Simple __getindex__ for dask SparseFrames.
  Support for dsp[index] syntax. Doesn't aim to work the same as in pandas, just maps __getitem__ onto partitions.
* Add getitem test with empty frame
* todense() returns Series when there is one empty column
  Previously it returned a DataFrame, even though for a 1-column non-empty SparseFrame it returned a Series. Imports are only reorganized.
* Add .todense() method to Dask SparseFrame
  It works by mapping SparseFrame.todense onto partitions. It was necessary to allow `map_partitions` to return types other than SparseFrame, so the kwarg `cls` was added. This implies that one cannot use `cls` as a keyword argument to the mapped function (it will be consumed by `map_partitions` and not passed on to the mapped function).
* Support reindex in case of empty frame
* More elegant way to implement todense function. (#80)
  This leverages the dask.delayed object API to achieve the same result, which was previously a hack combining map_partitions with initializing dd.DataFrame directly.
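
The `cls` caveat mentioned in the commit message above can be illustrated with a small pure-Python sketch. The `map_partitions` below is a hypothetical, simplified wrapper written for this example, not the library's actual signature; it only shows why a keyword the wrapper claims for itself never reaches the mapped function, and how `functools.partial` can bind it beforehand.

```python
from functools import partial


def map_partitions(func, partitions, cls=list, **kwargs):
    """Hypothetical wrapper: `cls` selects the container type of the result."""
    # `cls` is consumed here; only the remaining kwargs reach `func`.
    return cls(func(part, **kwargs) for part in partitions)


def describe(partition, cls=None):
    # The mapped function also happens to accept a keyword named `cls`.
    return (sum(partition), cls)


parts = [[1, 2], [3]]

# `cls=tuple` is taken by the wrapper, so `describe` sees its default cls=None.
swallowed = map_partitions(describe, parts, cls=tuple)

# Binding the keyword with functools.partial keeps it away from the wrapper.
bound = map_partitions(partial(describe, cls="label"), parts)
```

Here `swallowed` is a tuple whose items carry `None`, while `bound` is a list whose items carry `"label"`.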