[SPARK-13233][SQL][WIP] Python Dataset #11117
Closed
Commits
17 commits
6107495 python dataset (cloud-fan)
15fd836 code cleanup (cloud-fan)
a0a0dd6 scala side cleanup (cloud-fan)
6c26daa fix style (cloud-fan)
d96f103 produce unsafe rows (cloud-fan)
4dfe604 infer schema (cloud-fan)
e0ca98f aggregate (cloud-fan)
da77adc improve aggregate (cloud-fan)
a772492 fix style (cloud-fan)
590308a add pivot (cloud-fan)
c883fa6 some more tests (cloud-fan)
df53348 minor fix (cloud-fan)
97dcac2 add import (cloud-fan)
349b119 fix python 3 (cloud-fan)
8c32d31 small fix (cloud-fan)
aec6fc4 update (cloud-fan)
1095d7f small cleanup (cloud-fan)
To avoid copying the bytes, I create safe rows here. However, according to #10511, operators should always produce unsafe rows. Actually, the Python UDF operator (`BatchPythonEvaluation`) also produces safe rows, which may cause problems as well. Should we bring back the `requireUnsafeRow` stuff? In some cases, like this one, converting to unsafe rows is expensive and may not bring much benefit. cc @davies
BatchPythonEvaluation will produce UnsafeRow.
Oh sorry, I missed the unsafe projection at the very end. Then we can probably add an unsafe projection here too.
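For readers unfamiliar with the trade-off discussed in this thread, here is a rough, Spark-free sketch of what "add an unsafe projection at the end of the operator" means. Everything in it is a hypothetical stand-in: a "safe" row is modeled as a plain tuple of ints and an "unsafe" row as one packed bytes buffer. The names `unsafe_projection`, `operator_output` and `read_field` are invented for illustration, and the byte layout is not Spark's actual UnsafeRow format. The sketch only shows that the final projection copies every field once, which is the cost the first comment is worried about.

```python
import struct
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical stand-ins, not Spark classes: a "safe" row is a tuple of ints,
# an "unsafe" row is the same values packed into a single bytes buffer.
SafeRow = Tuple[int, ...]


def unsafe_projection(num_fields: int) -> Callable[[SafeRow], bytes]:
    """Return a function that packs a safe row into fixed-width binary form."""
    packer = struct.Struct("<%dq" % num_fields)  # little-endian 64-bit ints

    def project(row: SafeRow) -> bytes:
        # Every field is copied into a freshly allocated buffer here.
        return packer.pack(*row)

    return project


def operator_output(safe_rows: Iterable[SafeRow], num_fields: int) -> Iterator[bytes]:
    """Model an operator that builds safe rows but emits only binary rows."""
    project = unsafe_projection(num_fields)
    for row in safe_rows:
        yield project(row)  # the projection is the operator's last step


def read_field(unsafe_row: bytes, ordinal: int) -> int:
    """Read one field directly from the packed buffer by its byte offset."""
    return struct.unpack_from("<q", unsafe_row, ordinal * 8)[0]
```

With this shape, downstream consumers can assume a single row format, at the price of one copy per row inside the producing operator.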