Collaborator

@WeichenXu123 WeichenXu123 commented Mar 23, 2020

  • Preserve the Spark DataFrame schema order when creating a petastorm dataset/dataloader
  • Add selected fields to TransformSpec
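
The two changes above can be sketched in plain Python. This is only an illustration of the idea, not petastorm's implementation: the function names `order_fields_like_dataframe` and `apply_selected_fields` are hypothetical, and a row is modeled as an ordinary dict rather than a petastorm or Spark object.

```python
from typing import Dict, Iterable, List, Optional


def order_fields_like_dataframe(field_names: Iterable[str],
                                dataframe_columns: List[str]) -> List[str]:
    """Return field_names arranged in the column order of the source DataFrame.

    Hypothetical helper: dataframe_columns stands in for the column list
    of a Spark DataFrame (for example, what df.columns would return).
    """
    names = list(field_names)
    position = {name: index for index, name in enumerate(dataframe_columns)}
    unknown = [name for name in names if name not in position]
    if unknown:
        raise ValueError("fields not present in the DataFrame: %s" % sorted(unknown))
    return sorted(names, key=position.__getitem__)


def apply_selected_fields(row: Dict[str, object],
                          selected_fields: Optional[List[str]]) -> Dict[str, object]:
    """Keep only the selected fields of a row, in the order they were selected.

    Hypothetical helper: passing None keeps every field unchanged.
    """
    if selected_fields is None:
        return dict(row)
    return {name: row[name] for name in selected_fields}


# An unordered set of field names is put back into DataFrame column order.
columns = ["id", "features", "label"]
ordered = order_fields_like_dataframe({"label", "id", "features"}, columns)

# Selecting a subset of fields also fixes the order of the output.
row = {"label": 1, "id": 7, "features": [0.1, 0.2]}
selected = apply_selected_fields(row, ["features", "label"])
```

Sorting by the DataFrame's column position, rather than alphabetically, is what keeps downstream consumers aligned with the schema the user defined; an explicit selection list serves the same purpose for transformed rows.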

@WeichenXu123 WeichenXu123 changed the title from "[WIP][ML-10118] Keep petastorm dataset/dataloader schema fields order the same with spark dataframe" to "[ML-10118] Preserve spark dataframe schema order when create petastorm dataset/dataloader" on Mar 26, 2020
codecov bot commented Mar 26, 2020

Codecov Report

Merging #513 into master will increase coverage by 0.10%.
The diff coverage is 77.77%.

@@            Coverage Diff             @@
##           master     #513      +/-   ##
==========================================
+ Coverage   86.02%   86.13%   +0.10%     
==========================================
  Files          81       81              
  Lines        4402     4435      +33     
  Branches      704      713       +9     
==========================================
+ Hits         3787     3820      +33     
  Misses        504      504              
  Partials      111      111              
Impacted Files                                Coverage Δ
petastorm/reader.py                           90.99% <ø> (ø)
petastorm/transform.py                        85.18% <60.00%> (-14.82%) ⬇️
petastorm/unischema.py                        94.71% <100.00%> (+0.13%) ⬆️
petastorm/spark/spark_dataset_converter.py    92.73% <0.00%> (+2.11%) ⬆️
petastorm/pytorch.py                          92.68% <0.00%> (+2.43%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0b70510...08329f1.

Collaborator

@liangz1 liangz1 left a comment

Nice work! I left some questions for discussion.

@selitvin selitvin self-requested a review March 26, 2020 19:57
Collaborator

@selitvin selitvin left a comment

LGTM

Collaborator

@liangz1 liangz1 left a comment

LGTM!

@selitvin selitvin merged commit 4a99b9b into uber:master Mar 30, 2020
tkakantousis pushed a commit to logicalclocks/petastorm that referenced this pull request Sep 16, 2020
…m dataset/dataloader (uber#513)

Preserve spark dataframe schema order when create petastorm dataset/dataloader
Add selected fields for TransformSpec
kashishmittal55 pushed a commit to kashishmittal55/petastorm that referenced this pull request Aug 15, 2025
…m dataset/dataloader (uber#513)

Preserve spark dataframe schema order when create petastorm dataset/dataloader
Add selected fields for TransformSpec

3 participants