-
Notifications
You must be signed in to change notification settings - Fork 467
Add a proposal process #513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
e0cc257
faf6b23
ae98d5d
3d0eb6f
edb8e82
51a4a3f
9cdce64
ef310c3
39c6fd9
7e3cb6a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| --- | ||
| Author: Julien Le Dem | ||
| Created: 2025-Aug-7 | ||
| Name: add BASE64 compression | ||
| Issue: https://github.com/apache/parquet-format/issues/NNN | ||
| Status: ARCHIVED | ||
| Reason: Did not compress | ||
| --- | ||
|
|
||
| # Proposal | ||
|
|
||
| ## Description | ||
| Add Base64 to compression algorithms. | ||
| This is not backwards compatible as a new compression alg. | ||
|
|
||
| ## Spec | ||
|
|
||
| See [BASE64 spec]. | ||
|
|
||
| ## Evaluation | ||
|
|
||
| After trying out in the java implementation, file size doubled on average. | ||
| See prototype [here](github.com/julienledem/mypoc) | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,50 @@ | ||||||
| # Proposals | ||||||
|
|
||||||
| ## Requirements | ||||||
|
|
||||||
| See the [requirements document](https://docs.google.com/document/d/1qGDnOyoNyPvcN4FCRhbZGAvp0SfewlWo-WVsai5IKUo/edit?tab=t.0#heading=h.v4emiipkghrx) (Note: this doc would become a markdown page in the repo) | ||||||
|
|
||||||
| ## Proposal lifecycle | ||||||
|
|
||||||
| Discuss -> Draft -> POC -> Approved -> Implementation -> Release | ||||||
|
||||||
| Discuss -> Draft -> POC -> Approved -> Implementation -> Release | |
| Discuss -> Draft / POC -> Implementations -> Approval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I also updated the wording and clarified transition to remove any hint of gatekeeping (except for final approval for release)
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you can link to some examples, such as
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What purpose does the separate mark-down doc intended to serve? Per @alamb 's point above we haven't had ones for recent changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this seems fine to me as it probably helps developers do better archeology.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my idea.
We can also just make this a table that points to github issues given a pre-defined template?
That might be easier.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry a little bit about the friction of contributors keeping the markdown up-to-date? Maybe keeping this in the google doc? But we can see how it works in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the wording to remove this as a requirement. I have left it as an option if people want to have a perennial place to save documents.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend combining the Draft / POC stage -- it seems like any change also has a POC that shows what the proposal looks like in code so it makes more sense to me if they are combined
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend combining "Approved for Implementation" and "Implementation" -- and highlighting that at this stage you will have convinced someone to implement the proposal in at least one other open source implementation and that showing at least those two implementations are interoperable
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| At this stage we need to meet the contribution guidelines to confsider the implementation finished (ex: two independent implementations with cross compatibility tests, spec updated, ...) | |
| At this stage we need to meet the contribution guidelines to consider the implementation finished (ex: two independent implementations with cross compatibility tests, spec updated, ...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a nice way to list / find the active proposal
julienledem marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # Proposal | ||
|
|
||
| --- | ||
| Author: ~your name~ | ||
| Created: ~date~ | ||
| Name: *short sentence describing the proposal* | ||
| Issue: https://github.com/apache/parquet-format/issues/NNN | ||
| Status: DRAFT|POC|ACCEPTED|COMPLETED | ||
| --- | ||
|
|
||
| ## Description | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I recommend also adding a "## Rationale" section: Describe why this is a feature that will improve the parquet format and what alternatives currently exist for the usecase (e.g. must use a different format, or "must build additional infrastructure to avoid re-parsing footer on each query", or "must use a general purpose compression algorithm to achieve the same space, thus slowing down query performance)
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added! |
||
| *Short description of the proposal. Is it a new encoding? Is it backwards compatible (old readers will just ignore it)? Is it additional metadata?* | ||
|
|
||
| ## Spec | ||
|
|
||
| At the proposal stage you don't need a fully fleshed out spec yet. | ||
| Please add any link to relevant documentation, papers, etc. | ||
| at the implementation stage, the details will need to be all clarified. | ||
|
|
||
| ## Evaluation | ||
| What datasets is it tested on and what is a success criteria | ||
| Please add any link to the relevant codebase. | ||
Uh oh!
There was an error while loading. Please reload this page.