Skip to content

Conversation

@julienledem
Copy link
Member

Rationale for this change

We need a better way to propose changes

What changes are included in this PR?

adding a process to the repo.

Do these changes have PoC implementations?

yes

@alamb
Copy link
Contributor

alamb commented Aug 11, 2025

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a great start to me


## Proposal lifecycle

Discuss -> Draft -> POC -> Approved -> Implementation -> Release
Copy link
Contributor

@alamb alamb Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest reducing the number of stages here to make the process seem less daunting and more accurately describe the community aspect

I also think this is not quite how I have seen the current proposals work.

Specifically, I suggest NOT saying the PMC has a gatekeeping role for deciding if the community can try to implement proposals -- as this doesn't reflect my experience. Instead I think the the PMC should keep the formal gatekeeping activity restricted to the final approved modification to the spec. This will reduce the administrative burden on both proposals and the PMC.

Suggested change
Discuss -> Draft -> POC -> Approved -> Implementation -> Release
Discuss -> Draft / POC -> Implementations -> Approval

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done
I also updated the wording and clarified transition to remove any hint of gatekeeping (except for final approval for release)

You can attach a google doc to collaborate on the general idea with the community.

### Draft
Once the discussion has stabilized and you are ready to start a POC, open a PR to add a new Markdown file in the proposals folder and give more visibility to the work in progress.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you can link to some examples, such as

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What purpose does the separate mark-down doc intended to serve? Per @alamb 's point above we haven't had ones for recent changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this seems fine to me as it probably helps developers do better archeology.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my idea.
We can also just make this a table that points to github issues given a pre-defined template?
That might be easier.
What do you think?

Comment on lines 19 to 22
### POC
The proposal document can evolve along the course of the POC. In particular to add more links to findings and performance evaluations. Collaboration is encouraged. More validation on the POC increases the chances of success.

Make sure you consider the [#Requirements] to ensure the success of the POC.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend combining the Draft / POC stage -- it seems like any change also has a POC that shows what the proposal looks like in code so it makes more sense to me if they are combined

Comment on lines 24 to 28
### Approved for Implementation
Once the POC has concluded, we should have a clear idea of whether we want to pursue the implementation accross the ecosystem. A PMC vote will formalize that stage

### Implementation
At this stage we need to meet the contribution guidelines to confsider the implementation finished (ex: two independent implementations with cross compatibility tests, spec updated, ...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend combining "Approved for Implementation" and "Implementation" -- and highlighting that at this stage you will have convinced someone to implement the proposal in at least one other open source implementation and that showing at least those two implementations are interoperable

### Release
Once the implementation phase is finished, we can include the contribution in the next release.

## Active Proposals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a nice way to list / find the active proposal

Status: DRAFT|POC|ACCEPTED|COMPLETED
---

## Description
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend also adding a "## Rationale" section:

Describe why this is a feature that will improve the parquet format and what alternatives currently exist for the usecase (e.g. must use a different format, or "must build additional infrastructure to avoid re-parsing footer on each query", or "must use a general purpose compression algorithm to achieve the same space, thus slowing down query performance)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added!

Once the POC has concluded, we should have a clear idea of whether we want to pursue the implementation accross the ecosystem. A PMC vote will formalize that stage

### Implementation
At this stage we need to meet the contribution guidelines to confsider the implementation finished (ex: two independent implementations with cross compatibility tests, spec updated, ...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
At this stage we need to meet the contribution guidelines to confsider the implementation finished (ex: two independent implementations with cross compatibility tests, spec updated, ...)
At this stage we need to meet the contribution guidelines to consider the implementation finished (ex: two independent implementations with cross compatibility tests, spec updated, ...)

integrated review feedback
integrated review feedback
add review feedback
@julienledem
Copy link
Member Author

Thank you @alamb and @emkornfield for the feedback. I have integrated your excellent points in the proposal.
One question on whether the additional markdown doc is worth it or simply having a template for a github issue would be sufficient. And then listing the issues in the table bellow.
#513 (comment)

@julienledem julienledem requested a review from rdblue August 29, 2025 22:55
@julienledem julienledem self-assigned this Aug 29, 2025
@julienledem julienledem marked this pull request as ready for review August 29, 2025 22:55
# Proposals

This proposal process is intended for significantly impactfull additions to the Parquet spec. The goal is to facilitate those projects and help them being contributed to Parquet.
For example, changes that are not backward compatible like adding a new encoding or a new compression algorithm (older implementations can not read new files).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


### Draft/POC
Once you feel the discussion has stabilized and you are ready to start a POC, open a PR to add a new Markdown file in the proposals folder and give more visibility to the work in progress.
The proposal document can evolve along the course of the POC. In particular to add more links to findings and performance evaluations. Collaboration is encouraged. More validation on the POC increases the chances of success.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry a little bit about the friction of contributors keeping the markdown up-to-date? Maybe keeping this in the google doc? But we can see how it works in practice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the wording to remove this as a requirement. I have left it as an option if people want to have a perennial place to save documents.

Copy link
Contributor

@emkornfield emkornfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal looks reasonable, I think keeping things in a good doc until we are ready to finalize one way or another might be less friction but we can assess on the next project.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- it is nicely described and reflects what I understand to be the reality of how this works

julienledem and others added 2 commits September 25, 2025 11:32
Update proposals/README.md

Co-authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Julien Le Dem <[email protected]>
@julienledem julienledem merged commit 9fd57b5 into master Sep 25, 2025
8 checks passed
jiayuasu pushed a commit to jiayuasu/parquet-format that referenced this pull request Oct 6, 2025
* start with a simple process and an example

---------

Signed-off-by: Julien Le Dem <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Micah Kornfield
Co-authored-by: Marc Cenac <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants