PARQUET-2249: Introduce IEEE 754 total order #221
Open
JFinis wants to merge 1 commit into apache:master from JFinis:totalorder
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -1030,6 +1030,9 @@ struct RowGroup { | |||||||||||
| /** Empty struct to signal the order defined by the physical or logical type */ | ||||||||||||
| struct TypeDefinedOrder {} | ||||||||||||
|
|
||||||||||||
| /** Empty struct to signal IEEE 754 total order for floating point types */ | ||||||||||||
| struct IEEE754TotalOrder {} | ||||||||||||
|
|
||||||||||||
| /** | ||||||||||||
| * Union to specify the order used for the min_value and max_value fields for a | ||||||||||||
| * column. This union takes the role of an enhanced enum that allows rich | ||||||||||||
|
|
@@ -1038,6 +1041,7 @@ struct TypeDefinedOrder {} | |||||||||||
| * Possible values are: | ||||||||||||
| * * TypeDefinedOrder - the column uses the order defined by its logical or | ||||||||||||
| * physical type (if there is no logical type). | ||||||||||||
| * * IEEE754TotalOrder - the floating point column uses IEEE 754 total order. | ||||||||||||
| * | ||||||||||||
| * If the reader does not support the value of this union, min and max stats | ||||||||||||
| * for this column should be ignored. | ||||||||||||
|
|
@@ -1082,8 +1086,12 @@ union ColumnOrder { | |||||||||||
| * BYTE_ARRAY - unsigned byte-wise comparison | ||||||||||||
| * FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison | ||||||||||||
| * | ||||||||||||
| - * (*) Because the sorting order is not specified properly for floating | ||||||||||||
| - * point values (relations vs. total ordering) the following | ||||||||||||
| + * (*) Because the precise sorting order is ambiguous for floating | ||||||||||||
| + * point types due to underspecified handling of NaN and -0/+0, | ||||||||||||
| + * it is recommended that writers use IEEE_754_TOTAL_ORDER | ||||||||||||
| + * for these types. | ||||||||||||
| + * | ||||||||||||
| + * If this ordering is used for floating point types, then the following | ||||||||||||
| * compatibility rules should be applied when reading statistics: | ||||||||||||
| * - If the min is a NaN, it should be ignored. | ||||||||||||
| * - If the max is a NaN, it should be ignored. | ||||||||||||
|
|
@@ -1099,6 +1107,53 @@ union ColumnOrder { | |||||||||||
| * `-0.0` should be written into the min statistics field. | ||||||||||||
| */ | ||||||||||||
| 1: TypeDefinedOrder TYPE_ORDER; | ||||||||||||
|
|
||||||||||||
| /* | ||||||||||||
| * The floating point type is ordered according to the totalOrder predicate, | ||||||||||||
| * as defined in section 5.10 of IEEE-754 (2008 revision). Only columns of | ||||||||||||
| * physical type FLOAT or DOUBLE, or logical type FLOAT16 may use this ordering. | ||||||||||||
| * | ||||||||||||
| * Intuitively, this orders floats mathematically, but defines -0 to be less | ||||||||||||
| * than +0, -NaN to be less than anything else, and +NaN to be greater than | ||||||||||||
| * anything else. It also defines an order between different bit representations | ||||||||||||
| * of the same value. | ||||||||||||
| * | ||||||||||||
| * The formal definition is as follows: | ||||||||||||
| * a) If x<y, totalOrder(x, y) is true. | ||||||||||||
| * b) If x>y, totalOrder(x, y) is false. | ||||||||||||
| * c) If x=y: | ||||||||||||
| * 1) totalOrder(−0, +0) is true. | ||||||||||||
| * 2) totalOrder(+0, −0) is false. | ||||||||||||
| * 3) If x and y represent the same floating-point datum: | ||||||||||||
| * i) If x and y have negative sign, totalOrder(x, y) is true if and | ||||||||||||
| * only if the exponent of x ≥ the exponent of y | ||||||||||||
| * ii) otherwise totalOrder(x, y) is true if and only if the exponent | ||||||||||||
| * of x ≤ the exponent of y. | ||||||||||||
| * d) If x and y are unordered numerically because x or y is NaN: | ||||||||||||
| * 1) totalOrder(−NaN, y) is true where −NaN represents a NaN with | ||||||||||||
| * negative sign bit and y is a non-NaN floating-point number. | ||||||||||||
| * 2) totalOrder(x, +NaN) is true where +NaN represents a NaN with | ||||||||||||
| * positive sign bit and x is a non-NaN floating-point number. | ||||||||||||
| * 3) If x and y are both NaNs, then totalOrder reflects a total ordering | ||||||||||||
| * based on: | ||||||||||||
| * i) negative sign orders below positive sign | ||||||||||||
| * ii) signaling orders below quiet for +NaN, reverse for −NaN | ||||||||||||
| * iii) lesser payload, when regarded as an integer, orders below | ||||||||||||
| * greater payload for +NaN, reverse for −NaN. | ||||||||||||
| * | ||||||||||||
| * Note that this ordering can be implemented efficiently in software by bit-wise | ||||||||||||
| * operations on the integer representation of the floating point values. | ||||||||||||
| * E.g., this is a possible implementation for DOUBLE in Rust: | ||||||||||||
| * | ||||||||||||
| * pub fn totalOrder(x: f64, y: f64) -> bool { | ||||||||||||
wgtmac marked this conversation as resolved.
|
||||||||||||
| * let mut x_int = x.to_bits() as i64; | ||||||||||||
| * let mut y_int = y.to_bits() as i64; | ||||||||||||
| * x_int ^= (((x_int >> 63) as u64) >> 1) as i64; | ||||||||||||
| * y_int ^= (((y_int >> 63) as u64) >> 1) as i64; | ||||||||||||
|
Comment on lines +1151 to +1152
Member
Let's add a comment explaining those lines?
Suggested change
Member
That's assuming the comment is right. Please double-check :) |
||||||||||||
| * return x_int <= y_int; | ||||||||||||
| * } | ||||||||||||
| */ | ||||||||||||
| 2: IEEE754TotalOrder IEEE_754_TOTAL_ORDER; | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| struct PageLocation { | ||||||||||||
|
|
||||||||||||
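A reviewer asked for a comment explaining the two XOR lines in the Rust snippet. The following is a minimal sketch of what an annotated version could look like, not the text that was committed. It assumes the same logic as the snippet in the diff, with the function renamed to `total_order` to follow Rust naming conventions. `min_max_total_order` is a hypothetical helper, not part of the PR, added only to illustrate how a writer might derive `min_value` and `max_value` under this ordering.

```rust
/// Sketch of the totalOrder predicate for DOUBLE, mirroring the snippet in the diff.
/// Returns true when `x` orders at or before `y`.
pub fn total_order(x: f64, y: f64) -> bool {
    // Reinterpret the raw IEEE 754 bits as signed 64-bit integers.
    let mut x_int = x.to_bits() as i64;
    let mut y_int = y.to_bits() as i64;
    // `v >> 63` on an i64 is an arithmetic shift: all ones if the sign bit is
    // set, zero otherwise. Casting to u64 and shifting right by one clears the
    // top bit, so the mask is 0x7FFF_FFFF_FFFF_FFFF for negative inputs and 0
    // for positive ones. XOR-ing with it flips the exponent and mantissa bits
    // of negative values only, so a larger magnitude becomes a smaller integer.
    x_int ^= (((x_int >> 63) as u64) >> 1) as i64;
    y_int ^= (((y_int >> 63) as u64) >> 1) as i64;
    // After the fix-up, a plain signed integer comparison matches totalOrder.
    x_int <= y_int
}

/// Hypothetical helper (not part of the PR): min and max of a chunk of values
/// under total order. Returns None for an empty chunk.
pub fn min_max_total_order(values: &[f64]) -> Option<(f64, f64)> {
    let (&first, rest) = values.split_first()?;
    let (mut min, mut max) = (first, first);
    for &v in rest {
        if !total_order(min, v) {
            min = v; // v orders strictly below the current min
        }
        if !total_order(v, max) {
            max = v; // v orders strictly above the current max
        }
    }
    Some((min, max))
}
```

Under this sketch, a chunk containing both `0.0` and `-0.0` yields `-0.0` as the min and `+0.0` as the max, and a NaN with a positive sign bit becomes the max of any chunk that contains it. Rust's standard library also provides `f64::total_cmp`, which can serve as a cross-check for the bit manipulation.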