Presto: Separate Hive configs for Coordinator & Worker #121
base: main
Conversation
- ./config/generated/java/etc_common:/opt/presto-server/etc
- ./config/generated/java/etc_coordinator/config_java.properties:/opt/presto-server/etc/config.properties
- ./config/generated/java/etc_coordinator/node.properties:/opt/presto-server/etc/node.properties
- ./config/generated/java/etc_coordinator/catalog/hive.properties:/opt/presto-server/etc/catalog/hive.properties
Should the updates be in docker-compose.common.yml?
They can't be, because the configs are now per-variant.
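To illustrate the per-variant layout, a sketch of what the Coordinator and Worker mounts might look like in their variant-specific compose files (the service names and the `etc_worker` path are assumptions; only the Coordinator mounts above come from this diff):

```yaml
# Sketch: both services share etc_common, then overlay their own
# variant-specific files, so hive.properties can differ per role.
services:
  coordinator:
    volumes:
      - ./config/generated/java/etc_common:/opt/presto-server/etc
      - ./config/generated/java/etc_coordinator/catalog/hive.properties:/opt/presto-server/etc/catalog/hive.properties
  worker:
    volumes:
      - ./config/generated/java/etc_common:/opt/presto-server/etc
      - ./config/generated/java/etc_worker/catalog/hive.properties:/opt/presto-server/etc/catalog/hive.properties
```

Because the mounts are per-service, a shared docker-compose.common.yml can no longer hold them.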
parquet.reader.chunk-read-limit=0
parquet.reader.pass-read-limit=0
These configurations do not appear in the documentation. Can you please add comments that describe what these parameters do and why they are needed?
They aren't strictly needed, other than to prove that the Coordinator and Worker configs can differ; however, @devavret tweaks them on his local laptop, so he asked for them to be exposed.
The values are not documented in Velox itself, but they appear to be passed to the cuDF chunked Parquet reader, whose documentation is here:
The values in that API are in bytes, but it appears that the config parser is smart enough to convert (say) 16M into (16 * 1024 * 1024).
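For example, in the properties template (a sketch; the size-suffix parsing is inferred from observed behavior, not documented):

```properties
# Parsed as 16 * 1024 * 1024 bytes by the config parser.
parquet.reader.chunk-read-limit=16M
# 0 (the default) leaves the pass limit unconstrained.
parquet.reader.pass-read-limit=0
```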
I have added comments to the template file based on the parameter descriptions in that documentation.
There are additional Hive config options which can be used in the Worker, but if they also appear in the Coordinator config, startup throws an error.
For now, we just add some Parquet read parameters at their default values; this is purely a convenience for users who wish to tune them for a specific machine.
IMPORTANT!
You will need to use `--overwrite-config` when first running with this PR; otherwise it will reuse the existing tree, and the `hive.properties` file left in the old location will cause a startup error in the Docker mappings. Be sure to copy any edited files aside before doing this, of course, or they will be lost.
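A possible sequence for that first run (a sketch: the `config/generated` path comes from the mounts above, but the launch script name is a placeholder; only the `--overwrite-config` flag comes from this PR):

```shell
# Back up any hand-edited generated configs before regenerating the tree;
# --overwrite-config discards the existing config/generated contents.
BACKUP="config-backup-$(date +%Y%m%d)"
mkdir -p "$BACKUP"
cp -r config/generated "$BACKUP"/ 2>/dev/null || true  # no-op if no tree yet

# Then regenerate and start (launch script name is hypothetical):
# ./start.sh --overwrite-config
```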