Skip to content

Conversation

@misiugodfrey
Copy link
Contributor

@misiugodfrey misiugodfrey commented Sep 24, 2025

A lot of our scripts use almost identical format for parsing. Notably a case statement with the form:

--<option>)
        if [[ -n "${2:-}" ]]; then
          <option>="$2"
          shift 2
        else
          echo "ERROR: --<option> requires true or false argument" >&2
          exit 1
        fi
        ;;

I decided to take a quick shot at outlining and refactoring this code to make adding argument parsing more generic.

The process now is that each file will declare it's own arrays of options and flags (options being flags that take additional parameters and flags needing none):

declare -A OPTION_MAP=( ["-b"]="--build-target" ["-j"]="--num-threads" )

The common parse_args() functions (or custom ones if custom handling is required) will use these maps to parse all options and assign the values to environment variables with the same name.

I'm curious what folks think about this approach.

;;
-h|--help) print_help; exit 0;;
$OPTIONS) parse_option $1 $2; shift 2;;
-c|--convert-decimals-to-floats)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left this one in since it sets it's variable to a custom value, rather than "TRUE"

}

parse_args "$@"
custom_parse_args "$@"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved argument parsing before waiting on worker node so the user can use functions like "--help" without having to wait for a worker registration.


set -e

source ./start_presto_helper_parse_args.sh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this up so the user can use "--help" before running into the ENV variable validation.


parse_args() {
source ../../scripts/helper_functions.sh
declare -A OPTION_MAP=( ["-b"]="--build-target" ["-j"]="--num-threads" )
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed --build to --build-target to match up with the variable being set.

exit 1
;;
-h|--help) print_help; exit 0 ;;
-n|--no-cache) SKIP_CACHE_ARG="--no-cache"; shift ;;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left custom in.

custom_parse_args "$@"

if [[ -z $SCRIPT_PATH ]]; then
if [[ -z $PYTHON_SCRIPT_PATH ]]; then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated variable name to match option.

@mattgara
Copy link
Contributor

@misiugodfrey This looks great, thanks for doing this refactor.

I think there may be an ideal time to do this where there are not many in-flight PRs with scripts that would directly conflict with this change. Right now, this will probably conflict with a lot of in-flight PRs.

@misiugodfrey
Copy link
Contributor Author

I think there may be an ideal time to do this where there are not many in-flight PRs with scripts that would directly conflict with this change. Right now, this will probably conflict with a lot of in-flight PRs.

I'm fine with holding off on this until more things have merged, as it's not a semantic change. That being said, now's a good time for conceptual feedback so I can apply any requested changes before I inevitably need to resolve conflicts with all the other landing PRs.

@simoneves
Copy link
Contributor

@misiugodfrey updated thoughts on this one? Are we any closer to a point where it could go in, or have we proliferated individual script implementations even more. I think we might have... :(

@simoneves simoneves added the improvement Improves an existing functionality label Oct 26, 2025
@simoneves
Copy link
Contributor

Another good utility function to add... something to validate that a script is being run from the correct location (most likely its own directory)... similar to https://github.com/rapidsai/velox-testing/blob/main/presto/scripts/generate_presto_config.sh#L32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants