Skip to content

ci: Update result tracking with new golden format#2706

Open
rjodinchr wants to merge 1 commit into
KhronosGroup:mainfrom
rjodinchr:ci
Open

ci: Update result tracking with new golden format#2706
rjodinchr wants to merge 1 commit into
KhronosGroup:mainfrom
rjodinchr:ci

Conversation

@rjodinchr

@rjodinchr rjodinchr commented May 27, 2026

Copy link
Copy Markdown
Collaborator

The primary goal of this commit is to improve CI tracking by
introducing a new golden format that can differentiate test results
based on command-line arguments. To cleanly extract and pass these
arguments into the JSON result outputs, the command-line parsing
infrastructure across the CTS required a significant refactoring.

Key changes include:

  • Enhanced CI Tracking: Updates ci/compare_results.py,
    ci/pocl/golden.json, and saveResultsToJson to include and evaluate
    an args key. The golden JSON now uses a nested format mapping
    specific argument strings (e.g., --wimpy -1) to their expected
    results, allowing the CI to validate the same binary run under
    different parameters.
  • Centralized Parsing Infrastructure: Introduces the ParseArgsFn
    callback and runTestHarnessWithCheckAndParse. This offloads custom
    argument parsing from individual test main() functions and safely
    extracts the arguments used so they can be logged by the test harness.
  • Help Text Consolidation: Replaces fragmented printUsage()
    functions with unified help string references populated directly by
    the standard parsing callbacks.

The primary goal of this commit is to improve CI tracking by
introducing a new golden format that can differentiate test results
based on command-line arguments. To cleanly extract and pass these
arguments into the JSON result outputs, the command-line parsing
infrastructure across the CTS required a significant refactoring.

Key changes include:
* Enhanced CI Tracking: Updates `ci/compare_results.py`,
`ci/pocl/golden.json`, and `saveResultsToJson` to include and evaluate
an `args` key. The golden JSON now uses a nested format mapping
specific argument strings (e.g., `--wimpy -1`) to their expected
results, allowing the CI to validate the same binary run under
different parameters.
* Centralized Parsing Infrastructure: Introduces the `ParseArgsFn`
callback and `runTestHarnessWithCheckAndParse`. This offloads custom
argument parsing from individual test `main()` functions and safely
extracts the arguments used so they can be logged by the test harness.
* Help Text Consolidation: Replaces fragmented `printUsage()`
functions with unified `help` string references populated directly by
the standard parsing callbacks.

[run-test: test_computeinfo]
[run-test: test_bruteforce -1 -w]
[run-test: test_cl_copy_images small_images --num-worker-threads 2 1D]
[run-test: test_image_streams 1D --num-worker-threads 2 CL_R CL_FILTER_NEAREST]
@rjodinchr

Copy link
Copy Markdown
Collaborator Author

ref #2723

@ahesham-arm

Copy link
Copy Markdown
Collaborator

Hi @rjodinchr, is it possible to separate the work into multiple commits, e.g. one for each of the bullet points you listed in the description. There is a lot of value in your recent PRs but they can be exhuasting to review because of the volume of unrelated/non-core changes required.

@rjodinchr

Copy link
Copy Markdown
Collaborator Author

Hi @rjodinchr, is it possible to separate the work into multiple commits, e.g. one for each of the bullet points you listed in the description. There is a lot of value in your recent PRs but they can be exhuasting to review because of the volume of unrelated/non-core changes required.

I understand the difficulty of reviewing such a large PR.
It feels very complicated to break this one down into smaller pieces. Those commits would either not be able to compile or would add unused code.

I can suggest the following strategy for the review:

  • First, let's focus on the test_common/harness part.
  • Once we agree on that, each individual test makes use of the new harness changes. They are all independent but heavily depend on the harness changes.
  • We can have a look at the ci part at the end.

The harness which is the core of this PR should not be too big to start with:

4 files changed, 144 insertions(+), 72 deletions(-)

else if (!strcmp(argv[i], "--wimpy") || !strcmp(argv[i], "-w"))
{
delArg++;
removed_args.push_back("--wimpy");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this the only argument that gets pushed back as a hardcoded string and not argv[i]?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ensures consistency, as we don't need both -w and --wimpy to be reported. It forces all reports to use a single format so that two identical runs aren't mistaken for being different just because they used different naming conventions.

if (!help)
{
help = true;
removed_args.push_back("--help");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one too I guess.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for wimpy, even so I believe it is less of a problem with the help argument.

cl_command_queue_properties queueProps,
DeviceCheckFn deviceCheckFn);

typedef test_status (*ParseArgsFn)(int &argc, const char *argv[],

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal perference but I think using is easier to read for functions, i.e.

using ParseArgsFn = test_status (*)(
    int &argc,
    const char *argv[],
    std::vector<std::string> &removed_args,
    std::string &help_description
);

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm indifferent. clang-format should be trusted to handle repository requirements. If multiple formats are permitted, we should either configure clang-format to enforce a specific one or simply accept whatever clang-format allows.

{
log_info("\n");
log_info("**************************\n");
log_info("*** !! WARNING !! ***\n");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this is different than the existing message. I don't mind it, just making sure this was intentional.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional. Some tests trigger an additional message when wimpy mode is enabled. I wanted to ensure that any tests not printing an additional message can rely on that message to explicitly signal the warning part that most other messages display.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants