clCopyImage: improve performance by removing redundant copies #2721

Closed
rjodinchr wants to merge 3 commits into KhronosGroup:main from rjodinchr:copy-images

@rjodinchr


This patch restructures the image copy conformance tests to eliminate unnecessary
host-side memory allocations and data duplication, significantly improving the
execution speed of the test loops.

Key changes include:

* Removed the redundant `srcHost` buffer allocation and its associated `memcpy`
  operation. The test loop now directly uses `srcData` for validation.
* Split the monolithic `test_copy_image_generic` into `test_copy_init_images`
  (for setup) and `test_copy_image_generic` (for execution). This hoists
  buffer sizing, random data generation, and image creation out of the inner
  testing loops.
* Extracted memory mapping and initialization logic from `create_image` into a
  standalone `init_image` function in `common.cpp`. This allows destination
  images to be efficiently reset during test iterations without fully
  recreating the OpenCL memory objects.
* Consolidated scattered `extern` function declarations across all dimension
  tests into a new `test_copy_generic.h` header.
* Fixed a minor typo in `test_copy_1D_buffer.cpp` where `srcImageInfo` was
  incorrectly passed as the destination descriptor.
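
The setup/execution split and the in-place destination reset can be pictured with a small host-only sketch. It makes no OpenCL calls: `ImagePair`, `init_images_once`, `reset_destination`, and `run_copy_iteration` are hypothetical stand-ins rather than the functions touched by this patch, and a plain `std::vector` plays the role of image memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a source/destination image pair.
struct ImagePair
{
    std::vector<std::uint8_t> srcData; // random source bytes, generated once
    std::vector<std::uint8_t> dst;     // destination storage, reused each iteration
};

// Setup step: size the buffers and generate random data once, outside the loop.
ImagePair init_images_once(std::size_t bytes, unsigned seed)
{
    ImagePair p;
    p.srcData.resize(bytes);
    p.dst.resize(bytes);
    std::mt19937 rng(seed);
    for (auto &b : p.srcData) b = static_cast<std::uint8_t>(rng() & 0xFF);
    return p;
}

// Reset step: overwrite the destination contents without reallocating it.
void reset_destination(ImagePair &p, std::uint8_t fill)
{
    std::fill(p.dst.begin(), p.dst.end(), fill);
}

// Execution step: copy a region, then validate directly against srcData,
// with no intermediate host-side copy of the source.
bool run_copy_iteration(ImagePair &p, std::size_t offset, std::size_t length)
{
    if (offset > p.srcData.size() || length > p.srcData.size() - offset)
        return false; // region does not fit
    const auto off = static_cast<std::ptrdiff_t>(offset);
    const auto len = static_cast<std::ptrdiff_t>(length);
    std::copy(p.srcData.begin() + off, p.srcData.begin() + off + len,
              p.dst.begin() + off);
    return std::equal(p.srcData.begin() + off, p.srcData.begin() + off + len,
                      p.dst.begin() + off);
}
```

A caller would invoke `init_images_once` a single time, then alternate `reset_destination` and `run_copy_iteration` inside the loop, so the allocation and random-data cost is paid once rather than per iteration.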

@rjodinchr


Depends on #2706 & #2696

@rjodinchr marked this pull request as draft June 11, 2026 06:47

The primary goal of this commit is to improve CI tracking by
introducing a new golden format that can differentiate test results
based on command-line arguments. To cleanly extract and pass these
arguments into the JSON result outputs, the command-line parsing
infrastructure across the CTS required a significant refactoring.

Key changes include:
* Enhanced CI Tracking: Updates `ci/compare_results.py`,
`ci/pocl/golden.json`, and `saveResultsToJson` to include and evaluate
an `args` key. The golden JSON now uses a nested format mapping
specific argument strings (e.g., `--wimpy -1`) to their expected
results, allowing the CI to validate the same binary run under
different parameters.
* Centralized Parsing Infrastructure: Introduces the `ParseArgsFn`
callback and `runTestHarnessWithCheckAndParse`. This offloads custom
argument parsing from individual test `main()` functions and safely
extracts the arguments used so they can be logged by the test harness.
* Help Text Consolidation: Replaces fragmented `printUsage()`
functions with unified `help` string references populated directly by
the standard parsing callbacks.
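
To make the callback and the `args` key more concrete, here is a minimal sketch of a parsing callback that reports which arguments it consumed, plus a golden lookup keyed by the joined argument string. The real signatures of `ParseArgsFn` and `runTestHarnessWithCheckAndParse`, and the exact golden JSON schema, are not shown in this commit message, so `ParseArgsSketchFn`, `parse_wimpy_sketch`, `join_args`, `expected_result`, and the "test name -> argument string -> result" layout are all assumptions made for illustration.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical callback shape: receives the raw arguments, appends the ones it
// recognises to `used`, and returns false on a parse error.
using ParseArgsSketchFn =
    std::function<bool(const std::vector<std::string> &argv,
                       std::vector<std::string> &used)>;

// Example callback recognising "--wimpy" and "-1" (taken from the example
// string in the text; how the CTS really interprets them is not assumed here).
bool parse_wimpy_sketch(const std::vector<std::string> &argv,
                        std::vector<std::string> &used)
{
    for (const auto &a : argv)
    {
        if (a == "--wimpy" || a == "-1")
            used.push_back(a);
        else
            return false; // unknown argument
    }
    return true;
}

// Join the consumed arguments into the string used as the lookup key.
std::string join_args(const std::vector<std::string> &used)
{
    std::string key;
    for (const auto &a : used)
    {
        if (!key.empty()) key += ' ';
        key += a;
    }
    return key;
}

// Assumed nested golden layout: test name -> argument string -> expected result.
using Golden = std::map<std::string, std::map<std::string, std::string>>;

// Returns a pointer to the expected result, or nullptr when the test or the
// argument string has no golden entry.
const std::string *expected_result(const Golden &golden,
                                   const std::string &test,
                                   const std::string &args)
{
    auto t = golden.find(test);
    if (t == golden.end()) return nullptr;
    auto r = t->second.find(args);
    if (r == t->second.end()) return nullptr;
    return &r->second;
}
```

Keying on the argument string is what lets one binary carry several expected results: the same test name can map to different entries depending on how it was invoked.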

[run-test: test_computeinfo]
[run-test: test_bruteforce -1 -w]
[run-test: test_cl_copy_images small_images --num-worker-threads 2 1D]
[run-test: test_image_streams 1D --num-worker-threads 2 CL_R CL_FILTER_NEAREST]
This patch removes the use of global state across the image
conformance tests and replaces it with a passed `context_t` structure.

The goal of this change is to allow `image_streams` to run in parallel.
The global variables it relied on were not thread-safe, so they had to go.
The refactoring had to be applied everywhere because `image_streams`
needs to update `test_format_set_fn`, which is used by all the image
tests.

Key changes:
* Removed global flags such as `gTestSmallImages`, `gTestMipmaps`, `gDebugTrace`, `gEnablePitch`, etc.
* Initialized a `context_t` struct in the `main.cpp` files to store test configuration.
* Updated test function signatures across `clCopyImage`, `clFillImage`, `clGetInfo`, and `kernel_read_write` to accept `const context_t &ctx`.
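
The shape of this refactoring can be sketched as follows. The real `context_t` layout is not shown in this commit message, so `sketch_context_t`, its fields (named after the removed globals), `count_images_to_test`, and `with_mipmaps` are hypothetical; the point is only that each caller reads a configuration passed by reference instead of shared mutable globals.

```cpp
#include <cstddef>

// Hypothetical configuration struct standing in for the removed globals
// (gTestSmallImages, gTestMipmaps, gDebugTrace, gEnablePitch).
struct sketch_context_t
{
    bool testSmallImages = false;
    bool testMipmaps = false;
    bool debugTrace = false;
    bool enablePitch = false;
};

// A test helper takes the configuration explicitly and cannot mutate it.
std::size_t count_images_to_test(const sketch_context_t &ctx)
{
    std::size_t n = ctx.testSmallImages ? 4 : 16; // illustrative numbers only
    if (ctx.testMipmaps) n *= 2;
    return n;
}

// Deriving a variant works on a copy, so the caller's configuration (and any
// other worker holding it) is left untouched.
sketch_context_t with_mipmaps(sketch_context_t ctx)
{
    ctx.testMipmaps = true;
    return ctx;
}
```

Because nothing here writes to shared state, two workers holding different `sketch_context_t` values cannot interfere with each other, which is the property the parallel `image_streams` run depends on.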

[run-test: test_image_streams 1D --num-worker-threads 2 CL_A CL_FILTER_NEAREST]
[run-test: test_cl_copy_images small_images --num-worker-threads 2 1D]
@rjodinchr


I'll reopen once (or if) the dependencies get merged.

@rjodinchr closed this Jun 18, 2026
@rjodinchr deleted the copy-images branch June 18, 2026 13:41