clCopyImage: improve performance by removing redundant copies #2721
Closed
rjodinchr wants to merge 3 commits into
The primary goal of this commit is to improve CI tracking by introducing a new golden format that can differentiate test results based on command-line arguments. To cleanly extract and pass these arguments into the JSON result outputs, the command-line parsing infrastructure across the CTS required a significant refactoring.

Key changes include:

* Enhanced CI Tracking: Updates `ci/compare_results.py`, `ci/pocl/golden.json`, and `saveResultsToJson` to include and evaluate an `args` key. The golden JSON now uses a nested format mapping specific argument strings (e.g., `--wimpy -1`) to their expected results, allowing the CI to validate the same binary run under different parameters.
* Centralized Parsing Infrastructure: Introduces the `ParseArgsFn` callback and `runTestHarnessWithCheckAndParse`. This offloads custom argument parsing from individual test `main()` functions and safely extracts the arguments used so they can be logged by the test harness.
* Help Text Consolidation: Replaces fragmented `printUsage()` functions with unified `help` string references populated directly by the standard parsing callbacks.

[run-test: test_computeinfo]
[run-test: test_bruteforce -1 -w]
[run-test: test_cl_copy_images small_images --num-worker-threads 2 1D]
[run-test: test_image_streams 1D --num-worker-threads 2 CL_R CL_FILTER_NEAREST]
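To illustrate the parsing-callback idea described above, here is a minimal sketch in which a callback consumes the arguments it recognizes and the consumed arguments are joined into a single string suitable for an `args` key. The actual `ParseArgsFn` signature is not shown in this description, so `ParseArgsSketch`, `makeParser`, `joinArgs`, and `runWithParse` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical callback type: removes the arguments it understands from
// `args` and returns them. The real ParseArgsFn may look different.
using ParseArgsSketch =
    std::function<std::vector<std::string>(std::vector<std::string> &)>;

// Builds a callback that consumes every argument listed in `known`.
ParseArgsSketch makeParser(std::set<std::string> known)
{
    return [known](std::vector<std::string> &args) {
        std::vector<std::string> used, rest;
        for (const std::string &a : args)
            (known.count(a) ? used : rest).push_back(a);
        args = rest; // unrecognized arguments stay available to the caller
        return used;
    };
}

// Joins the consumed arguments with single spaces, e.g. "--wimpy -1".
std::string joinArgs(const std::vector<std::string> &used)
{
    std::string key;
    for (const std::string &a : used)
    {
        if (!key.empty()) key += ' ';
        key += a;
    }
    return key;
}

// Stand-in for a harness entry point: runs the callback and reports
// which arguments it consumed so they could be logged with the results.
std::string runWithParse(std::vector<std::string> &args,
                         const ParseArgsSketch &parse)
{
    assert(parse && "a parsing callback is required");
    return joinArgs(parse(args));
}
```

In this sketch, calling `runWithParse` on `{"--wimpy", "1D", "-1"}` with a parser that knows `--wimpy` and `-1` yields the key `--wimpy -1` and leaves `1D` in the argument list for test selection.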
This patch removes the use of global state across the image conformance tests and replaces it with a passed `context_t` structure. The goal of this change is to be able to run `image_streams` in parallel by removing global variables as they were not thread-safe. This refactoring had to be applied everywhere because `image_streams` needs to update `test_format_set_fn`, which is used by all the image tests.

Key changes:

* Removed global flags such as `gTestSmallImages`, `gTestMipmaps`, `gDebugTrace`, `gEnablePitch`, etc.
* Initialized a `context_t` struct in the `main.cpp` files to store test configuration.
* Updated test function signatures across `clCopyImage`, `clFillImage`, `clGetInfo`, and `kernel_read_write` to accept `const context_t &ctx`.

[run-test: test_image_streams 1D --num-worker-threads 2 CL_A CL_FILTER_NEAREST]
[run-test: test_cl_copy_images small_images --num-worker-threads 2 1D]
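The move from global flags to a passed configuration structure can be sketched as follows. The fields of the real `context_t` are not listed in this description, so `context_sketch` and its members are assumptions derived from the names of the removed globals, and the padding value is purely illustrative.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical subset of the test configuration. Field names are guesses
// based on the removed globals (gTestSmallImages, gTestMipmaps, ...).
struct context_sketch
{
    bool testSmallImages = false;
    bool testMipmaps = false;
    bool debugTrace = false;
    bool enablePitch = false;
};

// Computes a host row pitch in bytes. The function reads only `ctx`,
// never a global, so callers with different configurations cannot
// interfere with each other.
std::size_t host_row_pitch(const context_sketch &ctx, std::size_t width,
                           std::size_t pixelSize)
{
    assert(width > 0 && pixelSize > 0);
    std::size_t pitch = width * pixelSize;
    if (ctx.enablePitch) pitch += 16; // illustrative padding, not the CTS value
    return pitch;
}
```

Because every input arrives through the `const` reference, two worker threads could each hold their own `context_sketch` and call `host_row_pitch` concurrently without synchronization.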
This patch restructures the image copy conformance tests to eliminate unnecessary host-side memory allocations and data duplication, significantly improving the execution speed of the test loops.

Key changes include:

* Removed the redundant `srcHost` buffer allocation and its associated `memcpy` operation. The test loop now directly uses `srcData` for validation.
* Split the monolithic `test_copy_image_generic` into `test_copy_init_images` (for setup) and `test_copy_image_generic` (for execution). This hoists buffer sizing, random data generation, and image creation out of the inner testing loops.
* Extracted memory mapping and initialization logic from `create_image` into a standalone `init_image` function in `common.cpp`. This allows destination images to be efficiently reset during test iterations without fully recreating the OpenCL memory objects.
* Consolidated scattered `extern` function declarations across all dimension tests into a new `test_copy_generic.h` header.
* Fixed a minor typo in `test_copy_1D_buffer.cpp` where `srcImageInfo` was incorrectly passed as the destination descriptor.
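The setup/execution split described above can be sketched without OpenCL, using plain host vectors in place of images. This is only an illustration of the hoisting pattern under that assumption: `copy_state`, `init_copy_state`, `reset_destination`, and `copy_and_check` are hypothetical names, not the functions touched by this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Host vectors stand in for the source and destination images.
struct copy_state
{
    std::vector<std::uint8_t> srcData; // generated once, also used to validate
    std::vector<std::uint8_t> dst;     // allocated once, reused every iteration
};

// Setup step: sizing, allocation, and random data generation happen once,
// outside the per-region test loop.
copy_state init_copy_state(std::size_t size, unsigned seed)
{
    copy_state s;
    s.srcData.resize(size);
    s.dst.resize(size);
    std::mt19937 rng(seed);
    for (std::uint8_t &b : s.srcData)
        b = static_cast<std::uint8_t>(rng() & 0xFF);
    return s;
}

// Resets the destination contents in place, without reallocating.
void reset_destination(copy_state &s)
{
    std::fill(s.dst.begin(), s.dst.end(), std::uint8_t{ 0xFF });
}

// Execution step: copies one region and validates it against srcData
// directly, with no intermediate host copy of the source.
bool copy_and_check(copy_state &s, std::size_t srcOff, std::size_t dstOff,
                    std::size_t len)
{
    assert(srcOff + len <= s.srcData.size());
    assert(dstOff + len <= s.dst.size());
    reset_destination(s);
    std::copy_n(s.srcData.begin() + srcOff, len, s.dst.begin() + dstOff);
    return std::equal(s.dst.begin() + dstOff, s.dst.begin() + dstOff + len,
                      s.srcData.begin() + srcOff);
}
```

A caller would invoke `init_copy_state` once per image size and then call `copy_and_check` for each region, so the inner loop performs no allocation and no extra copy of the source data.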
I'll reopen once/if the dependencies get merged.