@johallar johallar commented Oct 9, 2025

Per discussion here:
https://docs.google.com/document/d/1wD9YDDze5vr3K6kpU4eKwlk55GwvFx9vVTTj_X8nj-w/edit?tab=t.0

Adding properties to benchmark_result.json to contain:

"hardware": {
    "arch": "x86_64",
    "cpu_count": 32,
    "cpu_name": "intel",
    "cpu_memory": 220,
    "gpu_count": 6,
    "gpu_name": "t4",
    "gpu_memory": 15
},
"versions": {
    "version_presto": "b8bb0c888739f337a5e7f80df9b2ee6aeb2286f9",
    "version_velox": "dca16b16d8455630e662716824c7d9afc3af8b7a",
    "version_cuda": "12.2",
    "version_cuda_driver": "535.230.02"
},
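
As a rough illustration of the shape described above, the two sections could be attached to an existing result dictionary before it is written out. This is only a sketch: `attach_run_metadata` and the placeholder `"queries"` key are hypothetical names, not taken from this PR, and the values are the example values shown above.

```python
import json


def attach_run_metadata(result, hardware, versions):
    """Return a copy of `result` with "hardware" and "versions" sections added.

    Hypothetical helper for illustration; the real script may structure this differently.
    """
    merged = dict(result)
    merged["hardware"] = dict(hardware)
    merged["versions"] = dict(versions)
    return merged


example = attach_run_metadata(
    {"queries": {}},  # placeholder standing in for the existing benchmark output
    {
        "arch": "x86_64",
        "cpu_count": 32,
        "cpu_name": "intel",
        "cpu_memory": 220,
        "gpu_count": 6,
        "gpu_name": "t4",
        "gpu_memory": 15,
    },
    {
        "version_presto": "b8bb0c888739f337a5e7f80df9b2ee6aeb2286f9",
        "version_velox": "dca16b16d8455630e662716824c7d9afc3af8b7a",
        "version_cuda": "12.2",
        "version_cuda_driver": "535.230.02",
    },
)

print(json.dumps(example, indent=4))
```

Copying the input instead of mutating it keeps the original result untouched if the metadata step is skipped or fails.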

@johallar johallar force-pushed the johallar/add_system_info_benchmark_output branch from 2668ecd to a325bee Compare October 9, 2025 15:36
@simoneves simoneves self-requested a review October 18, 2025 01:12

@simoneves simoneves left a comment

Implementation looks fine. Unclear on context.

benchmark_output

# Generated Config
presto/docker/config/generated

Good idea, but drop this and I'll add it to my HERC-82 PR

    """Get total system memory in GB."""
    try:
        # Read from /proc/meminfo (Linux)
        with open("/proc/meminfo", "r") as f:

There's an ongoing discussion as to the best way to get this value that works reliably on all machines. Take a look at PR #48 for more notes.
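
For reference, one Linux-only way to read this value is to parse the `MemTotal` line of `/proc/meminfo`. The sketch below is not the code under review; the function names are hypothetical, and it assumes the figure should be reported in whole GiB (the kernel reports `kB` here as units of 1024 bytes). It returns `None` rather than raising when the file or the field is missing, which is one possible way to handle machines where this source is unavailable.

```python
import re


def parse_mem_total_gb(meminfo_text):
    """Extract MemTotal from /proc/meminfo-style text, rounded to whole GiB.

    Returns None if no MemTotal line is found.
    """
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        return None
    kib = int(match.group(1))  # /proc/meminfo reports units of 1024 bytes
    return round(kib / (1024 ** 2))


def get_system_memory_gb(path="/proc/meminfo"):
    """Read total system memory in GiB, or None if the file cannot be read."""
    try:
        with open(path, "r") as f:
            return parse_mem_total_gb(f.read())
    except OSError:
        # Non-Linux systems or restricted containers may not expose this file.
        return None
```

Keeping the parsing separate from the file read makes the logic testable on machines that do not have `/proc/meminfo`.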

check=True
)
# Parse CUDA version from output (typically in header)
# Example: "CUDA Version: 12.2"

I guess this is useful info, but it's also implicit from the driver version.
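
A minimal sketch of pulling both values out of the `nvidia-smi` header in one pass is shown below. It assumes the header contains `Driver Version: ...` and `CUDA Version: ...` fields, as in the example comment in the diff; the function names are hypothetical, and the output keys mirror the `versions` block in the PR description.

```python
import re
import subprocess


def parse_nvidia_smi_header(output):
    """Extract driver and CUDA versions from nvidia-smi text output.

    A value is None when the corresponding field is not present.
    """
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", output)
    driver = re.search(r"Driver Version:\s*([\d.]+)", output)
    return {
        "version_cuda": cuda.group(1) if cuda else None,
        "version_cuda_driver": driver.group(1) if driver else None,
    }


def get_gpu_versions():
    """Run nvidia-smi and parse its header; returns None values on failure."""
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # nvidia-smi is missing or exited with an error (e.g. no GPU present).
        return {"version_cuda": None, "version_cuda_driver": None}
    return parse_nvidia_smi_header(result.stdout)
```

Recording both fields costs little, and it saves a reader of the results from having to map a driver version back to a CUDA version.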

@johallar

@paul-aiyedun @simoneves I forgot about this PR. Do you think it's worth getting something like this in, understanding that the assumption is you're running the script on the same machine you're running benchmarks on? Or is this metadata worthless unless we update the prestodb API to include it and get it from there?

@simoneves

I am still unclear on the context of this, and indeed the process you intended it to be inserted into may have changed.

@paul-aiyedun and @misiugodfrey will know more.

I would re-base it or merge-update it to current main and we'll go from there.

@johallar

@simoneves High-level context: it's going to be necessary (or at least very beneficial) to capture metadata about the benchmark run if we intend to do any visualization or data analysis with the results afterwards. The alternative seems to be a manually collated spreadsheet, as has been done up to this point.

This branch is old and will need rebasing, but I'm just checking that I'm not barking up the wrong tree before proceeding. Updating the Presto API seems like a major project as far as I can tell, so maybe this would still be useful in the meantime.

@simoneves

I understand the requirement, but I do not feel qualified to review the change. This is all @paul-aiyedun's code and I am not familiar with it at all.
