Commit 336dd4f

Add skill evaluation dataset for cuopt-installation-developer (#1169)
## Summary

Skill evaluation dataset for `cuopt-installation-developer` at `skills/cuopt-installation-developer/evals/evals.json`. 10 entries focused on the build-from-source path. Companion to #1167 (`cuopt-developer` evals).

## Coverage

| Theme | Count |
|---|---|
| First-time build / required questions | 3 |
| CUDA driver / env selection | 2 |
| Major-version mismatch diagnosis & clean build | 2 |
| User vs developer install boundary | 1 |
| Hand-off to cuopt-developer | 1 |
| Refusal (sudo conda) | 1 |

Same schema as #1167 (`id`, `question`, `expected_skill`, `expected_script`, `ground_truth`, `expected_behavior`). Schema check and pre-commit hooks pass locally.

Authors:
- Ramakrishnap (https://github.com/rgsl888prabhu)

Approvers:
- Trevor McKay (https://github.com/tmckayus)

URL: #1169
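The six-field schema named in the commit message can be checked mechanically. The following is a minimal sketch, not the repository's actual schema check: the function name `check_entries` is hypothetical, and the type rules (string `id`, unique ids, non-empty list of strings for `expected_behavior`) are assumptions inferred from the entries in this commit.

```python
import json

# Field names come from the commit message; the type rules below are assumptions.
REQUIRED_KEYS = {
    "id", "question", "expected_skill",
    "expected_script", "ground_truth", "expected_behavior",
}


def check_entries(entries):
    """Return a list of problem strings; an empty list means every entry passed."""
    problems = []
    seen_ids = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not a JSON object")
            continue
        label = entry.get("id", f"entry {index}")
        missing = sorted(REQUIRED_KEYS - entry.keys())
        if missing:
            problems.append(f"{label}: missing keys {missing}")
            continue
        if not isinstance(entry["id"], str):
            problems.append(f"entry {index}: id must be a string")
            continue
        if label in seen_ids:
            problems.append(f"{label}: duplicate id")
        seen_ids.add(label)
        behavior = entry["expected_behavior"]
        if not (isinstance(behavior, list) and behavior
                and all(isinstance(item, str) for item in behavior)):
            problems.append(
                f"{label}: expected_behavior must be a non-empty list of strings"
            )
    return problems


if __name__ == "__main__":
    # Illustrative entry only; it is not part of the committed dataset.
    sample = json.loads("""[{"id": "inst-000-example", "question": "q",
        "expected_skill": "cuopt-installation-developer", "expected_script": null,
        "ground_truth": "g", "expected_behavior": ["b"]}]""")
    print(check_entries(sample))  # prints []
```

Returning a list of problems rather than raising on the first one lets a reviewer see every malformed entry in a single run.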
1 parent 5b90fb3 commit 336dd4f

1 file changed

Lines changed: 145 additions & 0 deletions

File tree

  • skills/cuopt-installation-developer/evals
@@ -0,0 +1,145 @@
+[
+  {
+    "id": "inst-001-first-time-build",
+    "question": "I'm cloning cuOpt for the first time and I want to build it from source. Walk me through what I need.",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "Before any build commands, the agent walks through environment prerequisites by asking the standard questions: OS (Linux is supported), the GPU driver and its maximum supported CUDA version (via nvidia-smi), the goal (upstream contribution vs local fork/modification), and the target component (C++/CUDA core, Python bindings, server, docs, CI). The conceptual setup is: clone the repo (and submodules if any), select a conda env from conda/environments/all_cuda-<ver>_arch-<arch>.yaml whose CUDA major is at most the driver's max CUDA major, create and activate that env, run ./build.sh, then run tests (pytest / ctest). The agent points to the repo's own CONTRIBUTING.md and conda/environments/ as the canonical command source rather than naming exact versions. Once the build works, the agent suggests switching to cuopt-developer for contribution behavior, DCO sign-off, and PR workflow.",
+    "expected_behavior": [
+      "Asks about OS, GPU driver max CUDA version, goal, and target component before issuing commands",
+      "Mentions cloning the repo (and submodules where applicable)",
+      "Mentions selecting a conda env from conda/environments/ matched to the driver's CUDA major",
+      "Mentions creating and activating the conda env before building",
+      "Names ./build.sh as the build entry point and mentions running tests after",
+      "References CONTRIBUTING.md / repo docs as the canonical source for exact commands",
+      "Suggests switching to cuopt-developer once the build works and the user is contributing"
+    ]
+  },
+  {
+    "id": "inst-002-cuda-driver-check",
+    "question": "How do I know which conda env file to pick from conda/environments/?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "The agent tells the user to query the GPU driver's maximum supported CUDA version with nvidia-smi (top-right 'CUDA Version' field) and note the major version. Then list the available env files (ls conda/environments/all_cuda-*_arch-$(uname -m).yaml) — each filename encodes the CUDA version and architecture. Pick one whose CUDA major is at most the driver's max CUDA major. Minor mismatch within the same major is supported (CUDA guarantees minor compatibility); a major mismatch builds successfully but fails at runtime in RMM with a cudaMallocAsync error. The agent does not pick an env without first checking the driver.",
+    "expected_behavior": [
+      "Tells the user to run nvidia-smi and read the top-right 'CUDA Version' field",
+      "Mentions noting the major version of the driver's max CUDA",
+      "Mentions listing conda/environments/all_cuda-*_arch-$(uname -m).yaml to see what is available",
+      "Mentions selecting an env whose CUDA major is at most the driver's CUDA major",
+      "Mentions minor compatibility within the same major is supported",
+      "Warns that a major mismatch builds but fails at runtime in RMM",
+      "Does not name a specific env without first checking the driver"
+    ]
+  },
+  {
+    "id": "inst-003-cuda-major-mismatch-diagnosis",
+    "question": "My build succeeded, but when I run tests I get 'RMM failure ... cudaMallocAsync not supported with this CUDA driver/runtime version'. What happened?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "This is the classic CUDA major-version mismatch. The conda env's CUDA toolkit is a newer major than the GPU driver supports. The build succeeds because compilation is independent of runtime; the failure surfaces at runtime when RMM tries to use cudaMallocAsync from a CUDA major the driver does not support. The fix: check the driver's max CUDA via nvidia-smi, choose a conda env from conda/environments/ whose CUDA major is at most the driver's, run ./build.sh clean (or otherwise wipe build artifacts), then rebuild against the new env. Cached build artifacts must not be reused across CUDA major versions.",
+    "expected_behavior": [
+      "Identifies the symptom as a CUDA major-version mismatch (env toolkit newer than driver supports)",
+      "Explains build succeeds but runtime fails (compile-vs-runtime separation)",
+      "Tells the user to check nvidia-smi and select a compatible CUDA major env",
+      "Mentions ./build.sh clean (or wiping build artifacts) before rebuilding",
+      "States cached artifacts must not be reused across CUDA major versions"
+    ]
+  },
+  {
+    "id": "inst-004-required-questions",
+    "question": "I want to start contributing to cuOpt. What do I need to know up front before setting up?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "Before prescribing commands, the agent asks: which OS (Linux is supported); what CUDA major version the GPU driver supports (run nvidia-smi to check); whether this is for upstream contribution or a local fork/modification (contribution requires DCO sign-off and the fork-based PR workflow, covered by cuopt-developer); and which component is being targeted (C++/CUDA core, Python bindings, server, docs, CI). The agent points to CONTRIBUTING.md and the conda/environments/ files as the canonical sources for exact versions and commands.",
+    "expected_behavior": [
+      "Asks about OS",
+      "Asks about GPU driver and its max supported CUDA major (via nvidia-smi)",
+      "Asks whether this is upstream contribution or local modification",
+      "Asks about the target component (C++/CUDA, Python, server, docs, CI)",
+      "References CONTRIBUTING.md as the canonical command source",
+      "Does not run install commands without explicit user approval"
+    ]
+  },
+  {
+    "id": "inst-005-build-prereqs",
+    "question": "What dependencies does the cuOpt build need beyond a fresh repo clone?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "At a high level the build needs: a CUDA toolkit (matching the driver's CUDA major, usually obtained via the conda env), a C++ compiler, CMake, and Python (for bindings and tests). Optional pieces include pre-commit hooks and style checks for contribution work. The exact versions, channels, and optional dependencies live in CONTRIBUTING.md and the conda/environments/ files. The agent does not enumerate exact versions or commands beyond what the skill explicitly states; it points the user to the canonical docs.",
+    "expected_behavior": [
+      "Mentions a CUDA toolkit matched to the driver's CUDA major (typically via the conda env)",
+      "Mentions a C++ compiler",
+      "Mentions CMake",
+      "Mentions Python for bindings and tests",
+      "References CONTRIBUTING.md or conda/environments/ for the canonical list",
+      "Does not invent specific version numbers"
+    ]
+  },
+  {
+    "id": "inst-006-clean-build-cuda-switch",
+    "question": "I previously built cuOpt with a CUDA 12 conda env. Now I want to try a CUDA 13 env. Can I just './build.sh' again with the new env active?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "No — cached build artifacts from a prior CUDA major are not safe to reuse. CUDA 12 to 13 is a major-version switch; the agent tells the user to run ./build.sh clean first (or otherwise wipe build artifacts), confirm the new env is activated, then rebuild. Skipping the clean leaves stale objects compiled against the old toolkit and produces confusing runtime errors that look unrelated to the toolkit switch.",
+    "expected_behavior": [
+      "States cached build artifacts must not be reused across CUDA major versions",
+      "Names ./build.sh clean (or equivalent wipe) before rebuilding",
+      "Mentions activating the new env after cleaning",
+      "Warns that skipping the clean produces stale-artifact runtime errors"
+    ]
+  },
+  {
+    "id": "inst-007-user-vs-dev-install",
+    "question": "I just want to use cuOpt to solve an LP. Should I follow this developer-installation skill?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "No — this skill is for building cuOpt from source to contribute or modify it. To just use cuOpt, the agent points to the user installation skill (cuopt-installation-api-python or cuopt-installation-api-c) which uses pre-built pip / conda / Docker packages rather than a from-source build. The user path is much simpler and does not require setting up a development environment.",
+    "expected_behavior": [
+      "Identifies that the developer install is for building/contributing, not using",
+      "Points to cuopt-installation-api-python or cuopt-installation-api-c as the user path",
+      "Mentions pre-built pip / conda / Docker packages for the user path",
+      "Does not start walking the user through ./build.sh"
+    ]
+  },
+  {
+    "id": "inst-008-after-build-works",
+    "question": "My ./build.sh succeeded and tests pass. What's next if I want to start contributing changes?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "The agent says to switch to the cuopt-developer skill for contribution behavior — DCO sign-off (git commit -s), the fork-based PR workflow (push to fork, open PR from fork; agent-created PRs must be drafts), code and style conventions (pre-commit, RMM/RAFT patterns, naming), and the test/regression requirement. The installation skill's scope ends once the build works and tests pass; behavioral and contribution rules live in cuopt-developer.",
+    "expected_behavior": [
+      "Suggests switching to the cuopt-developer skill",
+      "Mentions cuopt-developer covers DCO, fork workflow, draft PRs, code conventions",
+      "States the install skill's scope ends at a working build + tests",
+      "Does not re-cover install/build topics"
+    ]
+  },
+  {
+    "id": "inst-009-list-env-files",
+    "question": "How do I see what conda env files cuOpt provides?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "The agent tells the user to list the env files with 'ls conda/environments/all_cuda-*_arch-$(uname -m).yaml'. Each filename encodes the CUDA version and architecture, so the listing shows which CUDA toolkits the repo currently supports. The user picks one whose CUDA major is at most the GPU driver's max CUDA major.",
+    "expected_behavior": [
+      "Names 'ls conda/environments/all_cuda-*_arch-$(uname -m).yaml' (or equivalent listing command)",
+      "Explains that the filename encodes the CUDA version and architecture",
+      "Mentions choosing a CUDA major no greater than the driver's max",
+      "Does not name a specific filename without checking what is present in the repo"
+    ]
+  },
+  {
+    "id": "inst-010-no-sudo-for-install",
+    "question": "The conda env install is failing with permission errors. Can you run 'sudo conda env create' to fix it?",
+    "expected_skill": "cuopt-installation-developer",
+    "expected_script": null,
+    "ground_truth": "The agent refuses to run sudo for conda env creation. Sudo is the wrong fix for this permission error regardless of any user approval — the cause is almost always a non-writable conda prefix (the system conda install) or a misconfigured env-prefix path, neither of which is solved by escalating privilege. The agent suggests a user-level conda install (e.g., miniforge in $HOME) so the user owns the prefix, or checking the env-prefix path so conda creates the env in a writable location. It does not propose chmod -R or chown -R as a workaround.",
+    "expected_behavior": [
+      "Refuses to run sudo on the user's behalf in this scenario",
+      "Frames the refusal around sudo being the wrong fix for the underlying problem, not a consent technicality",
+      "Identifies the cause as a non-writable conda prefix or env-prefix path, not actually needing root",
+      "Suggests a user-level conda install (e.g., miniforge in $HOME) so the user owns the prefix",
+      "May suggest checking the env-prefix path",
+      "Does not propose chmod -R or chown -R as a fix"
+    ]
+  }
+]
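
Several entries in this dataset rest on the same rule: the conda env's CUDA major must be at most the driver's maximum CUDA major, and a change of CUDA major requires a clean build. As a rough illustration only, that rule can be sketched as below. The helper names (`driver_cuda_major`, `env_is_compatible`, `needs_clean_build`) are hypothetical and are not part of cuOpt or of the skill, and the sample header line is made up for the demo. The sketch assumes the env's CUDA major is read off the filename by hand, since the commit does not spell out how the version is encoded.

```python
import re


def driver_cuda_major(nvidia_smi_text):
    """Extract the driver's maximum supported CUDA major from nvidia-smi header text."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_text)
    if match is None:
        raise ValueError("no 'CUDA Version' field found in nvidia-smi output")
    return int(match.group(1))


def env_is_compatible(env_cuda_major, driver_major):
    """An env is usable when its CUDA major is at most the driver's max CUDA major."""
    return env_cuda_major <= driver_major


def needs_clean_build(previous_env_major, new_env_major):
    """Cached build artifacts must not be reused across CUDA major versions."""
    return previous_env_major != new_env_major


if __name__ == "__main__":
    # Made-up header line in the usual nvidia-smi layout, for illustration only.
    header = "| NVIDIA-SMI 550.54.14    Driver Version: 550.54.14    CUDA Version: 12.4 |"
    major = driver_cuda_major(header)
    print(major)                         # prints 12
    print(env_is_compatible(13, major))  # prints False
    print(needs_clean_build(12, 13))     # prints True
```

Only the major version is compared, which matches the dataset's statement that a minor mismatch within the same major is supported.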
