-
-
Notifications
You must be signed in to change notification settings - Fork 454
docs: restructure agent instruction files #2453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,69 +1,49 @@ | ||
| # AGENTS.md | ||
|
|
||
| Agent-focused notes for the **giskard-oss** monorepo. Human-oriented docs live in [README.md](README.md) and [https://docs.giskard.ai/oss](https://docs.giskard.ai/oss). | ||
|
|
||
| ## Project overview | ||
|
|
||
| - **Stack:** Python 3.12+ (CI also exercises 3.13–3.14), [uv](https://docs.astral.sh/uv/) for env and runs, [Ruff](https://docs.astral.sh/ruff/) for lint/format, [basedpyright](https://github.com/DetachHead/basedpyright) for types. | ||
| - **Layout:** Libraries under `libs/` — `giskard-core`, `giskard-agents`, `giskard-checks`, `giskard-llm`. Work from the **repository root** unless a task is scoped to one package. | ||
| - **Package-specific conventions:** See `libs/<package>/.cursor/rules/` (e.g. `giskard-checks` has project and development rules). | ||
| - **Always invoke Python via `uv run`** — bare `python` or `pytest` will fail with `ModuleNotFoundError`. | ||
|
|
||
| ## Setup | ||
|
|
||
| **Recommended for coding agents:** run `make setup-for-agents` from the repo root. It runs the same steps as `make setup`, writes `GISKARD_TELEMETRY_DISABLED=false` to `.env`, and sends a one-shot analytics event. Optional context (both optional): | ||
|
|
||
| ```bash | ||
| make setup-for-agents AGENT_NAME="ClawBot" REASON="Working on issue #1234" | ||
| ``` | ||
|
|
||
| Human or minimal setup without that target: | ||
|
|
||
| ```bash | ||
| make install # uv sync — project dependencies | ||
| make install-tools # ruff, vermin, basedpyright, pre-commit (with pre-commit-uv) | ||
| make pre-commit-install # optional: git hooks | ||
| # or all of the above: | ||
| make setup | ||
| ``` | ||
|
|
||
| If `make check` fails with missing tools, run `make install` and `make install-tools` (same order as CI). | ||
|
|
||
| ## Commands (run from repo root) | ||
|
|
||
| | Goal | Command | | ||
| |------|---------| | ||
| | Format (Ruff format + fix) | `make format` | | ||
| | Lint | `make lint` | | ||
| | Full gate (lint, format check, Python 3.12 compat, types, security, licenses) | `make check` | | ||
| | Unit tests (all libs) | `make test-unit` | | ||
| | Unit tests for one package | `make test-unit PACKAGE=giskard-checks` (also `giskard-core`, `giskard-agents`) | | ||
| | All tests including functional | `make test` | | ||
| | Functional tests only | `make test-functional` | | ||
|
|
||
| CI (`.github/workflows/ci.yml`) runs `make install install-tools`, then `make check`, then `make test-unit` per package — mirror that before opening or updating a PR. | ||
|
|
||
| ## Functional / integration tests | ||
|
|
||
| - `make test-functional` and full `make test` call live APIs; they are **not** the default PR gate in `ci.yml` (see `.github/workflows/integration-tests.yml` for secrets-driven runs). | ||
| - Local: create a **repo-root** `.env` (gitignored). Export vars before pytest, e.g. `set -a && source .env && set +a` then `make test-functional PACKAGE=giskard-agents`. Typical vars include `GEMINI_API_KEY`; optional `TEST_MODEL`, `TEST_EMBEDDING_MODEL` (see `libs/giskard-agents/tests/conftest.py`). | ||
|
|
||
| ## PR / change discipline | ||
|
|
||
| **Pre-flight checklist** (run in order before opening or updating a PR): | ||
|
|
||
| ```bash | ||
| make format # Ruff format + autofix | ||
| make check # lint, type-check, compat, security, licenses | ||
| make test-unit # or: make test-unit PACKAGE=giskard-checks | ||
| ``` | ||
|
|
||
| If `make check` fails with missing tools: run `make install && make install-tools` first. | ||
|
|
||
| **Rules:** | ||
|
|
||
| - End **agent-opened** PR titles with `🤖🤖🤖🤖` — required for the expedited-agent PR workflow. Do not omit. | ||
| - Minimal diffs only: implement exactly what was asked; no drive-by refactors or unrelated file edits. | ||
| - Do not add unsolicited features, architecture docs, or codemap files. | ||
|
|
||
| giskard-oss — behavioral config for autonomous coding agents (no human in the loop). Human-oriented docs: [README.md](README.md). Always invoke Python via `uv run` — bare `python` or `pytest` will fail. | ||
|
|
||
| ## Workflow Orchestration | ||
|
|
||
| ### 1. Setup | ||
| Run once: `make setup-for-agents AGENT_NAME="<name>" REASON="<issue or task>"` | ||
| If tools are missing later: `make install && make install-tools` | ||
|
|
||
| ### 2. Plan Before Acting | ||
| – Write approach to tasks/todo.md before touching any file | ||
| – If scope is unclear, read `libs/<pkg>/.cursor/rules/` before proceeding | ||
| – If something goes wrong, stop and re-plan | ||
|
|
||
| ### 3. Verification Before Done | ||
| – Before any PR: `make format && make check && make test-unit PACKAGE=<affected-lib>` | ||
| – Show `make check` and pytest summary output in the PR description | ||
| – No `# type: ignore` without structural-fix explanation; no patching test assertions | ||
|
|
||
| ### 4. PR Rules | ||
| – End PR titles with `🤖🤖🤖🤖` — required for the expedited-agent PR workflow | ||
| – Minimal diffs: implement exactly what was asked | ||
|
|
||
| ### 5. Self-Improvement Loop | ||
| – After any mistake: update lessons.md with what went wrong and what rule would have prevented it | ||
|
|
||
| ### 6. Clarify Before Acting | ||
| – When an issue is ambiguous, contradictory, or missing acceptance criteria: **do not open a PR**. Post a comment on the issue asking the specific questions needed to proceed, then stop. | ||
| – When you have a better approach than what was requested: **do not implement your alternative silently**. Comment on the issue explaining your suggestion and the trade-offs, then wait for confirmation. | ||
| – When responding to a PR review comment: if the requested change is unclear, you disagree, or you see a better path: **comment back with your question or counter-proposal**, do not just apply the change blindly. | ||
| – One comment is enough — do not loop. If no response comes, remain stopped. | ||
| – When scope is unambiguous and you have no better suggestion: proceed directly without asking. | ||
|
|
||
| ## Task Management | ||
| 1. Plan First — write plan to tasks/todo.md | ||
| 2. Verify Plan — review before starting; proceed if unambiguous | ||
| 3. Track Progress — mark items complete as you go | ||
| 4. Explain Changes — high-level summary at each step | ||
| 5. Document Results — add review section to todo.md | ||
| 6. Capture Lessons — update lessons.md after corrections | ||
|
|
||
| ## Core Principles | ||
| – Simplicity First: make every change as simple as possible; prefer deleting lines over adding them | ||
| – No Laziness: find root causes; no band-aids, no temporary fixes; senior developer standards | ||
| – Minimal Impact: only touch what's necessary; no side effects; no reformatting untouched lines | ||
|
|
||
| --- | ||
| This file follows the open [AGENTS.md](https://agents.md/) convention for coding agents. | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -1,44 +1,47 @@ | ||||||
| # CLAUDE.md | ||||||
|
|
||||||
| Claude-specific notes for the **giskard-oss** monorepo. Human-oriented docs live in [README.md](README.md); agent-oriented operational notes live in [AGENTS.md](AGENTS.md). | ||||||
|
|
||||||
| ## Project overview | ||||||
|
|
||||||
| - **Stack:** Python 3.12+, [uv](https://docs.astral.sh/uv/) for env/runs, [Ruff](https://docs.astral.sh/ruff/) for lint/format, [basedpyright](https://github.com/DetachHead/basedpyright) for types. | ||||||
| - **Layout:** Libraries under `libs/` — `giskard-core`, `giskard-agents`, `giskard-checks`, `giskard-llm`. Work from the repo root unless a task is strictly scoped to one package. | ||||||
|
|
||||||
| ## Setup | ||||||
|
|
||||||
| ```bash | ||||||
| make setup # uv sync + install tools + pre-commit hooks | ||||||
| # or, for coding agents: | ||||||
| make setup-for-agents | ||||||
| ``` | ||||||
|
|
||||||
| ## Commands | ||||||
|
|
||||||
| | Goal | Command | | ||||||
| |------|---------| | ||||||
| | Format | `make format` | | ||||||
| | Lint | `make lint` | | ||||||
| | Full gate | `make check` | | ||||||
| | Unit tests (all) | `make test-unit` | | ||||||
| | Unit tests (one package) | `make test-unit PACKAGE=giskard-checks` | | ||||||
| | All tests | `make test` | | ||||||
|
|
||||||
| Before opening or updating a PR: `make format && make check && make test-unit`. | ||||||
|
|
||||||
| ## PR discipline | ||||||
|
|
||||||
| - End agent-opened PR titles with `🤖🤖🤖🤖` — required for the expedited-agent PR workflow. | ||||||
| - Always run `uv run` (not bare `python` or `pytest`). | ||||||
|
|
||||||
| --- | ||||||
|
|
||||||
| ## Core principles | ||||||
|
|
||||||
| **Make every change as small as possible.** Prefer deleting lines over adding them. The best fix often removes code entirely. | ||||||
|
|
||||||
| **Find the root cause.** No band-aids. If a symptom has an underlying cause, address that — not the symptom. | ||||||
|
|
||||||
| **Only touch what's necessary.** No side effects. Leave every file you didn't need to change exactly as you found it. | ||||||
| giskard-oss — behavioral config for Claude Code (interactive assistant with a human in the loop). | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The header is missing the link to
Suggested change
|
||||||
|
|
||||||
| ## Workflow Orchestration | ||||||
|
|
||||||
| ### 1. Plan Mode Default | ||||||
| – Enter plan mode for ANY non-trivial task (3+ steps or touches multiple libs) | ||||||
| – Write the approach to tasks/todo.md before touching files | ||||||
| – If something goes sideways, STOP and re-plan immediately | ||||||
|
|
||||||
| ### 2. Subagent Strategy | ||||||
| – Use subagents to keep the main context window clean | ||||||
| – One focused task per subagent | ||||||
| – For complex cross-lib problems, spawn parallel subagents | ||||||
|
|
||||||
| ### 3. Self-Improvement Loop | ||||||
| – After ANY correction: update lessons.md | ||||||
| – Write a rule that prevents the same mistake | ||||||
| – Propose additions to CLAUDE.md in the format "Add rule: X" | ||||||
|
|
||||||
| ### 4. Verification Before Done | ||||||
| – Never mark a task complete without proving it works | ||||||
| – Run: `make format && make check && make test-unit PACKAGE=<affected-lib>` | ||||||
| – Show actual output. Never assume tests pass. | ||||||
| – Ask yourself: "Would a staff engineer approve this?" | ||||||
|
|
||||||
| ### 5. Demand Elegance | ||||||
| – Pause and ask "is there a more elegant solution?" | ||||||
| – Skip for simple fixes — don't over-engineer | ||||||
|
|
||||||
| ### 6. Autonomous Bug Fixing | ||||||
| – When given a bug report with clear scope: just fix it | ||||||
| – No `# type: ignore`, no patched test assertions — fix the root cause | ||||||
|
|
||||||
| ## Task Management | ||||||
| 1. Plan First — write plan to tasks/todo.md | ||||||
| 2. Verify Plan — check in before starting | ||||||
| 3. Track Progress — mark items complete as you go | ||||||
| 4. Explain Changes — high-level summary at each step | ||||||
| 5. Document Results — add review section to todo.md | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||||||
| 6. Capture Lessons — update lessons.md after corrections | ||||||
|
|
||||||
| ## Core Principles | ||||||
| – Simplicity First: make every change as simple as possible; prefer deleting lines over adding them | ||||||
| – No Laziness: find root causes; no band-aids, no temporary fixes; senior developer standards | ||||||
| – Minimal Impact: only touch what's necessary; no side effects; no reformatting untouched lines | ||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| # Lessons | ||
|
|
||
| One entry per corrected mistake. Format: date, what went wrong, rule that prevents it. | ||
|
|
||
| --- | ||
|
|
||
| **2026-05-15** — Assumed CLAUDE.md needed to mirror AGENTS.md structure. | ||
| Rule: CLAUDE.md and AGENTS.md serve different audiences and are independent — don't force them to share structure or cross-reference each other. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path to the todo file is inconsistent with other references in this file (lines 12 and 33). It should be updated to
tasks/todo.mdto ensure clarity for automated agents.