This document is the load-bearing design contract for cloud-ai-security-skills. Every future PR is reviewed against it. If you need to deviate, update this doc in the same PR — the contract drifts by design, never by accident.
- Wire format contract — see `../skills/detection-engineering/OCSF_CONTRACT.md`
- Sink / persistence contract — see `./SINK_CONTRACT.md` (lands with PR T)
- Runner / streaming contract — see `./RUNNER_CONTRACT.md` (lands with PR V)
- Runtime isolation and trust boundaries — see `./RUNTIME_ISOLATION.md`
- SIEM indexing and dedupe guidance — see `./SIEM_INDEX_GUIDE.md`
- Canonical schema contract — see `./CANONICAL_SCHEMA.md`
- Raw → canonical → native / OCSF / bridge flow — see `./DATA_FLOW.md`
- Visual guide — see `./DIAGRAMS.md` for the architecture and data-flow diagrams in both markdown-native and SVG-friendly form
cloud-ai-security-skills is a library of composable security skills for cloud and AI systems that can operate in native, canonical, OCSF, or bridge modes. The repository is designed to be driven by agentic tools (Claude Code, Snowflake Cortex Code CLI, Claude Agent SDK, any MCP client) and by traditional CI / serverless pipelines — with no code changes between the two modes.
In scope
- Normalizing raw vendor telemetry into canonical or OCSF wire formats
- Running deterministic detection rules on canonical or OCSF streams
- Evaluating raw, canonical, or OCSF telemetry against compliance benchmarks (CIS, NIST, PCI)
- Producing remediation proposals (and, when explicitly authorised, executing them)
- Converting OCSF into downstream wire formats (SARIF, Sigma, Jira, Mermaid)
- Persisting canonical or OCSF records into columnar / lakehouse stores (Snowflake, AWS Security Lake, ClickHouse, BigQuery)
- Exposing every skill as an MCP tool so the same logic runs in every agent
Out of scope (explicit non-goals)
- Being a SIEM. SIEMs already ingest OCSF natively (Splunk, Sentinel, Chronicle, Elastic); we are the producer of OCSF, not a replacement for the consumer.
- Running a long-lived multi-tenant SaaS runtime. We ship a skills library + reference runners + reference sinks. Productionising those is the operator's responsibility.
- Letting every skill invent its own internal schema. Source-specific payloads are preserved, the repo normalizes them into a canonical internal model, and OCSF remains an interoperable option rather than a mandatory ceiling.
- Real-time sub-second detection. Latency target is minute-scale batches. If you need sub-second, use a streaming runtime (Flink, Kafka Streams) with these skills as UDFs.
These are the non-negotiables. Everything in §3–§8 exists to serve them.
- Skills are pure functions. Input JSONL → output JSONL. No side effects. No cloud API calls. No disk writes outside stdout. No hidden state.
- Side effects live at the edges. Exactly four categories may have side effects: L0 sources (read raw), L5 remediate (write cloud APIs), L7 sinks (write storage), runners (drive loops). Everything else is pure.
- The schema contract is the shared dependency. Skills never import from each other. If two skills need the same logic, they each own a copy. Copy-paste beats coupling at this scale — the contract is the API, not the Python. For shared pipelines the contract may be OCSF; for stateful inventory and evidence it may be canonical or bridge mode.
- Determinism. Same input always produces the same output. Every finding UID is a content hash; no random UUIDs. Replayable ⇒ testable ⇒ idempotent sink merges.
- Read-only by default. A skill may only perform writes if it is prefixed `remediate-*` or `sink-*` and its `SKILL.md` carries an explicit "Do NOT use" clause describing the blast radius.
- Least-privilege infra. Every skill that talks to a cloud API ships the minimum IAM policy in `infra/iam_policies/`. Wildcard actions are a CI failure.
- MCP-exposable by default. Every skill must be wrappable as an MCP tool with zero code changes: stdin+args in, stdout out, non-zero exit on error, stderr for warnings. Skills that can't satisfy this don't ship.
- Idempotent sinks. Every sink does `MERGE ... ON finding_info.uid`. Re-runs converge. No blind-insert mode.
- Dry-run everywhere writes happen. `--dry-run` is mandatory for every `remediate-*` and `sink-*` skill. It prints the SQL / API calls it would make without making them.
- Audit the auditor. Every sink write and every remediation action emits itself as an OCSF Application Activity (6002) event, so the tool's own actions are findable in the same pipeline it feeds.
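The pure-filter contract above can be sketched in a few lines. This is an illustrative shape, not a shipped skill — the function name and structure are placeholders:

```python
import json
import sys
from typing import Iterable, TextIO


def run_filter(lines: Iterable[str], out: TextIO = sys.stdout, err: TextIO = sys.stderr) -> int:
    """Pure filter body: JSONL in, JSONL out, warnings to stderr, no hidden state.

    Returns the count of skipped lines; malformed input is never fatal.
    """
    skipped = 0
    for lineno, raw in enumerate(lines, 1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            skipped += 1
            print(f"warn: skipping malformed line {lineno}", file=err)
            continue
        # ...pure transformation only: no cloud calls, no disk writes...
        out.write(json.dumps(event, sort_keys=True) + "\n")
    return skipped


# As a script: run_filter(sys.stdin); exit non-zero only on contract-breaking failures.
```

Note `sort_keys=True`: a stable serialization order is what makes byte-identical golden-fixture comparisons possible in the first place.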
The repo cannot assume cloud APIs stay still. Providers deprecate fields, add enum values, rename resources, or change SDK defaults. We treat that as a normal engineering event, not an exceptional one.
- Validate before use. Skills validate untrusted input before cloud calls, parsing, or conversion. Unknown shapes fail closed unless the skill explicitly supports partial-pass handling.
- Use official contracts only. `REFERENCES.md` links only to official docs, schemas, benchmarks, or SDK references. If a behavior is not grounded in a credible source, it does not belong in a shipped skill.
- Debug cleanly. Structured output goes to `stdout`; warnings, skips, and operator hints go to `stderr`; contract-breaking failures exit non-zero.
- Capture drift in tests. When AWS, Azure, GCP, Kubernetes, or another source adds or deprecates fields, we add regression coverage for both the old and new shapes during the migration window.
- Migrate intentionally. Deprecated APIs are removed only after the replacement path is tested, documented, and reflected in `REFERENCES.md`.
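As a concrete (hypothetical) instance of the drift rules above: suppose a provider renames a field from `region` to `awsRegion`. Both field names here are invented for illustration — the point is that the regression test covers both shapes during the migration window and fails closed on unknown shapes:

```python
def normalize_region(event: dict) -> str:
    """Accept both the old and new field shape during a migration window.

    `region` / `awsRegion` are hypothetical names standing in for a real
    provider rename; an unknown shape fails closed rather than guessing.
    """
    value = event.get("awsRegion") or event.get("region")
    if value is None:
        raise ValueError("unknown shape: no region field")  # fail closed
    return value


# Regression coverage for both shapes, old and new:
assert normalize_region({"region": "us-east-1"}) == "us-east-1"
assert normalize_region({"awsRegion": "us-east-1"}) == "us-east-1"
```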
This is how the repo stays secure and reliable without turning every skill into an over-general abstraction layer.
┌───────────────────────────────────────────────────────────────────┐
│ L0 SOURCES raw vendor formats (CloudTrail, VPC Flow, │
│ Okta, Ramp, Snowflake audit, K8s audit, ...) │
├───────────────────────────────────────────────────────────────────┤
│ L1 INGEST raw → canonical → native / OCSF / bridge │
│ ingest-cloudtrail-ocsf, ingest-ramp-ocsf, ... │
├───────────────────────────────────────────────────────────────────┤
│ L2 DISCOVER / inventory + context │
│ ENRICH discover-environment, discover-ai-bom, │
│ discover-control-evidence, │
│ discover-cloud-control-evidence, │
│ enrich-asset-inventory, enrich-geoip, │
│ enrich-mitre-navigator, enrich-pii-redact │
├───────────────────────────────────────────────────────────────────┤
│ L3 DETECT canonical / OCSF → native / OCSF findings │
│ detect-lateral-movement, detect-* │
├───────────────────────────────────────────────────────────────────┤
│ L4 EVALUATE raw / canonical → compliance result + evidence│
│ cspm-*-cis-benchmark, evaluate-nist-ai-rmf │
├───────────────────────────────────────────────────────────────────┤
│ L5 REMEDIATE Finding → IaC patch / SOAR action │
│ iam-departures-remediation, revoke-key-* │
├───────────────────────────────────────────────────────────────────┤
│ L6 CONVERT OCSF → other wire formats for delivery │
│ convert-ocsf-to-sarif, to-sigma, to-jira, │
│ to-mermaid-attack-flow │
├───────────────────────────────────────────────────────────────────┤
│ L7 SINKS (opt) OCSF → persisted store │
│ sink-snowflake, sink-security-lake, │
│ sink-clickhouse, sink-bigquery │
├───────────────────────────────────────────────────────────────────┤
│ L8 QUERY / VIZ SQL packs + Grafana packs + Cortex prompts │
│ query-mitre-heatmap, cortex-triage-prompts │
├───────────────────────────────────────────────────────────────────┤
│ L9 AGENT SURFACE mcp-server exposes every skill as a tool │
│ → Claude Code, Cortex Code, Agent SDK │
└───────────────────────────────────────────────────────────────────┘
The repo operates across four schema modes:
- native for source-fidelity payloads
- canonical for stable internal storage, joins, metrics, and state
- ocsf for shared pipelines, SIEMs, and standard interoperability
- bridge when OCSF transport helps but native or canonical detail still matters
Most current ingest, detect, evaluate, view, and sink paths are still OCSF-friendly JSONL, but OCSF is no longer the only valid operating mode. Discovery and evidence paths may emit deterministic native or canonical artifacts, plus OCSF bridge events where that improves interoperability. The current discovery set already supports explicit bridge modes for Cloud Resources Inventory Info [5023] and Live Evidence Info [5040].
Execution-mode note:
- `execution_modes: persistent` means a skill is safe to embed in a persistent runner, queue consumer, scheduler, or serverless loop without changing the skill logic
- it does not mean the repo already ships that runner, daemon, or sink for every skill
- today, the broad runner/sink layer is still planned work; `iam-departures-remediation` is the main shipped exception with repo-owned event-driven infrastructure
For the detailed contract, see:
- `NATIVE_VS_OCSF.md`
- `CANONICAL_SCHEMA.md`
- `DATA_FLOW.md`
- `STATE_AND_TIMELINE_MODEL.md`
- `../skills/detection-engineering/OCSF_CONTRACT.md`
For quick orientation, use the visual set in DIAGRAMS.md:
- Runtime surfaces — `runtime-surfaces.svg`
- Repo architecture — `repo-architecture.svg`
- IAM departures data flow — `iam-departures-data-flow.svg`
- Detection pipeline — `detection-pipeline.svg`
The rule for this repo is simple: keep the architecture readable in Markdown, and keep polished SVGs in docs/images/ for rendered docs.
| Layer | Category | Status | Skills shipped | Roadmap |
|---|---|---|---|---|
| L0 | sources | (external) | n/a | vendor stories #30–#36, Ramp (PR Y), Snowflake audit (PR Z) |
| L1 | `ingest-*` | shipping | cloudtrail, aws-vpc-flow, gcp-vpc-flow, azure-nsg-flow, guardduty, security-hub, gcp-scc, azure-defender, gcp-audit, azure-activity, k8s-audit, mcp-proxy | okta, github, workspace, slack, ramp |
| L2 | `discovery/` + `enrich-*` | shipping / planned | discover-environment, discover-ai-bom, discover-control-evidence, discover-cloud-control-evidence | PR X (asset-inventory, geoip, mitre-navigator, pii-redact is P0 before any sink in regulated env) |
| L3 | `detect-*` | shipping | lateral-movement, privesc-k8s, sensitive-secret-read-k8s, mcp-tool-drift | credential-access per cloud, unusual-assume-role, vector-store-poisoning |
| L4 | `evaluate-*` | shipping | cspm-aws/gcp/azure-cis-benchmark, k8s-security-benchmark, container-security | evaluate-cis-aws-foundations (#29), NIST AI RMF, SOC2, PCI |
| L5 | `remediate-*` | shipping | iam-departures-remediation | auto-close-exposed-s3, revoke-long-lived-key, patch-inspector-finding |
| L6 | `convert-*` | shipping | ocsf-to-sarif, ocsf-to-mermaid-attack-flow | to-sigma, to-splunk-cim, to-jira, to-opa-rego |
| L7 | `sinks/` | planned | none | PR T (snowflake), PR W (security-lake), PR AA (clickhouse), bigquery |
| L8 | `query/` + packs | planned | none | PR T ships the first Cortex query pack alongside sink-snowflake |
| L9 | `mcp-server/` | shipping | thin stdio wrapper | tighter input schemas, HTTP/SSE transport, remediation-safe wrappers |
Skills never change between modes. What changes is what drives them.
Finite input, pipe through skills, write the output somewhere. This is the default mode and the only one required for a working install.
cat cloudtrail.json \
| python3 skills/ingestion/ingest-cloudtrail-ocsf/src/ingest.py \
| python3 skills/detection/detect-lateral-movement/src/detect.py \
| python3 skills/view/convert-ocsf-to-sarif/src/convert.py \
> findings.sarif
Used by: Claude Code ad-hoc analysis, CI, one-off investigations, compliance snapshots, gh pr review automation.
Properties: zero infrastructure, no state, perfectly reproducible, no persistence, single-shot.
A runner (L9 driver, not a skill) drives the skills in a loop from a source queue to a sink. The runner is the only component with state (checkpoint offsets).
S3 notification SQS skill loop sink
─────────────────▶ runner-s3-to-snowflake ─▶ ingest-* → detect-* → sink-snowflake
│
└─ checkpoint state in DynamoDB / Snowflake STREAM
Shipped reference runner:
runners/aws-s3-sqs-detect # S3 -> ingest Lambda -> SQS -> detect Lambda -> DynamoDB dedupe -> SNS
Additional example runners:
runners/runner-s3-to-snowflake # S3 → skill → Snowflake COPY INTO
runners/runner-eventbridge-to-security-lake # EventBridge → skill → Parquet → Security Lake
runners/runner-pubsub-to-clickhouse # Google Pub/Sub → skill → ClickHouse INSERT
runners/runner-eventhubs-to-bigquery # Azure Event Hubs → skill → BigQuery JSON load
Properties: persistent state lives only in the runner (checkpoint) and sink (materialised rows). The skills themselves remain stateless, so failure recovery is: re-drive the runner from the last checkpoint, let idempotent sink merges collapse the duplicates.
An AI BOM capability belongs in the discovery / inventory path, not as the identity of the repo.
- Collection lives near L0/L1 when enumerating models, gateways, vector stores, runtimes, policies, and dependencies from vendor APIs or config exports.
- Normalization should emit OCSF-compatible inventory or application-context records rather than inventing a new private schema.
- Enrichment belongs in L2 when joining model inventory, framework metadata, package provenance, and control coverage into a usable graph.
- Evaluation belongs in L4 when mapping that inventory to MITRE ATLAS, NIST AI RMF, OWASP LLM Top 10, or other AI-security frameworks.
That keeps AI BOM as one valuable skill family inside cloud-ai-security-skills, instead of pulling the whole repo away from its broader cloud + AI security scope.
Every OCSF event we emit carries a deterministic UID derived from content, never from wall clock or RNG. Today:
- Findings: `finding_info.uid = det-<rule>-<short(semantic_key)>`. Example: `det-lm-1f455f51-cbef99b7-9ea97278` is the (provider, session, dst-ip, dst-port) hash for the cross-cloud lateral-movement rule. Running the same input a thousand times yields the same uid.
- Ingested events: inherit the source event's immutable ID (`eventID` for CloudTrail, `Id` for GuardDuty, etc.). Ingest is content-addressable.
Sinks exploit this: MERGE INTO ocsf_findings USING input ON input.finding_info.uid = target.finding_info.uid WHEN MATCHED THEN UPDATE SET ... WHEN NOT MATCHED THEN INSERT .... Replaying a day's worth of raw events after a sink outage converges to the same table state.
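The convergence property can be illustrated with an in-memory stand-in for the sink table — a dict keyed on the MERGE key. Real sinks run the SQL above; this is only a model of its behaviour:

```python
def merge_findings(table: dict, batch: list[dict]) -> dict:
    """In-memory model of MERGE ... ON finding_info.uid.

    WHEN MATCHED -> UPDATE (overwrite payload); WHEN NOT MATCHED -> INSERT.
    """
    for finding in batch:
        uid = finding["finding_info"]["uid"]  # content-derived, never random
        table[uid] = finding
    return table


batch = [{"finding_info": {"uid": "det-lm-aaaa1111-bbbb2222"}, "severity_id": 3}]
table: dict = {}
merge_findings(table, batch)
merge_findings(table, batch)  # replay after a sink outage
assert len(table) == 1        # re-runs converge: no duplicate rows
```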
cloud-ai-security-skills/
├── skills/
│ ├── ingestion/ # L1
│ ├── discovery/ # L2 discovery / inventory
│ ├── detection/ # L3
│ ├── evaluation/ # L4
│ ├── view/ # L6
│ ├── remediation/ # L5
│ └── detection-engineering/ # shared OCSF contract + golden fixtures
├── tests/integration/
├── .github/workflows/
└── docs/
└── ARCHITECTURE.md (this file)
cloud-ai-security-skills/
├── skills/
│ ├── ingestion/ # L1
│ ├── discovery/ # L2
│ ├── detection/ # L3
│ ├── evaluation/ # L4
│ ├── view/ # L6
│ └── remediation/ # L5
├── sinks/ # L7 — own top-level, side-effectful
│ ├── sink-snowflake-ocsf/
│ ├── sink-security-lake-ocsf/
│ └── sink-clickhouse-ocsf/
├── runners/ # Mode B drivers
│ ├── runner-s3-to-snowflake/
│ └── runner-eventbridge-to-security-lake/
├── mcp-server/ # L9 — single cross-cutting server
│ ├── src/server.py
│ ├── src/tool_registry.py
│ └── tests/
├── query/ # L8 — SQL packs, Cortex prompts, Grafana JSON
│ ├── snowflake/
│ ├── clickhouse/
│ └── grafana/
├── tests/integration/
└── docs/
├── ARCHITECTURE.md
├── OCSF_CONTRACT.md # optional future move from skills/detection-engineering/
├── SINK_CONTRACT.md # new, PR T
└── RUNNER_CONTRACT.md # new, PR V
Rationale for separating sinks/, runners/, mcp-server/, query/ from skills/: the "skills are pure, edges have side effects" mental model becomes visible in the directory tree. A reviewer can tell at a glance whether a change touches pure code or effectful code. This is cheap documentation that pays for itself on every PR.
The layered skill directories are now canonical. skills/detection-engineering/
remains only as the shared OCSF contract and frozen-fixture namespace used by
ingestion, detection, and view skills.
See ../skills/detection-engineering/OCSF_CONTRACT.md for the field-level contract every event must satisfy. Summary:
- Base schema: OCSF 1.8.0, no exceptions.
- Wire format: JSONL, UTF-8, LF, no BOM.
- Transport: stdin / stdout by default, `--input` / `--output` optional.
- Error handling: malformed lines are skipped with a stderr warning — never fatal. Detection pipelines must not crash on one bad event.
- Detection findings: class 2004 (`Detection Finding`). Class 2001 (`Security Finding`) is deprecated since OCSF 1.1 and forbidden in this repo.
- MITRE ATT&CK: version v14, pinned. `attacks[]` lives inside `finding_info`, not at the event root.
- Custom fields: forbidden at the base level. Custom MCP-specific fields live under a documented profile extension (`cloud_security_mcp`).
| Event kind | UID derivation |
|---|---|
| Ingested (L1) | Inherit source event's immutable ID (CloudTrail eventID, GuardDuty Id, ASFF Id, K8s audit auditID). |
| Detection finding (L3) | det-<rule-slug>-<short-sha256(semantic-key)> where semantic-key is the tuple of observables that defines "the same finding" (session + dst for lateral movement, cluster + subject for k8s privesc, etc.). |
| Compliance finding (L4) | eval-<benchmark>-<control-id>-<target-uid>. Same target evaluated twice must yield the same UID. |
| Remediation action (L5) | remediate-<action>-<target-uid>-<ts-day>. Day-bucketed so the same day's replay dedupes but tomorrow's re-run creates a new audit record. |
| Sink audit (L7) | sink-<sink-name>-<input-uid>. Emitted as an OCSF 6002 self-audit event per rule 10. |
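The L3 row of the table above can be sketched as follows. How the semantic-key tuple is serialised before hashing is each rule's own choice — the `"|".join` here is an assumption for illustration, not a repo-wide convention:

```python
import hashlib


def short(value: str) -> str:
    """First 8 hex chars of sha256 -- the short(...) used in the UID patterns."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


def detection_uid(rule_slug: str, semantic_key: tuple[str, ...]) -> str:
    """det-<rule-slug>-<short-sha256(semantic-key)> per the table above.

    The "|" join is an illustrative serialisation of the semantic key.
    """
    return f"det-{rule_slug}-{short('|'.join(semantic_key))}"


# Same observables -> same UID; a different destination -> a different finding.
a = detection_uid("k8s-privesc", ("cluster-1", "system:admin"))
b = detection_uid("k8s-privesc", ("cluster-1", "system:admin"))
c = detection_uid("k8s-privesc", ("cluster-2", "system:admin"))
assert a == b and a != c
```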
Each rule, the reason, and the concrete enforcement mechanism.
1. Pure functions. Enforcement: CI runs `bandit` with `B310` (urllib) and `B605` (subprocess) enabled on everything under `skills/` except `remediate-*`, `sink-*`, and `runners/`. Any network or subprocess call outside those categories fails the build.
2. Side effects live at the edges. Enforcement: `CODEOWNERS` requires a security reviewer on any PR that touches `remediate-*/`, `sink-*/`, or `runners/`. Pure-skill PRs don't need this.
3. No cross-skill imports. Enforcement: CI lint rule — any `from ..` or `from ...` import inside `skills/*/*/src/` fails. A skill may import only from its own `src/` or from the Python stdlib.
4. Determinism. Enforcement: every skill ships golden fixtures and a `test_deep_eq_against_frozen_golden` test. Re-running the skill against the raw input must produce byte-identical output to the frozen OCSF JSONL.
5. Read-only by default. Enforcement: `SKILL.md` frontmatter has a required `capability` field. Valid values: `read-only`, `write-remediation`, `write-sink`, `write-runner`. CI rejects a skill with no `capability` field.
6. Least-privilege IAM. Enforcement: every `remediate-*` and `sink-*` skill ships `infra/iam_policies/*.json`. CI runs `iam-policy-lint` to reject wildcard `Action: "*"` or wildcard `Resource: "*"` unless the policy carries an explicit `# WILDCARD_OK: reason` comment.
7. MCP-exposable. Enforcement: the `mcp-server/` test suite auto-discovers every skill and calls it via the MCP protocol with its golden input. Any skill that doesn't round-trip through MCP fails the build.
8. Idempotent sinks. Enforcement: `SINK_CONTRACT.md` defines the required MERGE key (`finding_info.uid` for finding classes, source event ID for ingested classes). A sink's test suite must run the golden fixture twice and assert the target row count is unchanged on the second run.
9. Dry-run everywhere writes happen. Enforcement: every `remediate-*` and `sink-*` skill must accept `--dry-run` and its test suite must verify zero writes occur when the flag is set.
10. Audit the auditor. Enforcement: every sink write emits an OCSF 6002 Application Activity event to a separate `sink_audit` stream. Grafana / Cortex has a dashboard pack that surfaces "who wrote what when".
This section is the heart of the design. Without it, each agent (Claude Code, Cortex Code, Agent SDK, custom) would need per-agent glue. With it, writing a new skill automatically makes it available to every agent.
┌──────────────────┐
│ Claude Code │── Bash tool (direct) ──────┐
├──────────────────┤ │
│ Claude Code │── MCP client ───┐ │
├──────────────────┤ │ │
│ Cortex Code CLI │── MCP client ───┤ │
├──────────────────┤ │ │
│ Claude Agent SDK│── MCP client ───┼──────► mcp-server ◄── auto-discovers
├──────────────────┤ │ │ skills/*/src/*.py
│ custom agent │── MCP stdio ────┘ │
└──────────────────┘ ▼
subprocess invoke
(stdin JSONL in,
stdout JSONL out)
- Auto-discovery — walks `skills/*/*/SKILL.md`, parses the frontmatter (`name`, `description`), classifies the skill (`read-only` vs write-capable), and resolves a fixed entrypoint (`src/<ingest|detect|checks|convert|discover>.py`).
- Tool spec generation — each skill becomes one MCP tool. Tool name = `SKILL.md` `name` field. Tool description = the full frontmatter `description` (which already leads with "Use when…" and closes with "Do NOT use…" per the Anthropic pattern).
- Input schema — current implementation exposes a conservative `input: string` (stdin payload) and `args: string[]` (fixed CLI args passed to the skill entrypoint). This keeps the wrapper thin and avoids inventing a second API surface. Tighter CLI-derived schemas are a follow-up.
- Invocation — the server shells out to `python3 <skill-path>/src/<entry>.py`, streams the tool's `input` to the skill's stdin, captures stdout, and wraps non-zero exits as MCP tool errors with stderr attached.
- Transport — stdio (local) ships now. HTTP/SSE stays a follow-up for hosted deployments.
- Claude Code can call skills via Bash or MCP — flexibility is free.
- Cortex Code CLI supports only MCP (no general Bash tool), so MCP is the required path for Snowflake-agent coverage.
- Claude Agent SDK has first-class MCP support.
- Any custom agent with an MCP client (growing list) gets the library for free.
- MCP is an open protocol; we incur no vendor lock-in by adopting it.
- It does not make skills stateful. The server is a thin wrapper; state still lives only in sinks and runners.
- It does not aggregate results. A multi-step analysis is still a composition of tool calls orchestrated by the agent, not a monolithic "analyze_everything" endpoint.
- It does not authenticate end users. Auth is the embedding application's responsibility — the MCP server trusts its caller.
This section governs what happens the moment the repo graduates from "Unix filter toy" to "persistent security telemetry pipeline" (i.e. any time a sink or runner is deployed).
| Concern | Control | Enforced where |
|---|---|---|
| Encryption at rest | Native: Snowflake column encryption, Security Lake S3 SSE-KMS, ClickHouse disk encryption | Sink DDL ships with encryption enabled; CI rejects unencrypted table definitions |
| Encryption in transit | TLS 1.2+ on every connection; sinks reject plaintext | Sink connection code validates scheme |
| Credentials | Vault / Secrets Manager / KMS-wrapped only. Sinks refuse to run if a literal long-lived key is in env. | Sink startup check |
| IAM | Per-sink role: CREATE TABLE + MERGE INTO on `ocsf_*` schema, nothing else. Policy shipped in `infra/iam_policies/<sink-name>.json`. | `iam-policy-lint` CI check |
| Multi-tenancy | Row-level security on `cloud.account.uid`. Snowflake RLS policy, ClickHouse row policy, Security Lake S3 prefix partitioning. | Sink schema ships the RLS / row-policy DDL |
| Retention | Storage-layer TTL. Default 90d hot, auto-archive to 7y for audit classes. | DDL |
| PII | Field-level redaction hook at L2 (`enrich-pii-redact-ocsf`, PR X-P0) must run before any sink in a regulated environment. Config file defines redaction rules per field path. | Runner refuses to start a pipeline targeting a sink without PII redaction enabled, unless `--regulated=false` is set explicitly |
| Tamper detection | Sinks sign each batch with a rolling HMAC over a key in KMS. Downstream `query-tamper-check` skill detects gaps and signature mismatches. | `SINK_CONTRACT.md`, PR T |
| Separation of duties | `sink-*` skills run with write-only creds (no SELECT grant). `query-*` skills run with read-only creds. They never share a role. | IAM policies |
| Audit | Every sink write and every remediation action emits an OCSF 6002 self-audit event into a separate `sink_audit` stream. | Rule 10 |
| Network | Sinks must tolerate egress restrictions — all use official vendor SDKs which honour VPC endpoints / private links. | Test suite runs sinks with `http_proxy=invalid` and asserts they use the SDK, not direct HTTP |
This is the property that lets Mode A and Mode B share the same skill code. Worked example using detect-lateral-movement:
Input — one CloudTrail AssumeRole event + one VPC Flow ACCEPT to 10.0.3.75:3306, same session.
Deterministic UID — rule computes session_uid = "ASIASESSION001", dst_key = "10.0.3.75:3306", then uid = "det-aws-lm-" + sha256(session_uid)[:8] + "-" + sha256(dst_key)[:8]. This yields det-aws-lm-cbef99b7-9ea97278 regardless of when or where the rule runs.
Replay scenarios:
| Scenario | Sink behaviour |
|---|---|
| Raw events fed once, sink healthy | 1 row inserted |
| Raw events fed again (Mode B retry after outage) | MERGE matches on uid → UPDATE (no-op if payload unchanged) |
| Raw events fed with a new flow at 5 min 10s (second flow, same session, same dst) | Deduped in-skill by (session, dst_ip, dst_port) — still 1 uid, still 1 row |
| Raw events fed with a flow to a different dst | New uid, new row |
| Day's worth of raw events replayed from S3 after a 6h sink outage | Every finding's MERGE key already exists → 0 duplicates, converged state |
This is what "OCSF can persist + update" means in practice: every finding is already an upsert key. The sink never needs a generated primary key.
- Pick the layer (L1 ingest, L2 enrich, L3 detect, L4 evaluate, L5 remediate, L6 convert).
- Copy the nearest sibling as the starting point — the existing skills in the target category are the canonical reference.
- Write `SKILL.md` with the spec-compliant frontmatter, leading `description` with "Use when…" and closing with "Do NOT use…". Include the `capability` field.
- Implement `src/<entry>.py` as a pure Unix filter: stdin JSONL in, stdout JSONL out, stderr for warnings.
- Write tests in `tests/test_<entry>.py`. Include:
  - Unit tests for every helper.
  - Positive golden-fixture parity tests.
  - Negative controls (at least 3, explaining what should not fire).
- Register the skill in the `.github/workflows/ci.yml` matrix.
- If the skill adds a new MITRE technique, add it to `OCSF_CONTRACT.md`'s pinned table.
- Run `pytest`, `ruff check`, `ruff format`, open a PR.
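A hypothetical frontmatter sketch pulling together the fields this document requires (`name`, a `description` that follows the "Use when… / Do NOT use…" pattern, `capability`). The skill name and wording are invented for illustration; the authoritative field list lives with the skill template, not here:

```yaml
---
name: detect-example-anomaly        # hypothetical skill, for illustration only
description: >
  Use when you need to flag example anomalies in OCSF 1.8.0 JSONL streams.
  Do NOT use for real-time sub-second detection or as a write path of any kind.
capability: read-only               # required field; CI rejects skills without it
---
```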
- Create `sinks/sink-<name>-ocsf/` with the same SKILL.md + src + tests + infra layout.
- Ship `infra/schema.sql` with encrypted, RLS-policied DDL.
- Ship `infra/iam_policies/sink-write.json` with minimum grants.
- Implement a pure mode A pathway: `cat ocsf.jsonl | python src/load.py --dry-run` prints the exact SQL / API calls.
- Implement the wet pathway behind `--dry-run=false`.
- Add two idempotency tests: same input run twice must yield unchanged row count.
- Add a sink-audit test: confirm the sink emits an OCSF 6002 event about itself.
- Add a CI matrix entry.
- Create `runners/runner-<source>-to-<sink>/` with SKILL.md + src + tests.
- Implement checkpoint persistence (DynamoDB, Snowflake STREAM, GCS blob, Azure Storage — pick one per runner).
- Runner must be idempotent against checkpoint replay: running from `checkpoint_n - 10` batches must converge.
- Test with a fake source queue and a fake sink that records its MERGE operations.
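The checkpoint-replay requirement can be sketched with in-memory stand-ins — a list of batches for the source queue and a dict keyed on finding UID for the merged sink. This is a model only; real runners persist the checkpoint in DynamoDB, a Snowflake STREAM, etc.:

```python
def run_from_checkpoint(batches: list[list[dict]], checkpoint: int, sink: dict) -> int:
    """Re-drive a runner loop from a checkpoint offset.

    Idempotent sink merges (keyed on finding UID) absorb any batches that
    were already processed before the checkpoint was rewound.
    """
    for offset in range(checkpoint, len(batches)):
        for finding in batches[offset]:
            sink[finding["finding_info"]["uid"]] = finding  # MERGE ON uid
        checkpoint = offset + 1  # advance only after the batch has landed
    return checkpoint


batches = [[{"finding_info": {"uid": f"det-x-{i:08d}"}}] for i in range(20)]
sink: dict = {}
run_from_checkpoint(batches, 0, sink)    # normal run to completion
run_from_checkpoint(batches, 10, sink)   # replay from 10 batches back
assert len(sink) == 20                   # converged: merges absorbed the overlap
```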
- The wire contract (OCSF version + custom profile version + MITRE version) is pinned in `OCSF_CONTRACT.md` under `contract version: 1.8.0+mcp.2026.04`. To bump: cut a new contract version and re-freeze every skill's golden fixtures in one PR. We do not mix contract versions across skills.
- Skill implementations are versioned by git commit. A skill has no version number; its behaviour is defined by its test fixtures, which are frozen.
- Repo releases cut tagged versions (`v0.N.M`) that snapshot a known-good combination of contract + skills + sinks + runners. CI smoke-tests the release's end-to-end pipes before the tag is pushed.
- Breaking changes to the contract are allowed at minor-version boundaries, must update every skill in the same PR, and must ship a migration note in `docs/MIGRATIONS.md`.
This is the architectural roadmap. Vendor-story PRs (#29–#39) continue in parallel and land on whatever the top-of-stack layout is.
| PR | Scope | Priority | Why |
|---|---|---|---|
| ARCHITECTURE.md | This document | P0 | Locks the design so every subsequent PR lands correctly |
| PR #39 reshape | `git mv` skills into ingest-*, evaluate-*, remediate-*, etc. Add empty `sinks/` `runners/` `mcp-server/` `query/` dirs with READMEs. | P0 | Without this, later PRs land in the wrong place and we pay for it twice |
| PR U — mcp-server | Auto-discover every skill, expose via MCP stdio + HTTP/SSE, test against Claude Code + Cortex Code CLI | P0 | The single highest-leverage PR: unlocks every agent for every future skill |
| PR T — sink-snowflake-ocsf | Schema DDL, COPY INTO loader, idempotent MERGE, Cortex query pack, Cortex Analyst semantic model | P0 | Proves the persistent mode works end-to-end, directly enables Cortex Code CLI users |
| PR X — enrich-pii-redact-ocsf | Field-level redaction with rule config, gates every sink run in regulated mode | P0 | Regulatory prerequisite — no sink runs in a regulated environment without it |
| PR V — first runner | `runner-s3-to-snowflake` with checkpoint state, idempotency tests | P1 | Proves streaming mode works. Unlocks continuous pipeline |
| PR W — sink-security-lake-ocsf | OCSF Parquet with AWS Security Lake partitioning (zero-transform target) | P1 | Most strategic sink — AWS Security Lake is OCSF-native |
| PR Y — Ramp vendor story | `ingest-ramp-ocsf` + `detect-ramp-vendor-change-with-payment` + `detect-ramp-spend-limit-bypass` + `detect-ramp-card-to-unknown-merchant` | P1 | First non-cloud, business-logic vendor story |
| PR Z — ingest-snowflake-audit-ocsf | Snowflake ACCOUNT_USAGE audit into OCSF | P2 | Closes the loop: Cortex Code users can detect on their own platform |
| PR AA — sink-clickhouse-ocsf | MergeTree DDL, `Nested(…)` for `attacks[]` and `observables[]`, loader, Grafana dashboard pack | P2 | Best OLAP economics for high-cardinality teams |
| PR BB — enrich-asset-inventory-ocsf | Joins findings with the `discover-environment` graph | P2 | Turns findings into triageable blast-radius context |
To keep this doc honest:
- Not a SIEM. Splunk, Sentinel, Chronicle, and Elastic already consume OCSF natively. If you have one of those, pipe our output into it — we are the producer.
- Not a runtime. We ship a library + reference runners + reference sinks. Running a 24/7 multi-tenant production pipeline is an integration task the operator owns.
- Not a rule engine. We ship deterministic Python rules. If you need a declarative DSL, convert our OCSF to Sigma and load it into your DSL engine.
- Not a UI. `query/` ships SQL packs and Grafana dashboards; we do not ship a proprietary web UI. Grafana + Cortex are the "UI".
- Not a SOAR. `remediate-*` skills propose fixes. Production SOAR orchestration is the operator's responsibility.
- Not a replacement for official SDKs. Sinks use the official vendor SDK (Snowflake Python connector, `boto3`, `google-cloud-*`, `azure-*`). We do not reimplement API clients.
| Term | Meaning |
|---|---|
| Skill | A self-contained Anthropic-spec skill bundle: SKILL.md + src/ + tests/ + optional REFERENCES.md + optional infra/. One purpose, one wire contract. |
| Layer | A logical stage in the pipeline (L0–L9). Each layer has exactly one category of side-effect profile. |
| Sink | A side-effectful skill that persists OCSF to external storage. Lives in sinks/, not skills/. |
| Runner | A driver process that runs skills in a loop from a source queue to a sink. Lives in runners/, not skills/. |
| Wire contract | The OCSF 1.8 + MITRE v14 + MCP-profile pinning in OCSF_CONTRACT.md. The only shared dependency. |
| MCP tool | An MCP-protocol-exposed callable. Every skill is automatically one, via mcp-server/. |
| Mode A | Batch execution. Finite input, Unix pipes, no persistence. The default. |
| Mode B | Streaming execution. Runner drives skills in a loop, state lives only in the runner checkpoint and the sink. |
| Deterministic UID | A content-addressed identifier. Same semantic input → same UID. The property that makes idempotent sinks work. |
| Read-only by default | The baseline posture: no skill performs writes unless it is remediate-* or sink-* and has an explicit "Do NOT use" clause. |
Changelog — update this section in every PR that amends this file.
| Version | Date | Change |
|---|---|---|
| 1.0 | 2026-04-11 | Initial architecture document. Codifies the 9-layer model, two execution modes, 10 guardrails, MCP integration strategy, security posture, and the dependency-ordered roadmap. |