Make kernel crash validation use a real QEMU guest path #139
Merged
peaktwilight merged 6 commits into main on Apr 13, 2026
Replace the fragile SSH transport with a 9p shared-folder execution path so the QEMU/KASAN verifier can compile and run reproducers without guest network bring-up. The guest now boots through pwnkit-init, mounts a host-provided share, executes a generated runner script, and writes compile/run artifacts back for host-side collection. The rootfs builder was updated to include the tools this path actually needs, and the docs/build guidance now match the implementation.

Constraint: The verifier must run on hosts where guest SSH is unreliable or unavailable under KASAN/UBSAN noise
Rejected: Keep debugging guest SSH | transport remained flaky despite successful boot and custom init execution
Rejected: Full kernel rebuild for every rootfs tweak | too slow once existing artifact config already included 9p virtio support
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Crash-matching logic still needs refinement for noisy UBSAN-first outputs; do not treat transport success as signature-match success
Tested: pnpm build; real QEMU boot with shared-folder guest script; real syzbot ingest --verify compile+run against RDMA invalid-free sample
Not-tested: Additional real syzbot crash families beyond the exercised invalid-free sample
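
To make the shared-folder flow above concrete, here is a minimal sketch of how a host-side runner might build the QEMU 9p arguments and the guest runner script. It is not the PR's code: `ShareConfig`, `buildVirtfsArgs`, `guestMountCommand`, `buildGuestRunner`, the mount tag, and the artifact file names are all hypothetical. Only the `-virtfs` option syntax and the `mount -t 9p -o trans=virtio` form are standard QEMU/Linux usage.

```typescript
// Illustrative sketch only: every name and file path here is an assumption,
// not taken from the pwnkit source.
interface ShareConfig {
  hostDir: string;  // host directory exported to the guest
  mountTag: string; // 9p mount tag the guest uses to find the share
  guestDir: string; // mount point inside the guest
}

// QEMU arguments that expose hostDir to the guest over virtio-9p.
function buildVirtfsArgs(cfg: ShareConfig): string[] {
  return [
    "-virtfs",
    `local,path=${cfg.hostDir},mount_tag=${cfg.mountTag},security_model=none,id=${cfg.mountTag}`,
  ];
}

// Command a custom init could run to mount the share before starting the runner.
function guestMountCommand(cfg: ShareConfig): string {
  return `mount -t 9p -o trans=virtio,version=9p2000.L ${cfg.mountTag} ${cfg.guestDir}`;
}

// Runner script placed in the share: compile the reproducer, run it, and write
// every artifact back into the share so the host can collect it after shutdown.
function buildGuestRunner(cfg: ShareConfig, reproSource: string): string {
  const d = cfg.guestDir;
  const lines = [
    "#!/bin/sh",
    `gcc -o /tmp/repro ${d}/${reproSource} -lpthread > ${d}/compile.log 2>&1`,
    `echo $? > ${d}/compile.exit`,
    `/tmp/repro > ${d}/run.log 2>&1`,
    `echo $? > ${d}/run.exit`,
    `dmesg > ${d}/dmesg.log`,
    "sync",
  ];
  return lines.join("\n") + "\n";
}
```

Because the exit codes and logs are written as plain files in the share, the host never needs a network channel into the guest; it only reads the directory once QEMU exits.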
Create a dedicated GitHub Actions workflow that builds the kernel VM artifacts, boots QEMU, and runs ingest --verify against a real syzbot crash/reproducer pair. The workflow keeps preserved VM artifacts for debugging, and the runner now supports an optional artifact directory so CI can upload serial logs, compile logs, and dmesg instead of deleting them. The shell harness also normalizes the current CLI JSON output and asserts the transport actually reproduced a real guest run.

Constraint: We need a real QEMU-backed CI proof path without making every unrelated push pay for a heavyweight kernel VM build
Rejected: Fold the VM run into the default CI job | too expensive and too broad for unrelated changes
Rejected: Assert verified=true today | current matcher still mis-scores noisy real outputs even when the VM transport succeeds
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep this workflow path-scoped or manual unless signature matching becomes stable enough to justify a broader gate
Tested: pnpm build; local execution of scripts/kernel-validator-e2e.sh against the rebuilt VM artifacts; reproduced=true on real syzbot sample
Not-tested: GitHub-hosted runner wall-clock for the full workflow end-to-end
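
The distinction between asserting `reproduced` and asserting `verified` can be sketched as follows. The actual harness is a shell script and the real CLI JSON shape is not shown in this PR, so the `VerifyResult` type, the optional `result` wrapper, and both function names are assumptions made for illustration.

```typescript
// Illustrative sketch only: the JSON shape and all names are assumptions.
interface VerifyResult {
  reproduced: boolean; // the guest actually compiled and ran the reproducer
  verified: boolean;   // the crash signature matched (not asserted by the gate)
}

// Tolerate log lines before the JSON payload and an optional "result" wrapper.
function normalizeVerifyOutput(raw: string): VerifyResult {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in CLI output");
  }
  const top = JSON.parse(raw.slice(start, end + 1)) as Record<string, unknown>;
  const wrapped = top["result"];
  const inner =
    typeof wrapped === "object" && wrapped !== null
      ? (wrapped as Record<string, unknown>)
      : top;
  return {
    reproduced: inner["reproduced"] === true,
    verified: inner["verified"] === true,
  };
}

// Gate on transport success only; signature matching is deliberately not asserted.
function assertTransportReproduced(result: VerifyResult): void {
  if (!result.reproduced) {
    throw new Error("guest run was not reproduced; check serial and compile logs");
  }
}
```

Gating only on `reproduced` keeps the workflow a proof of the VM transport, consistent with the rejected alternative of asserting `verified=true` while the matcher still mis-scores noisy outputs.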
The repository docs still described the kernel validator as SSH-first even after the transport was replaced with a host-shared QEMU execution path. This pass updates the root README, kernel VM README, builder script comments, and Dockerfile header comments so the documented behavior matches the actual guest boot flow and CI lane.

Constraint: The branch now includes a real QEMU guest runner and a dedicated CI workflow, so stale SSH wording would mislead anyone trying to use or review it
Rejected: Leave the old SSH wording in place until later | would make the new workflow and docs contradict each other immediately
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If SSH is ever removed from the rootfs entirely, also delete the remaining exported key artifacts rather than leaving dead outputs around
Tested: Prior pnpm build still applies; docs/comment-only pass
Not-tested: Fresh docs site build after this wording-only cleanup
The kernel-validator workflow was spending most of its time rebuilding the VM artifact bundle on every PR run, even when the kernel VM inputs had not changed. Restore and save the exported bzImage/rootfs bundle via actions/cache keyed on the kernel VM Dockerfile and build script. This keeps the real QEMU E2E gate intact while making repeat PR runs and reruns materially faster.

Constraint: The workflow still needs to exercise a real guest boot, but repeated full kernel builds on every rerun are too expensive for normal iteration
Rejected: Drop the artifact build step entirely | would remove the guarantee that the tested VM inputs match the branch contents
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If the artifact cache proves too large or flaky, switch to a dedicated artifact-producing workflow instead of broadening the default CI path
Tested: Workflow YAML inspection; cache key/path wiring review
Not-tested: End-to-end GitHub cache hit on a second run yet
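
The idea behind keying the cache on the Dockerfile and build script is that the key changes exactly when one of those inputs changes. The sketch below illustrates that property only. It is not how actions/cache or GitHub's `hashFiles` expression computes keys, it swaps in a small non-cryptographic FNV-1a-style hash to stay dependency-free, and `cacheKey` plus the sample file names are hypothetical.

```typescript
// Illustrative sketch only: a content-derived cache key. Real workflows would
// rely on GitHub's hashFiles(); this hash is non-cryptographic by design.
function fnv1a(text: string, seed: number = 0x811c9dc5): number {
  let h = seed >>> 0;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);            // mix in one UTF-16 code unit
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the 32-bit FNV prime
  }
  return h;
}

// Derive a key from named input contents. Entries are sorted by name so the
// key does not depend on the order in which inputs are listed.
function cacheKey(prefix: string, inputs: Record<string, string>): string {
  const entries = Object.entries(inputs).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0,
  );
  let h = 0x811c9dc5;
  for (const [name, content] of entries) {
    h = fnv1a(`${name}\0${content}\0`, h);
  }
  return `${prefix}-${h.toString(16).padStart(8, "0")}`;
}
```

Any edit to either input yields a new key and therefore a cache miss and a fresh artifact build, which is what preserves the guarantee that the tested VM inputs match the branch contents.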
GitHub rejected the cache-enabled workflow before job creation because runner.temp was referenced from jobs.<job>.env, where the runner context is not available. Move those paths back to step-local expressions and shell variables so the workflow can instantiate normally while keeping the cache behavior intact.

Constraint: GitHub Actions context availability is narrower in job-level env than in step inputs
Rejected: Keep the job-level env shortcut | workflow never created any jobs on GitHub
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Run actionlint on workflow edits before pushing when using less-common context placements
Tested: actionlint on kernel-validator-e2e.yml
Not-tested: New GitHub run after this fix yet
The crash matcher was over-weighting generic reporting frames like print_report and dump_stack, which made real KASAN outputs look less similar than they are. Filter those generic frames out of the stack-frame comparison and treat invalid-free as an acceptable match for the current double-free bucket. This improves the matcher without changing the transport or VM runner behavior.

Constraint: Real guest logs often include long KASAN/UBSAN reporting prologues before the interesting fault path
Rejected: Keep scoring the first raw stack frames verbatim | biased the oracle toward reporting machinery instead of the faulting path
Confidence: medium
Scope-risk: narrow
Reversibility: clean
Directive: If we later split invalid-free into its own taxonomy bucket, revisit the double-free pattern alias here
Tested: pnpm --filter @pwnkit/core test -- kernel-oracle.test.ts; pnpm build
Not-tested: Improvement to real-world crashMatch rate beyond the exercised invalid-free sample
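
A minimal sketch of the filtering idea, assuming frames arrive as `symbol+0xoff/0xsize` strings. This is not the matcher in @pwnkit/core: the function names, the Jaccard scoring, and the extra entries in the generic-frame set (`dump_stack_lvl`, `kasan_report`) are assumptions. Only `print_report`, `dump_stack`, and the invalid-free alias come from the commit message.

```typescript
// Illustrative sketch only: names and scoring are assumptions, not the real oracle.
const GENERIC_FRAMES = new Set<string>([
  "print_report",   // named in the commit message
  "dump_stack",     // named in the commit message
  "dump_stack_lvl", // assumed: common reporting helper
  "kasan_report",   // assumed: common reporting helper
]);

// Strip the "+0x1b/0x30" offset suffix so only the symbol name is compared.
function normalizeFrame(frame: string): string {
  return frame.trim().replace(/\+0x[0-9a-f]+\/0x[0-9a-f]+.*$/i, "");
}

// Drop reporting machinery so the score reflects the faulting path.
function meaningfulFrames(frames: string[]): string[] {
  return frames
    .map(normalizeFrame)
    .filter((f) => f.length > 0 && !GENERIC_FRAMES.has(f));
}

// Jaccard similarity over the remaining symbols (one possible scoring choice).
function frameSimilarity(expected: string[], observed: string[]): number {
  const a = new Set(meaningfulFrames(expected));
  const b = new Set(meaningfulFrames(observed));
  if (a.size === 0 || b.size === 0) return 0;
  const shared = Array.from(a).filter((f) => b.has(f)).length;
  return shared / (a.size + b.size - shared);
}

// Treat invalid-free as an alias of the current double-free bucket.
function sameBugClass(expected: string, observed: string): boolean {
  const canon = (c: string): string => (c === "invalid-free" ? "double-free" : c);
  return canon(expected) === canon(observed);
}
```

With the reporting frames removed, two traces that share the same faulting path score as identical even when one of them begins with a long KASAN prologue.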
Summary
Verification
Notes