research: joint paper plan — agentic exploitation as a labeling oracle

## What

Tracking issue for the joint research paper between Doruk (pwnkit) and Guanni Qu (VulnBERT, Pebblebed Research Resident). Working title:

> **Agentic exploitation as a labeling oracle for vulnerability triage models**

## The thesis

The training data for a high-precision vulnerability triage classifier only exists if you have BOTH:

1. an agentic exploit harness running at scale (pwnkit's role) that produces `(finding, attempted-PoC, real-or-fake)` tuples
2. a supervised training pipeline (VulnBERT-class hybrid features-plus-embeddings architecture) that can consume those tuples and produce a small specialized model

Neither half is useful without the other: the harness's labels have no consumer, and the classifier has no training data. The dataset itself, not the model, is the moat. This is the inverse of the standard "dataset is given, model is the asset" framing in security ML.
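
To make the hand-off between the two halves concrete, here is a minimal sketch of how one harness tuple could become one supervised training example. The type and field names (`OracleTuple`, `attemptedPoc`, `toLabeledExample`) are assumptions for illustration, not pwnkit's actual schema.

```typescript
// Hypothetical shape of one harness output. Field names are assumptions,
// not pwnkit's actual schema.
type Verdict = "real" | "fake";

interface OracleTuple {
  finding: string;      // the scanner's claim, as free text
  attemptedPoc: string; // the exploit the agent tried
  verdict: Verdict;     // whether the attempted PoC confirmed the finding
}

interface LabeledExample {
  text: string;
  label: 0 | 1; // 1 = real vulnerability, 0 = false positive
}

// The exploit attempt acts as the labeling oracle: its outcome becomes the label.
function toLabeledExample(t: OracleTuple): LabeledExample {
  return {
    text: `${t.finding}\n---\n${t.attemptedPoc}`,
    label: t.verdict === "real" ? 1 : 0,
  };
}

// Made-up sample tuple, for illustration only.
const sample: OracleTuple = {
  finding: "Prototype pollution via merge() in a hypothetical package",
  attemptedPoc: "merge({}, JSON.parse('{\"__proto__\":{\"polluted\":true}}'))",
  verdict: "real",
};

const labeled = toLabeledExample(sample);
```

The point of the sketch is that the label is never hand-assigned: it falls out of whether the exploit attempt succeeded.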

## What's already shipped

- [x] `triage-data-collector.ts` produces JSONL with text + 45-feature vector + label + provenance (commit 5bc8503)
- [x] `feature-extractor.ts` computes a 45-feature handcrafted vector explicitly inspired by VulnBERT's hybrid architecture (the file docstring says so)
- [x] npm-bench preserves raw findings so the data can be collected post-hoc (commit 5bc8503)
- [x] 18 vitest tests covering the collector and feature extraction (commit 5bc8503)
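
For readers who have not opened the collector, the following is a rough sketch of what one JSONL record and a defensive reader for it might look like. Only the four top-level parts (text, 45-feature vector, label, provenance) and the `label_source` key come from this issue; the exact property names and the `parseRecord` / `parseJsonl` / `toJsonl` helpers are hypothetical.

```typescript
const FEATURE_COUNT = 45; // vector width stated in this issue

// Assumed record layout; the real collector's property names may differ.
interface TriageRecord {
  text: string;
  features: number[];
  label: 0 | 1;
  provenance: { label_source: string };
}

// Parse one JSONL line and reject records that would corrupt training.
function parseRecord(line: string): TriageRecord {
  const raw = JSON.parse(line) as Partial<TriageRecord>;
  if (typeof raw.text !== "string") throw new Error("missing text");
  if (!Array.isArray(raw.features) || raw.features.length !== FEATURE_COUNT) {
    throw new Error(`expected ${FEATURE_COUNT} features`);
  }
  if (!raw.features.every((v) => typeof v === "number" && Number.isFinite(v))) {
    throw new Error("features must be finite numbers");
  }
  if (raw.label !== 0 && raw.label !== 1) throw new Error("label must be 0 or 1");
  if (typeof raw.provenance?.label_source !== "string") {
    throw new Error("missing provenance.label_source");
  }
  return raw as TriageRecord;
}

// JSONL is one JSON document per line; blank lines are skipped.
function parseJsonl(jsonl: string): TriageRecord[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => parseRecord(line));
}

function toJsonl(records: TriageRecord[]): string {
  return records.map((r) => JSON.stringify(r)).join("\n") + "\n";
}
```

Validating the vector width at read time is a cheap guard against silently mixing records produced by different versions of the feature extractor.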

## What's still needed for the paper

- [ ] Grow npm-bench from 81 → ~250 packages so the labeled set is publication-scale
- [ ] First full dataset dump committed under `packages/benchmark/results/triage-dataset-v1.jsonl` (or shipped to Hugging Face)
- [ ] Baseline classifier (a CodeBERT-scale text model, or XGBoost on the 45-feature vector alone) trained and reported
- [ ] Hybrid classifier (handcrafted features + neural embeddings, VulnBERT architecture) trained and reported
- [ ] Joint pipeline: pwnkit raw → classifier → triaged finding, benchmarked on `npm-bench` against pwnkit alone (currently F1 0.444)
- [ ] arXiv preprint (cs.CR)
- [ ] Hugging Face: dataset card + model card
- [ ] Blog post on pwnkit.com / opensoar.app
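
The joint-pipeline comparison in the checklist above comes down to recomputing F1 after the classifier filters pwnkit's raw findings. Below is a small sketch of that arithmetic, assuming each finding carries a classifier score and a ground-truth flag. The toy data, the `missed` count, and the thresholds are invented for illustration and are unrelated to the real npm-bench numbers.

```typescript
interface ScoredFinding {
  isReal: boolean; // ground truth from the benchmark
  score: number;   // hypothetical classifier confidence that the finding is real
}

interface Counts {
  tp: number;
  fp: number;
  fn: number;
}

// F1 = 2TP / (2TP + FP + FN); defined as 0 when there is nothing to count.
function f1(c: Counts): number {
  const denom = 2 * c.tp + c.fp + c.fn;
  return denom === 0 ? 0 : (2 * c.tp) / denom;
}

// Keep findings scoring at or above the threshold. `missed` is the number of
// real vulnerabilities the harness never reported, which no filter can recover.
function countsAt(findings: ScoredFinding[], missed: number, threshold: number): Counts {
  let tp = 0;
  let fp = 0;
  let fn = missed;
  for (const f of findings) {
    const kept = f.score >= threshold;
    if (kept && f.isReal) tp += 1;
    else if (kept && !f.isReal) fp += 1;
    else if (!kept && f.isReal) fn += 1; // triage dropped a real finding
  }
  return { tp, fp, fn };
}

// Toy data, not benchmark results.
const toy: ScoredFinding[] = [
  { isReal: true, score: 0.9 },
  { isReal: true, score: 0.4 },
  { isReal: false, score: 0.8 },
  { isReal: false, score: 0.2 },
  { isReal: false, score: 0.1 },
];

const harnessAlone = f1(countsAt(toy, 1, -Infinity)); // no filtering
const lenientTriage = f1(countsAt(toy, 1, 0.3));
const strictTriage = f1(countsAt(toy, 1, 0.5));
```

On this toy data the lenient threshold raises F1 while the strict one lowers it below the unfiltered harness, because it discards a real finding. That is why the joint pipeline has to be benchmarked end to end rather than judged by classifier precision alone.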

## Honest negative results to include

- Handcrafted features designed for web-exploit findings (SQL errors, payload reflection, stack traces) are mostly zero on npm supply-chain findings — a publishable insight on domain transferability
- `label_source: package_verdict` is coarser than per-finding labels and mislabels some legitimate findings on safe packages as false positives; quantify the noise floor
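
Both negative results can be reported as simple ratios. The sketch below shows one way to compute them: the fraction of zero values per feature column, and the disagreement rate between package-level and per-finding labels. It assumes a manually audited subset with per-finding labels exists, which this issue does not claim; all names and data here are hypothetical.

```typescript
// Fraction of rows in which each feature is exactly zero.
// A column near 1.0 carries almost no signal on this domain.
function zeroFractionPerFeature(rows: number[][]): number[] {
  if (rows.length === 0) return [];
  const width = rows[0].length;
  const zeros = new Array<number>(width).fill(0);
  for (const row of rows) {
    for (let j = 0; j < width; j++) {
      if (row[j] === 0) zeros[j] += 1;
    }
  }
  return zeros.map((z) => z / rows.length);
}

// One finding with both the coarse label and a hand-checked label.
interface AuditedFinding {
  packageVerdictLabel: 0 | 1; // inherited from the package-level verdict
  perFindingLabel: 0 | 1;     // assigned by manual review (assumed to exist)
}

// Share of audited findings where the coarse label disagrees with the audit.
function labelNoiseRate(audited: AuditedFinding[]): number {
  if (audited.length === 0) return 0;
  const disagreements = audited.filter(
    (a) => a.packageVerdictLabel !== a.perFindingLabel,
  ).length;
  return disagreements / audited.length;
}

// Toy inputs with 3 features instead of 45, for illustration only.
const toyRows = [
  [0, 1, 0],
  [0, 0, 0],
  [0, 2, 0],
  [0, 0, 1],
];
const sparsity = zeroFractionPerFeature(toyRows);

const toyAudit: AuditedFinding[] = [
  { packageVerdictLabel: 0, perFindingLabel: 0 },
  { packageVerdictLabel: 0, perFindingLabel: 1 }, // legitimate finding on a safe package
  { packageVerdictLabel: 1, perFindingLabel: 1 },
  { packageVerdictLabel: 1, perFindingLabel: 1 },
];
const noise = labelNoiseRate(toyAudit);
```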

## Sequencing

This issue intentionally has no deadline. The work is the moat and the research direction is the long game.

## Why this issue exists

So the work is in the open and not in someone's notebook. Discoverable, citable, mergeable.
