| Version | v4.0 |
| Date | 2026-03-30 |
| Author | PAN CHAO |
| Contact | u3638376@connect.hku.hk |
System Architecture
Full system architecture (diagrams, data flow, deployment) is maintained in a separate document; Section 5 of this PRD contains only an architecture summary and index.
History
- v4.0: SSDLC + LangGraph. Full SSDLC lifecycle support (6 stages); LangChain/LangGraph as the orchestration engine; stage-specific skills and assessment flows. Pivoted to full-phase support with phase-specific SSDLC agents.
- v3.1: Performance & quality. Graph RAG, Docling parser, async pipeline, parallel orchestration, input guardrails, singleton KB, cached LLM.
- v3.0: Headless pivot. Removed the Streamlit frontend; pure API + MCP service.
- v2.0: Major upgrade. Added multi-agent orchestration, human-in-the-loop workflow, skill/persona management, and one-click deployment.
- v1.4: Split the PRD and the System Architecture document.
- v1.3: Added the non-functional "Security Requirements and Controls" section.
- v1.2: KB multi-format upload and open-source parsing; parser reuse.
- v1.1: Enterprise integration (ServiceNow), IAM (AAD/SSO, RBAC), deployment and connectivity.
This PRD is for the open-source "DocSentinel" project. It defines business pain points, the solution approach, system architecture, and product scope, serving as a single source of truth for subsequent design and development. The project aims to build an AI-powered SSDLC (Secure Software Development Lifecycle) platform that automates security activities across all six phases of the software development lifecycle, from requirements gathering to production operations. It automates the review of security-related documents, forms, and reports and the generation of recommendations, reduces the burden on enterprise security teams, and supports integration with mainstream and local LLMs, multi-format file parsing, and extensible Skills and knowledge bases. Powered by LangChain and LangGraph for intelligent agent orchestration, it helps enterprise security teams embed security into every stage of delivery, not just the final review.
Enterprise Cyber Security teams operate under multiple constraints:
- Diverse reference sources: Internal security policies, industry best practices (e.g. NIST SSDF, OWASP, CISA), past project cases, and compliance frameworks (e.g. SOC2, ISO 27001, PCI DSS).
- Full SSDLC coverage: Security review and control requirements exist at every stage — requirements/design, development, testing, deployment, and operations — but most tools only address one or two stages.
- Wide variety of deliverables: Security questionnaires, threat models, architecture documents, secure coding guidelines, SAST/DAST reports, penetration test findings, deployment checklists, compliance evidence, and audit materials all require manual reading, comparison, and sign-off.
- Shift-left pressure: Modern DevSecOps demands security involvement early in the lifecycle, but security teams lack tooling to scale across requirements, design, and development phases.
In agile and DevOps environments, enterprises ship dozens to hundreds of projects per year. Security teams must complete large volumes of assessments and reviews with limited headcount, creating a clear bottleneck — especially when coverage is expected across the entire SSDLC, not just pre-release reviews.
| Pain Point | Description |
|---|---|
| Fragmented SSDLC coverage | Most tools cover only testing/deployment; the requirements, design, and development phases lack automated security support. |
| Fragmented assessment criteria | Teams must align with policies, industry standards, and project precedents; manual lookup and alignment costs are high. |
| No unified threat modeling | Threat models are created ad hoc in the design phase; there is no automated STRIDE/DREAD analysis and no carry-forward to testing. |
| Heavy questionnaire workflow | Multiple rounds of questionnaire filling, assessment, evidence collection, and review; templates are inconsistent and evidence quality varies. |
| Development-phase control relies on people | Secure coding guidance, SAST result interpretation, policy definition, and exception approval still depend on security staff and are hard to scale. |
| Pre-release review pressure | Security must review every file and sign off; DAST/pentest reports need interpretation; technical documents are costly for non-specialists to read. |
| Post-deployment blind spots | Vulnerability monitoring, incident response, and patch tracking are disconnected from the development lifecycle. |
| Scale vs. consistency | Manual assessment tends to be inconsistent, incomplete, or delayed; reusable assessment patterns are hard to institutionalize. |
| SSDLC coverage gaps | Security involvement is unevenly distributed across the SSDLC; requirements and design often get less scrutiny than pre-release review, letting risks accumulate until release. |
- Full lifecycle coverage: Provide AI-assisted, stage-aware security support across all six SSDLC phases (requirements, design, development, testing, deployment, and operations) with stage-specific skills and checklists, not just testing and deployment.
- Automation: Automate analysis and initial assessment of security artifacts at each phase, from requirements to operations.
- Consistency: Produce consistent assessment conclusions and remediation recommendations based on a unified knowledge base and policies.
- Intelligence: Use LangGraph-orchestrated agents to reason about cross-phase dependencies (e.g., a threat identified in design must be tested and monitored).
- Extensibility: Support custom SSDLC workflows, assessment scenarios, phase-specific skills, and different compliance frameworks and customer/project types.
Build an AI-powered SSDLC platform for security teams, with the primary focus on automating security activities and assessment of all forms, documents, and reports across the entire secure software development lifecycle. After security staff submit project-related files to the Agent, the platform:
- Parses multi-format files: Convert Word, PDF, Excel, PPT, SAST/DAST reports, images, etc. into an intermediate format (e.g. JSON/Markdown).
- Uses knowledge base and policy: Rely on built-in or configurable compliance and policy knowledge to understand "what standards must be met."
- SSDLC-aware assessment: Automatically determine or accept the SSDLC stage and apply stage-specific assessment logic, checklists, and risk focus.
- Performs risk assessment and recommendations: Identify security/compliance risks and provide security advice and actionable remediation.
- Produces structured output: Enable security staff to quickly review, sign off, or hand off to business/development for remediation.
The platform covers six standard SSDLC phases with dedicated AI agents for each:
- Requirements Phase Agent: Analyze requirements documents to identify security requirements, compliance obligations (GDPR, PCI DSS, etc.), and perform initial risk analysis.
- Design Phase Agent: Review architecture/design documents, perform automated threat modeling (STRIDE/DREAD), evaluate security architecture, encryption schemes, and access control designs. Conduct Security Design Review (SDR).
- Development Phase Agent: Assess code against secure coding standards, review SAST findings, evaluate security controls (anti-injection, XSS prevention), and provide secure coding guidance.
- Testing Phase Agent: Analyze SAST/DAST scan reports, interpret penetration test results, prioritize vulnerability fixes, and verify remediation completeness.
- Deployment Phase Agent: Review deployment configurations, evaluate secret management, assess hardening measures, and perform pre-release security sign-off checks.
- Operations Phase Agent: Monitor vulnerability feeds, assist incident response, track patch management, and audit security logs.
The platform uses LangGraph to orchestrate these agents into configurable workflows — agents can run sequentially, in parallel, or conditionally based on project context. LangChain provides the unified LLM abstraction, tool integration, and RAG pipeline.
| Requirements Phase Activity | Agent Capability |
|---|---|
| Define security requirements | Extract security-relevant requirements from PRDs, user stories, and BRDs |
| Identify compliance obligations | Match requirements against GDPR, PCI DSS, SOC2, ISO 27001, etc. |
| Initial risk analysis | Classify project risk level based on data sensitivity, exposure, and scope |
| Security requirements checklist | Generate a checklist of security requirements that must be addressed |

| Design Phase Activity | Agent Capability |
|---|---|
| Security architecture review | Evaluate architecture documents for security patterns and anti-patterns |
| Threat modeling (STRIDE/DREAD) | Automated STRIDE analysis of design documents; DREAD risk scoring |
| Access control & encryption design | Review IAM design, data-flow encryption, and key management proposals |
| Security Design Review (SDR) | Structured SDR report with findings and recommendations |

| Development Phase Activity | Agent Capability |
|---|---|
| Secure coding standards assessment | Check code/documents against OWASP Secure Coding Practices |
| SAST findings review | Triage and interpret SAST tool output; reduce false positives |
| Built-in security controls | Evaluate anti-injection, XSS prevention, and CSRF protection implementations |
| Secure coding guidance | Provide language-specific secure coding recommendations |

| Testing Phase Activity | Agent Capability |
|---|---|
| SAST report analysis | Parse and prioritize static analysis findings |
| DAST report analysis | Parse and interpret dynamic scan results |
| Penetration test review | Analyze pentest reports; map findings to controls |
| Vulnerability fix verification | Verify remediation evidence against original findings |

| Deployment Phase Activity | Agent Capability |
|---|---|
| Pre-release security review | Checklist-based review of all phase outputs |
| Configuration security | Review deployment configs, secrets management, and least privilege |
| Security hardening assessment | Evaluate server/container hardening against CIS benchmarks |
| Release sign-off | Generate a structured sign-off report with a risk summary |

| Operations Phase Activity | Agent Capability |
|---|---|
| Vulnerability monitoring | Analyze CVE feeds and vulnerability advisories against the project stack |
| Incident response assistance | Provide structured incident analysis and response recommendations |
| Patch management tracking | Track vulnerability remediation progress and SLA compliance |
| Log audit analysis | Analyze security logs for anomalies and compliance evidence |
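As an illustration of the automated threat modeling capability listed above, STRIDE candidate enumeration over data-flow-diagram elements might start from a table like the one below. The element-type mapping follows common STRIDE-per-element practice; the function and element names are illustrative, not the product's actual logic:

```python
STRIDE = {
    "S": "Spoofing", "T": "Tampering", "R": "Repudiation",
    "I": "Information Disclosure", "D": "Denial of Service",
    "E": "Elevation of Privilege",
}

# Typical STRIDE-per-element applicability (assumption: standard mapping).
APPLICABLE = {
    "external_entity": ["S", "R"],
    "process": ["S", "T", "R", "I", "D", "E"],
    "data_store": ["T", "R", "I", "D"],
    "data_flow": ["T", "I", "D"],
}

def stride_candidates(elements):
    """Enumerate per-element STRIDE threat candidates from parsed DFD elements."""
    threats = []
    for name, etype in elements:
        for code in APPLICABLE.get(etype, []):
            threats.append({"element": name, "category": STRIDE[code]})
    return threats

# Hypothetical elements extracted from an architecture document.
dfd = [("User", "external_entity"), ("API Gateway", "process"), ("Orders DB", "data_store")]
candidates = stride_candidates(dfd)
```

The Design Agent would then ask the LLM to assess each candidate for relevance and score it (e.g. DREAD), rather than emitting the raw enumeration.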
- Full lifecycle: Security coverage from day one (requirements) through production operations, not just pre-release review.
- Cost reduction: Reduce the time security staff spend on repetitive document review across all SSDLC phases.
- Speed: Shorten the cycle time at each phase; enable security review to run in parallel with development.
- Intelligence: LangGraph-orchestrated agents maintain cross-phase context; a threat identified in design is automatically tracked through testing and deployment.
- Reproducibility: Assessment logic and criteria are captured in the knowledge base, skills, and graph-based workflows.
- Openness: Support multiple commercial and local LLMs to meet data residency and cost-control requirements.
| Goal | Description |
|---|---|
| SSDLC full coverage | Provide AI-assisted security assessment across all 6 SSDLC phases, with a dedicated agent, stage-specific skills, checklists, and flows for each. |
| Intelligent orchestration | Use LangGraph to create configurable, stateful agent workflows that maintain context across SSDLC phases. |
| Automated assessment | Support automatic parsing and risk assessment of common formats: security questionnaires, design documents, SAST/DAST reports, pentest findings, deployment configs, compliance evidence, and audit reports. |
| Configurable scenarios | Use the knowledge base and Skills to configure assessment criteria by compliance framework, SSDLC phase, project type, customer type, or risk level. |
| Multi-model support | Support mainstream commercial LLMs (e.g. ChatGPT, Qwen, Claude) and local/on-prem models (e.g. Ollama) through a unified LangChain interface. |
| Actionable results | Output risk items, compliance gaps, threat models, remediation suggestions, and sign-off reports with traceability across phases. |
- SSDLC coverage: Number of SSDLC phases with active agent support (target: 6/6).
- Document coverage: Number of supported document types (e.g. 8+ common formats) and knowledge base entries per phase.
- Efficiency: Average time from upload to report generation per phase; time saved vs. manual review.
- Cross-phase traceability: Percentage of findings tracked from identification to remediation across phases.
- Usability: Steps and time to complete one "upload → assess → review → sign-off" loop per phase.
- Extensibility: Configuration/development cost to add a new SSDLC phase workflow or assessment scenario.
Full Architecture Document
For detailed component descriptions, Mermaid architecture diagrams, data-flow and sequence diagrams, integration views, security architecture, and deployment views, see:
The system uses a layered design: Access (REST API / MCP Server / CLI) → SSDLC Orchestration (LangGraph state machine with phase-specific agents, SSDLC Pipeline, Memory, Skills) → Core Services (Knowledge Base RAG, Parser) → LLM Abstraction (LangChain) → Cloud/Local LLMs. The orchestrator is built on LangChain + LangGraph, enabling stateful, graph-based agent workflows with conditional branching per SSDLC stage. Optional integrations: AAD (identity/SSO), ServiceNow (project metadata), and SAST/DAST tools (scan results ingestion).
High-Level Diagram
┌─────────────────────────────────────────────────────────┐
│                  User / Security Staff                  │
└───────────────────────────┬─────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────┐
│             Access Layer (API / MCP / CLI)              │
└───────────────────────────────────────────┬─────────────┘
                                            │
┌───────────────────────────────────────────▼───────────────────────────────────────────┐
│                            SSDLC Orchestration (LangGraph)                            │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ Require- │  │  Design  │  │   Dev    │  │   Test   │  │  Deploy  │  │   Ops    │   │
│  │  ments   │  │  Agent   │  │  Agent   │  │  Agent   │  │  Agent   │  │  Agent   │   │
│  │  Agent   │  │          │  │          │  │          │  │          │  │          │   │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
│       └─────────────┴─────────────┴───────┬─────┴─────────────┴─────────────┘         │
│                                           │                                           │
│  ┌─────────────┐  ┌─────────────┐         │         ┌─────────────┐  ┌─────────────┐  │
│  │   Memory    │  │   Skills    │         │         │  KB (RAG)   │  │   Parser    │  │
│  └─────────────┘  └─────────────┘         │         └─────────────┘  └─────────────┘  │
│                               ┌───────────▼───────────┐                               │
│                               │ LLM Abstraction Layer │                               │
│                               │      (LangChain)      │                               │
│                               └───────────┬───────────┘                               │
└───────────────────────────────────────────┼───────────────────────────────────────────┘
                                            │
┌───────────────────────────────────────────▼───────────────────────────────────────────┐
│          Commercial/Cloud LLM             │             Local/On-prem LLM             │
│        ChatGPT / Claude / Qwen            │             Ollama / vLLM / ...           │
└───────────────────────────────────────────┴───────────────────────────────────────────┘
| Component | Role | Details |
|---|---|---|
| SSDLC Orchestrator | LangGraph state machine coordinating phase agents with conditional routing and shared state. | ARCHITECTURE.md § Component Design |
| SSDLC Pipeline | Stage-aware routing (6 stages); selects stage-specific skills and checklists. | ARCHITECTURE.md § SSDLC Pipeline |
| Phase Agents | Six dedicated agents, each with phase-specific prompts, tools, and evaluation criteria. | ARCHITECTURE.md § SSDLC Agents |
| Memory | Manages working, episodic, cross-phase, and semantic state via LangGraph checkpointing. | ARCHITECTURE.md § Component Design |
| Skills | Reusable assessment capabilities (e.g. threat modeling, SAST triage, compliance check, SSDLC stage skills). | ARCHITECTURE.md § Component Design |
| Knowledge Base | Multi-format ingestion, chunking, embedding, hybrid RAG (vector + graph). | ARCHITECTURE.md § Component Design |
| Parser | Converts files (PDF, Word, Excel, SAST/DAST reports, etc.) to Markdown/JSON. | ARCHITECTURE.md § Component Design |
| LLM Abstraction | LangChain unified interface for model switching. | ARCHITECTURE.md § Component Design |
| Integrations | AAD (SSO), ServiceNow (metadata), SAST/DAST tool connectors. | ARCHITECTURE.md § Integration Points |
- User submits an SSDLC assessment task (files + phase + optional SSDLC stage / skill ID / project/scenario) via API/MCP. The API returns a `task_id` immediately (non-blocking).
- (Optional) Fetch project metadata from ServiceNow.
- Parser converts files to an intermediate Markdown/text format (Docling or legacy).
- SSDLC Router determines the lifecycle stage (auto-detected or user-specified) and selects the stage-specific skill and checklist.
- LangGraph Orchestrator routes to the appropriate Phase Agent(s). Policy+History and Evidence nodes run in parallel, followed by Drafter and Reviewer nodes.
- Phase Agent(s) load Knowledge Base chunks (RAG) and Skills, then call the LLM with context.
- Generate a structured assessment report (risks, gaps, threat model, remediations, confidence, sources, SSDLC stage) with cross-phase traceability.
- Results are stored for human-in-the-loop review and sign-off. The user polls `GET /assessments/{task_id}` to retrieve the completed report.
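The non-blocking submit-then-poll contract in the flow above can be sketched as follows. Function names, the in-memory task store, and report fields are all illustrative assumptions, standing in for the real API handlers and persistence layer:

```python
import uuid

# In-memory task store (assumption); a real service would persist tasks.
TASKS = {}  # task_id -> {"status": ..., "report": ...}

def submit_assessment(files, stage=None):
    """POST /assessments equivalent: enqueue and return a task_id immediately."""
    task_id = str(uuid.uuid4())
    TASKS[task_id] = {"status": "pending", "report": None}
    return task_id

def run_pipeline(task_id, stage="design"):
    """Worker side (stub): parse -> route -> assess -> store the report."""
    TASKS[task_id] = {"status": "completed",
                      "report": {"stage": stage, "risks": [], "gaps": []}}

def get_assessment(task_id):
    """GET /assessments/{task_id} equivalent: return current status/report."""
    return TASKS[task_id]

tid = submit_assessment(files=["architecture.pdf"])
assert get_assessment(tid)["status"] == "pending"  # returned before the work is done
run_pipeline(tid)
```

In the real system the worker step would be the async LangGraph pipeline, with FastAPI handling the two endpoints.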
| Module | Feature | Priority |
|---|---|---|
| SSDLC Orchestrator | LangGraph-based state machine with 6 phase agents and conditional routing. | P0 |
| SSDLC Orchestrator | Cross-phase state management and finding traceability. | P0 |
| SSDLC Orchestrator | Configurable workflows: sequential, parallel, or selective phase execution. | P1 |
| Requirements Agent | Analyze requirements docs for security requirements and compliance obligations. | P0 |
| Design Agent | Automated threat modeling (STRIDE/DREAD) from architecture documents. | P0 |
| Design Agent | Security Design Review (SDR) report generation. | P0 |
| Design Agent | Threat modeling integration: support PyTM exports and Mermaid.js diagrams to help the agent “see” architecture, data flows, and trust boundaries. | P1 |
| Development Agent | Secure coding assessment against OWASP standards. | P0 |
| Development Agent | SAST findings triage and interpretation. | P1 |
| Testing Agent | SAST/DAST report parsing and vulnerability prioritization. | P0 |
| Testing Agent | Penetration test report analysis and remediation tracking. | P1 |
| Deployment Agent | Pre-release security checklist and configuration review. | P0 |
| Deployment Agent | CIS benchmark assessment for hardening. | P1 |
| Operations Agent | Vulnerability monitoring and CVE analysis against project stack. | P1 |
| Operations Agent | Incident response assistance and log audit. | P2 |
| SSDLC | Auto-detect stage from document content or accept explicit stage parameter. | P1 |
| Parser | Upload Word / PDF / Excel / PPT / SAST/DAST reports and convert to JSON/Markdown. | P0 |
| Parser | OCR / Vision support for images. | P1 |
| Parser | Ingest architecture diagrams as text inputs (e.g. Mermaid.js .mmd) for Design-stage reviews. | P1 |
| Knowledge Base | Upload multi-format docs, parse, chunk, embed, and retrieve (RAG). | P0 |
| Knowledge Base | Metadata filtering (e.g. by framework, SSDLC phase, project, customer). | P1 |
| Knowledge Base | Phase-specific knowledge collections (requirements policies, design patterns, coding standards, etc.). | P0 |
| Knowledge Base | Graph RAG: Map relationships across internal policies and controls for deeper compliance insights. | P1 |
| Assessment | Select SSDLC phase and scenario, upload files, trigger assessment. | P0 |
| Assessment | Output structured report (Risks, Gaps, Threat Model, Remediation, Confidence). | P0 |
| Assessment | Human-in-the-Loop: Review, approve, reject, comment workflow. | P0 |
| Assessment | HITL feedback learning: allow auditors to correct findings and feed accepted corrections back into history/KB to reduce future false positives. | P1 |
| Assessment | Per-finding Confidence Scores + evidence links (page/paragraph citations) to speed up manual verification and benchmarking. | P1 |
| LLM | Configurable commercial LLMs (OpenAI, Claude, etc.) via LangChain. | P0 |
| LLM | Configurable local models (Ollama) via LangChain. | P0 |
| Skill | Skill/Persona Management: Create custom roles and import templates. | P0 |
| Skill | Built-in personas per SSDLC phase (e.g. Threat Modeler, Secure Code Reviewer, Pentest Analyst, SOC2 Auditor, AppSec Engineer). | P0 |
| Memory | LangGraph checkpointing for cross-phase state persistence. | P0 |
| Memory | History Reuse: Retrieve past similar assessments. | P1 |
| Access | REST API + MCP Server for agent integration. | P0 |
| Integrations | ServiceNow: Read project metadata. | P0 |
| Integrations | ServiceNow: Write back results / Webhook trigger. | P1 |
| Integrations | SAST/DAST tool connectors (SonarQube, Checkmarx, Burp, etc.). | P1 |
| Integrations | Automated remediation tracking: create and sync remediation items to Jira or GitHub Issues (ticket links in report). | P1 |
| IAM | AAD (Azure AD) Login & SSO. | P0 |
| IAM | RBAC (Analyst, Lead, Project Owner, Admin, API Consumer). | P0 |
| IAM | API Authentication (Bearer Token / API Key). | P0 |
| IAM | Data isolation by project/role. | P0 |
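The stage auto-detection feature (P1) in the table above could start from a simple keyword heuristic before involving the LLM. The keyword mapping and default below are illustrative assumptions, not the product's actual detection logic:

```python
# Illustrative keyword signals per SSDLC stage.
STAGE_KEYWORDS = {
    "requirements": ["user story", "acceptance criteria", "business requirement"],
    "design": ["architecture", "data flow", "trust boundary", "threat model"],
    "development": ["pull request", "code review", "sast"],
    "testing": ["dast", "penetration test", "vulnerability scan"],
    "deployment": ["terraform", "kubernetes", "release checklist"],
    "operations": ["cve", "incident", "patch"],
}

def detect_stage(text: str, default: str = "design") -> str:
    """Score each stage by keyword hits; fall back to `default` on no signal."""
    text = text.lower()
    scores = {stage: sum(text.count(kw) for kw in kws)
              for stage, kws in STAGE_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```

A production version would combine this with document-type metadata and an LLM classification pass, and always allow the explicit stage parameter to override the guess.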
- As a security team member, I want to upload a project's requirements document (or a Security Questionnaire) and have the Requirements Agent automatically identify missing security requirements, compliance obligations, and gaps vs. policy/standards so that I can provide early feedback before design begins.
- As a security architect, I want to submit an architecture document to the Design Agent and receive an automated STRIDE threat model so that I can focus on reviewing and validating threats rather than creating the initial model from scratch.
- As a security lead, I want to run a full SSDLC assessment across multiple phases for a project (or select a project from ServiceNow) so that I get a unified view of security posture from requirements through deployment.
- As a developer, I want to submit my code review package and SAST results to the Development Agent via REST API so that I get prioritized findings with secure coding guidance specific to my language and framework, in JSON format for integration into ticketing workflows.
- As a pentest manager, I want to upload penetration test reports to the Testing Agent so that findings are automatically mapped to the original threat model and remediation is tracked.
- As an operations engineer, I want the Operations Agent to analyze new CVE feeds against our deployment stack and evaluate incident response logs so that I know which vulnerabilities require immediate patching and can identify process gaps.
- As enterprise IT, I want to configure the platform to use only a local Ollama model so that all assessment data stays within the internal network.
- As a DevSecOps engineer, I want to integrate the assessment API into our CI/CD pipeline so that security checks run automatically at each stage.
- As a project manager, I want the Agent to auto-detect the SSDLC stage from the uploaded document type so that I don't need to manually specify it every time.
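For the developer story above, the JSON output might look like the sketch below. This is a hypothetical shape only; the concrete report schema is explicitly listed as an open question later in this PRD, and every field name here is an assumption:

```python
import json

# Hypothetical phase assessment report (illustrative schema).
report = {
    "task_id": "a1b2c3",
    "ssdlc_stage": "development",
    "findings": [
        {
            "id": "DEV-001",
            "severity": "high",
            "confidence": 0.82,
            "title": "SQL string concatenation in order lookup",
            "evidence": {"file": "orders.py", "citation": "p.3, para. 2"},
            "remediation": "Use parameterized queries.",
            "traces_to": ["T-001"],  # link back to a design-phase threat
        }
    ],
    "sign_off": {"status": "pending_review"},
}

print(json.dumps(report, indent=2))
```

The `confidence` and `evidence` fields correspond to the per-finding confidence scores and citation links planned in the feature list; `traces_to` carries the cross-phase traceability.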
The 6 standard SSDLC stages (aligned with NIST, OWASP, and Microsoft SDL):
| Stage | Name (EN) | 阶段名称 (CN) | Key Activities | Typical Inputs |
|---|---|---|---|---|
| 1 | Requirements | 需求阶段 | Define security requirements, compliance mapping (GDPR, ISO 27001, etc.), initial threat modeling, risk analysis | Requirements docs, compliance checklists, regulatory references |
| 2 | Design | 设计阶段 | Security architecture design, permission/access model, encryption scheme, threat modeling (STRIDE/DREAD), Security Design Review (SDR) | Architecture docs, design specs, threat models, data flow diagrams |
| 3 | Development | 开发阶段 | Secure coding standards compliance, security training verification, built-in security controls (anti-injection, XSS prevention, input validation) | Source code, coding guidelines, code review reports |
| 4 | Testing | 测试阶段 | SAST (static analysis), DAST (dynamic scanning), penetration testing, vulnerability fix & verification | SAST/DAST reports, pen-test findings, vulnerability scan results |
| 5 | Deployment | 部署阶段 | Security release readiness review, configuration security (key management, least privilege), hardening checklist | Deployment configs, infrastructure-as-code, release checklists |
| 6 | Operations | 运维阶段 | Vulnerability monitoring, incident response evaluation, patch management, log audit, ongoing compliance | Monitoring alerts, incident reports, audit logs, patch records |
Each stage maps to one or more built-in SSDLC Skills that define stage-specific system_prompt, risk_focus, compliance_frameworks, and assessment checklists. Users can also create custom SSDLC skills.
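A minimal sketch of such a stage skill, using the field names from the paragraph above; the class shape and the example values are illustrative, not the shipped contract:

```python
from dataclasses import dataclass, field

@dataclass
class SSDLCSkill:
    """Stage-specific skill definition (sketch; field names follow the PRD)."""
    stage: str
    system_prompt: str
    risk_focus: list
    compliance_frameworks: list
    checklist: list = field(default_factory=list)

# Hypothetical built-in skill for the Design stage SDR.
design_sdr = SSDLCSkill(
    stage="design",
    system_prompt="You are a security architect performing a Security Design Review.",
    risk_focus=["authentication", "encryption in transit/at rest", "trust boundaries"],
    compliance_frameworks=["ISO 27001", "PCI DSS"],
    checklist=["Threat model present?", "Key management defined?",
               "Least privilege enforced?"],
)
```

Custom skills would be instances of the same contract, loaded from config rather than code, per the maintainability requirement.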
| Category | Requirement |
|---|---|
| Security & Privacy | Support fully local/on-prem deployment and local LLMs; support audit logs. |
| Performance | Acceptable end-to-end latency for single-phase assessment; parallel phase execution for full SSDLC runs. |
| Maintainability | KB, Skills, LangGraph workflows, and LLM config are maintainable via config/API without code changes. |
| Observability | Log model usage, tokens, duration, errors, and agent state transitions. |
| Auth & Isolation | RBAC and data isolation by project/role; fine-grained authorization via AAD/ServiceNow. |
| Deployment | Support on-prem/private deployment; connectivity to AAD/ServiceNow/LLM/SAST/DAST tools. |
| Open Source | Architecture aligns with mainstream open-source agent projects (LangChain/LangGraph ecosystem) to ease community contribution. |
This section defines security controls for the system itself (not the documents being assessed).
7.2.1 Control Domains
- IAM: Identity and Access Control
- DATA: Data Security
- APP: Application Security
- OPS: Operations and Audit
- SCM: Supply Chain and Open Source
7.2.2 Identity and Access Control
- IAM-01: All user/integration endpoints must require authentication (except health checks).
- IAM-02: Strong auth: AAD/OIDC SSO; API Bearer JWT or API Key (no secrets in URL).
- IAM-03: RBAC with least privilege default.
- IAM-04: Session/Token timeout and revocation.
- IAM-05: Sensitive operations (e.g. delete KB, modify workflows) require confirmation or higher privilege.
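A deny-by-default sketch of IAM-03 and IAM-05; the role names match the RBAC feature list, while the permission strings are illustrative assumptions:

```python
# Illustrative role-to-permission mapping (least privilege: deny by default).
ROLE_PERMISSIONS = {
    "analyst": {"assessment:read", "assessment:create"},
    "lead": {"assessment:read", "assessment:create", "assessment:sign_off"},
    "admin": {"assessment:read", "assessment:create", "assessment:sign_off",
              "kb:delete", "workflow:modify"},
}

def authorize(role: str, permission: str) -> bool:
    """Unknown roles or permissions grant nothing (IAM-03)."""
    return permission in ROLE_PERMISSIONS.get(role, set())

assert authorize("lead", "assessment:sign_off")
# IAM-05: destructive operations require a higher-privileged role.
assert not authorize("analyst", "kb:delete")
```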
7.2.3 Data Security
- DATA-01: TLS (1.2+) for all transport.
- DATA-02: Encryption at rest for sensitive data; secrets management (no plaintext in code).
- DATA-03: Data minimization and retention policy.
- DATA-04: PII handling compliance (access control, audit).
- DATA-05: Clarify LLM data usage (cloud vs. local) for data sovereignty.
7.2.4 Application Security
- APP-01: Input validation (file type, size, path traversal).
- APP-02: Injection prevention (prompt injection mitigation, SQLi/Command injection).
- APP-03: Dependency scanning (SCA) and updates.
- APP-04: Safe error handling (no stack traces to users).
- APP-05: Web protections (CSRF, security headers, rate limiting).
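A minimal sketch of the APP-01 checks (file type, size, path traversal); the allowed extensions and the size limit are illustrative assumptions, since concrete limits are an open question in this PRD:

```python
from pathlib import PurePosixPath

# Illustrative allow-list and limit (assumptions, not shipped defaults).
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".pptx", ".md", ".json", ".mmd"}
MAX_SIZE_BYTES = 50 * 1024 * 1024

def validate_upload(filename: str, size: int) -> bool:
    """APP-01 sketch: reject path traversal, unexpected types, oversized files."""
    name = PurePosixPath(filename)
    if ".." in name.parts or name.is_absolute():
        return False  # path traversal attempt
    if name.suffix.lower() not in ALLOWED_EXTENSIONS:
        return False  # unexpected file type
    return 0 < size <= MAX_SIZE_BYTES

assert validate_upload("threat-model.pdf", 1024)
assert not validate_upload("../../etc/passwd", 10)
```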
7.2.5 Operations and Audit
- OPS-01: Audit logs (who, what, when, resource) protected from tampering.
- OPS-02: Operational logs (performance, errors, agent state transitions) without sensitive content.
- OPS-03: Security event detection and alerting.
- OPS-04: Backup and recovery for critical data (KB, assessment history, LangGraph checkpoints).
7.2.6 Supply Chain
- SCM-01: Trusted dependency sources.
- SCM-02: Vulnerability management process.
- SCM-03: License compliance.
| Component | Technology | Purpose |
|---|---|---|
| Workflow Engine | LangGraph | Stateful, graph-based agent orchestration with conditional routing, parallel execution, and checkpointing |
| LLM Framework | LangChain | Unified LLM abstraction, prompt management, tool integration, RAG chains |
| State Management | LangGraph Checkpointing | Cross-phase state persistence, conversation memory, assessment context |
| Component | Technology | Purpose |
|---|---|---|
| Language | Python 3.10+ | Primary development language |
| Web/API | FastAPI | Async REST API with auto OpenAPI |
| Vector DB | ChromaDB | Chunk-level similarity search |
| Graph RAG | LightRAG | Entity-relationship aware retrieval |
| Embeddings | sentence-transformers | Vector embeddings for RAG |
| Parsing | Docling (primary) + legacy fallback | Multi-format document parsing |
| LLM Providers | OpenAI, Ollama | Cloud and local LLM support |
- LangGraph Documentation: Reference for stateful agent orchestration, conditional routing, and multi-agent patterns.
- LangChain Documentation: Reference for LLM abstraction, RAG patterns, and tool integration.
- NIST SSDF (Secure Software Development Framework): Reference for SSDLC phase definitions and security activities.
- OWASP SAMM (Software Assurance Maturity Model): Reference for security practice areas across the SDLC.
- Microsoft SDL: Reference for security development lifecycle practices.
- STRIDE/DREAD: Reference for threat modeling methodology.
- LangGraph Integration: Implement LangGraph state machine with phase agent nodes, conditional edges, and shared state.
- Phase Agent MVP: Implement Requirements and Design phase agents first (highest Shift-Left value).
- Knowledge Base per Phase: Build phase-specific knowledge collections (requirements policies, design patterns, coding standards, testing guides, deployment checklists, operations playbooks).
- SAST/DAST Connectors: Build parsers for common tool output formats (SonarQube, Checkmarx, Burp Suite, OWASP ZAP).
- Cross-Phase Traceability: Implement finding linkage from threat model → test case → deployment check → monitoring rule.
- Enterprise Integration: Align with IT on AAD registration and ServiceNow API access.
- Pilot: Run with 1-2 teams across a full SSDLC cycle to gather feedback.
- Open Source: Release as "DocSentinel" after MVP stabilization.
- LangGraph Workflow Schema: How to define and persist custom SSDLC workflow configurations?
- Phase Agent Granularity: Should each phase have a single agent or multiple sub-agents (e.g. Design → Threat Modeler + Architecture Reviewer)?
- SAST/DAST Integration: Which tool output formats to support first? Standard SARIF format?
- Cross-Phase State: How much context to carry between phases? Full report or summarized findings?
- Report Schema: Concrete JSON schema for phase-specific and cross-phase findings?
- Skill Contract: Input/output for the first phase-specific Skills?
- KB Partitioning: Separate vector collections per SSDLC phase or unified with metadata filtering?
- Limits: File size, concurrency, rate limits per phase?
- Technology & Architecture: `docs/01-architecture-and-tech-stack.md`
- API Specification: `docs/02-api-specification.yaml`
- Report & Skill Contract: `docs/03-assessment-report-and-skill-contract.md`
- Integration Guide: `docs/04-integration-guide.md`
- Deployment Runbook: `docs/05-deployment-runbook.md`
- Agent Integration (MCP): `docs/06-agent-integration.md`
- SSDLC Workflow Guide: `docs/07-ssdlc-workflow-guide.md` (new)
- Security Implementation: `SECURITY.md` and secure coding guidelines.
End of Document