feat(docker): Complete Docker/Compose integration with runtime rule loading (PR 8/8) #426
Follows Python DSL architecture pattern instead of bolt-on scanning:
1. **File Discovery** (graph/utils.go):
- Updated getFiles() to discover Dockerfile and docker-compose.yml files
- Pattern matching: "dockerfile*" and "*docker-compose.{yml,yaml}"
2. **Worker Pool Integration** (graph/initialize.go):
- Added Docker/Compose handling in 5-worker pool
- Routes files to parseDockerfile() and parseDockerCompose()
- No tree-sitter for these files, uses specialized parsers
3. **CodeGraph Node Creation** (graph/parser_docker.go):
- Dockerfile instructions → "dockerfile_instruction" nodes
- Compose services → "compose_service" nodes
- Unique IDs with line/column: "dockerfile:<file>:<type>:<line>:<col>"
- Arguments stored in MethodArgumentsValue for DSL queries
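The discovery patterns and the node ID format described above can be sketched as follows. This is a minimal illustration, not the code in graph/utils.go or graph/parser_docker.go: the function names `classifyContainerFile` and `dockerNodeID` are hypothetical, and matching on the lower-cased base name is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// classifyContainerFile returns "dockerfile", "compose", or "" for any other file.
// It approximates the "dockerfile*" and "*docker-compose.{yml,yaml}" patterns
// by checking the lower-cased base name (an assumption of this sketch).
func classifyContainerFile(path string) string {
	base := strings.ToLower(filepath.Base(path))
	switch {
	case strings.HasSuffix(base, "docker-compose.yml"),
		strings.HasSuffix(base, "docker-compose.yaml"):
		return "compose"
	case strings.HasPrefix(base, "dockerfile"):
		return "dockerfile"
	default:
		return ""
	}
}

// dockerNodeID builds an ID shaped like "dockerfile:<file>:<type>:<line>:<col>".
// Including line and column keeps two identical instructions in one file distinct.
func dockerNodeID(file, instruction string, line, col int) string {
	return fmt.Sprintf("dockerfile:%s:%s:%d:%d", file, instruction, line, col)
}

func main() {
	paths := []string{"Dockerfile", "app/Dockerfile.dev", "prod.docker-compose.yaml", "main.go"}
	for _, p := range paths {
		fmt.Printf("%-26s -> %q\n", p, classifyContainerFile(p))
	}
	fmt.Println(dockerNodeID("Dockerfile", "FROM", 1, 1))
}
```

Because the line and column are part of the ID, parsing the same file twice yields the same IDs, which is one way the "Proper Node IDs prevent duplication" property could be achieved.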
**Test Results**:
- ✅ Dockerfile parsed: 7 instruction nodes created
- ✅ docker-compose.yml parsed: 1 service node created
- ✅ Total: 8 nodes in unified CodeGraph
- ✅ Build successful, no compilation errors
**Architecture Benefits**:
- Single CodeGraph for all source files (Java/Python/Docker/Compose)
- Consistent worker pool pattern across all file types
- Unified query execution (DSL rules query same graph)
- Proper Node IDs prevent duplication
Closes #8 (Docker/Compose proper integration)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Added test suite for parser_docker.go with 90%+ coverage:
**Test Coverage**:
- ✅ parseDockerfile() - success and error cases
- ✅ parseDockerCompose() - success and error cases
- ✅ convertDockerInstructionToNode() - all instruction types
- ✅ extractDockerInstructionArgs() - FROM, USER, EXPOSE, ENV
- ✅ convertComposeServiceToNode() - service properties
- ✅ extractComposeServiceProperties() - security properties
- ✅ Helper functions - IsDockerNode, GetDockerInstructionType, etc.
- ✅ Initialize() integration - both file types in worker pool
**Linting Fixes**:
- Rewrote if-else to switch in utils.go (gocritic)
- Added period to comment (godot)
- Removed unused parameters (unparam)
**Test Results**:
- 91.5% coverage for graph package ✅
- All tests passing ✅
- Linter clean ✅
- Build successful ✅
Stacked on: PR #8
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
- Extract Docker/Compose file paths from CodeGraph after parsing
- Load compiled container rules from python-dsl/compiled_rules.json
- Execute ContainerRuleExecutor on DockerfileGraph and ComposeGraph
- Convert RuleMatch results to EnrichedDetection format
- Merge container findings with code analysis findings
Tested with /tmp/docker-test-project:
- Detected 7 security issues (4 Dockerfile + 3 Compose)
- Container scan executes in <1ms after graph building
- Unified output pipeline working
Resolves the missing container rule execution piece identified in previous pipeline analysis.
…r findings
Two critical fixes for container scanning:
1. **Filter container matchers in Python DSL loader** (dsl/loader.go)
- Added cases for container matcher types to ExecuteRule switch
- Returns empty array for: missing_instruction, instruction,
service_has, service_missing, any_of, all_of, none_of
- Prevents 'unknown matcher type' errors since these are handled
by ContainerRuleExecutor
2. **Generate code snippets for container detections** (cmd/scan.go)
- Added generateCodeSnippet() to read file and extract context lines
- Added splitLines() helper for proper line parsing
- Normalize severity to lowercase for text formatter compatibility
- Findings now display with full code context
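The first fix above amounts to recognizing container matcher types and returning an empty result for them instead of an error. A minimal sketch of that idea follows; `executeRule`, `Detection`, and the `codeMatchers` parameter are hypothetical stand-ins, not the actual ExecuteRule signature in dsl/loader.go. Only the seven matcher type names come from the commit message.

```go
package main

import "fmt"

// Detection is a hypothetical placeholder for a code-analysis finding.
type Detection struct {
	RuleID string
	Line   int
}

// containerMatcherTypes lists the matcher types named in the commit message.
// They are evaluated by the container executor, not by the code-analysis path.
var containerMatcherTypes = map[string]bool{
	"missing_instruction": true,
	"instruction":         true,
	"service_has":         true,
	"service_missing":     true,
	"any_of":              true,
	"all_of":              true,
	"none_of":             true,
}

// executeRule dispatches on matcher type. Container matchers yield an empty,
// non-nil slice and no error; anything unrecognized is still reported.
func executeRule(matcherType string, codeMatchers map[string]func() []Detection) ([]Detection, error) {
	if containerMatcherTypes[matcherType] {
		return []Detection{}, nil
	}
	if run, ok := codeMatchers[matcherType]; ok {
		return run(), nil
	}
	return nil, fmt.Errorf("unknown matcher type: %s", matcherType)
}

func main() {
	for _, t := range []string{"service_has", "all_of", "not_a_matcher"} {
		found, err := executeRule(t, nil)
		fmt.Printf("%-14s detections=%d err=%v\n", t, len(found), err)
	}
}
```

Returning an empty slice rather than skipping the rule earlier keeps the dispatch in one place, at the cost of iterating container rules twice.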
Tested with /tmp/docker-test-project:
- CRITICAL/HIGH findings show detailed output with code snippets
- MEDIUM/LOW findings show abbreviated single-line output
- 7 total findings correctly formatted and displayed
Example output:
[critical] COMPOSE-SEC-001: Service Running in Privileged Mode
CWE-250
docker-compose.yml:1
> 1 | version: '3.8'
2 | services:
3 | web:
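The snippet format shown in the example output can be reproduced with a small helper. The sketch below is an assumption-laden illustration: `renderSnippet` is a hypothetical name that takes file content as a string, whereas the PR's generateCodeSnippet() reads the file itself, and the exact column layout of the real formatter is not confirmed by the source.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines splits content on "\n" and strips a trailing "\r" from each line,
// so both Unix and Windows line endings work. A final newline does not
// produce an extra empty line.
func splitLines(content string) []string {
	if content == "" {
		return nil
	}
	lines := strings.Split(content, "\n")
	if lines[len(lines)-1] == "" {
		lines = lines[:len(lines)-1]
	}
	for i, l := range lines {
		lines[i] = strings.TrimSuffix(l, "\r")
	}
	return lines
}

// renderSnippet formats the target line plus `context` lines on each side.
// The target line is marked with ">". Out-of-range targets yield "".
func renderSnippet(content string, target, context int) string {
	lines := splitLines(content)
	if target < 1 || target > len(lines) {
		return ""
	}
	start, end := target-context, target+context
	if start < 1 {
		start = 1
	}
	if end > len(lines) {
		end = len(lines)
	}
	var b strings.Builder
	for n := start; n <= end; n++ {
		marker := " "
		if n == target {
			marker = ">"
		}
		fmt.Fprintf(&b, "%s %d | %s\n", marker, n, lines[n-1])
	}
	return b.String()
}

func main() {
	compose := "version: '3.8'\nservices:\n  web:\n    privileged: true\n"
	fmt.Print(renderSnippet(compose, 1, 2))
}
```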
SafeDep Report Summary
No dependency changes detected. Nothing to scan.
This report is generated by SafeDep Github App
Remove hardcoded compiled_rules.json dependency and implement proper
runtime compilation of container rules from the --rules path, matching
the architecture of code analysis rules.
Changes:
- Add LoadContainerRules() to RuleLoader for runtime compilation
- Generate Python script that dynamically imports and compiles rules
- Implement sandbox mode support with temp file creation for nsjail
- Handle mixed directories containing both code and container rules
- Suppress warnings for skipped container rule files
- Fix lint: nilerr violations, dupBranchBody, prealloc
Implementation:
- loadContainerRulesFromFile() generates Python script at runtime that:
* Uses importlib to dynamically import rule files from --rules path
* Calls container_ir.compile_all_rules() to aggregate decorator-registered rules
* Returns complete JSON IR: {"dockerfile": [...], "compose": [...]}
- Sandbox mode: writes script to temp file, executes with nsjail, cleans up
- Direct mode: executes script with python3 -c for development
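The two execution modes described above could look roughly like this in Go. This is a sketch under several assumptions: the Python import path `from codepathfinder import container_ir` is a guess, `compile_all_rules()` is assumed to return a JSON-serializable dict, the rule path is passed through `sys.argv` rather than embedded in the script, and the nsjail invocation is left out because its flags are not given in the source. `directCommand` and `writeTempScript` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// wrapperScript imports one rule file so its decorators register the rules,
// then prints the aggregated IR as JSON. The container_ir import path is an
// assumption of this sketch.
const wrapperScript = `import importlib.util, json, sys
from codepathfinder import container_ir  # assumed import path

spec = importlib.util.spec_from_file_location("user_rules", sys.argv[1])
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)  # decorators register rules on import
print(json.dumps(container_ir.compile_all_rules()))
`

// directCommand builds the development-mode command: python3 -c <script> <path>.
// With -c, Python exposes the rule path as sys.argv[1].
func directCommand(rulePath string) *exec.Cmd {
	return exec.Command("python3", "-c", wrapperScript, rulePath)
}

// writeTempScript writes the wrapper to a temp file for sandboxed execution
// and returns its path plus a cleanup function that removes it.
func writeTempScript() (string, func(), error) {
	f, err := os.CreateTemp("", "container_rules_*.py")
	if err != nil {
		return "", nil, err
	}
	name := f.Name()
	if _, err := f.WriteString(wrapperScript); err != nil {
		f.Close()
		os.Remove(name)
		return "", nil, err
	}
	if err := f.Close(); err != nil {
		os.Remove(name)
		return "", nil, err
	}
	return name, func() { os.Remove(name) }, nil
}

func main() {
	cmd := directCommand("rules/docker_rules.py") // hypothetical rule path
	fmt.Println(cmd.Args[0], cmd.Args[1], "<script>", cmd.Args[3])

	path, cleanup, err := writeTempScript()
	if err != nil {
		fmt.Println("temp file error:", err)
		return
	}
	defer cleanup()
	fmt.Println("sandbox script written to", path)
}
```

Passing the path as an argument avoids quoting problems that arise when a user-supplied path is interpolated into generated Python source.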
Testing:
✅ All 7 findings detected (2 critical, 1 high, 1 medium, 3 low)
✅ Runtime compilation working for both single files and directories
✅ Sandbox mode properly implemented with temp file handling
✅ No hardcoded paths - production ready
✅ Lint passing with 0 issues
Resolves requirement: "its always the rule directory or file passed
to the command it generates IR in runtime"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
… severities
Three critical fixes for container scanning output:
1. Fix line numbers in all_of combinator rules
- evaluateAllOf was hardcoding LineNumber: 1
- Now captures first match to preserve actual instruction line number
- Example: apt-get rule now shows "Dockerfile:2" instead of "Dockerfile:1"
2. Fix rule count in summary
- Was only counting code analysis rules (len(rules))
- Now counts unique rule IDs from all detections
- Correctly shows "7 findings across 7 rules" instead of "across 0 rules"
3. Show code snippets for all severities
- Was only showing detailed output for critical/high
- Now shows code snippets for critical, high, medium, and low
- Only info level gets abbreviated (single line) output
Testing:
✅ Dockerfile:1 for FROM instruction (using :latest)
✅ Dockerfile:2 for RUN instruction (apt without --no-install-recommends)
✅ Correct rule counts (3 rules, 7 rules with compose)
✅ All severities show detailed findings with code context
✅ Lint passing (0 issues)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
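Fixes 1 and 2 above can be illustrated with a short sketch. The `Match` type and the `allOf` and `countUniqueRules` functions are hypothetical simplifications, not the actual evaluateAllOf or summary code: here each sub-matcher is assumed to return a pointer to its match, or nil when it did not match.

```go
package main

import "fmt"

// Match is a hypothetical, simplified finding.
type Match struct {
	RuleID     string
	LineNumber int
}

// allOf succeeds only when every sub-matcher matched. Instead of reporting a
// fixed line 1, it copies the first sub-match so its line number is preserved.
func allOf(ruleID string, results []*Match) *Match {
	if len(results) == 0 {
		return nil
	}
	for _, r := range results {
		if r == nil {
			return nil
		}
	}
	first := *results[0]
	first.RuleID = ruleID
	return &first
}

// countUniqueRules counts distinct rule IDs across all detections, so the
// summary covers container rules as well as code-analysis rules.
func countUniqueRules(detections []Match) int {
	seen := make(map[string]struct{})
	for _, d := range detections {
		seen[d.RuleID] = struct{}{}
	}
	return len(seen)
}

func main() {
	m := allOf("DOCKER-BP-005", []*Match{{LineNumber: 2}, {LineNumber: 2}})
	fmt.Printf("%s at Dockerfile:%d\n", m.RuleID, m.LineNumber)

	all := []Match{{RuleID: "A", LineNumber: 1}, {RuleID: "B", LineNumber: 2}, {RuleID: "A", LineNumber: 9}}
	fmt.Printf("%d findings across %d rules\n", len(all), countUniqueRules(all))
}
```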
Comprehensive fix for accurate line number reporting in both Dockerfile and docker-compose findings.
YAML Parser Changes:
- Add LineNumber field to YAMLNode structure
- Switch from yaml.Unmarshal to yaml.Node to preserve line info
- Implement convertYAMLNodeToInternal() to extract line numbers
- Decode scalar values to proper types (bool, int, float, string)
- Fix: "privileged: true" now decoded as bool, not string
ComposeGraph Changes:
- Add ServiceGetLineNumber(serviceName, key) method
- Returns line number of specific property or service declaration
- Supports nested property lookups (e.g., volumes, network_mode)
Container Executor Changes:
- Remove all hardcoded LineNumber: 1 assignments
- evaluateServiceHas: use ServiceGetLineNumber for all matchers
- evaluateServiceMissing: point to service declaration line
- evaluateAllOf: return first match with correct line number
- evaluateNoneOf: use violated match's line number
- Fix contains_any matcher to use property line number
Testing Results:
✅ COMPOSE-SEC-001 (privileged) → line 5 (was 1)
✅ COMPOSE-SEC-002 (docker socket) → line 10 (was 1)
✅ COMPOSE-SEC-007 (host network) → line 6 (was 1)
✅ DOCKER-BP-005 (apt) → line 2 (all_of matcher fixed)
✅ All 7 findings show correct line numbers
✅ Code snippets display proper context
✅ Lint passing (0 issues)
Fixes privileged mode detection by properly decoding YAML bool values instead of treating them as strings.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
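The bool decoding fix matters because a matcher comparing against the boolean `true` never equals the string "true". The sketch below shows one way scalar text could be decoded into typed values using only the standard library. `decodeScalar` is a hypothetical helper; the actual convertYAMLNodeToInternal() works on yaml.Node values and may rely on the YAML library's own tag resolution instead. Case-insensitive bool matching is a simplification.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// decodeScalar converts the raw text of a YAML scalar into bool, int, float64,
// or string. Quoted scalars always stay strings. Numbers are only attempted
// when the text contains a digit, so words like "inf" remain strings.
func decodeScalar(raw string, quoted bool) interface{} {
	if quoted {
		return raw
	}
	switch strings.ToLower(raw) {
	case "true":
		return true
	case "false":
		return false
	}
	if strings.ContainsAny(raw, "0123456789") {
		if i, err := strconv.Atoi(raw); err == nil {
			return i
		}
		if f, err := strconv.ParseFloat(raw, 64); err == nil {
			return f
		}
	}
	return raw
}

func main() {
	for _, raw := range []string{"true", "8080", "0.5", "host"} {
		v := decodeScalar(raw, false)
		fmt.Printf("%q -> %T %v\n", raw, v, v)
	}
	// With typed decoding, a privileged check against the bool true succeeds.
	privileged := decodeScalar("true", false)
	fmt.Println("privileged matches bool true:", privileged == true)
}
```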
Add test coverage for helper functions used in container scanning integration, improving overall coverage from 84.5% to 85.2%.
New Tests:
- TestExtractContainerFiles: Tests Dockerfile and compose file extraction
* Handles multiple files of each type
* Properly deduplicates files
* Returns empty arrays when no container files present
- TestSplitLines: Tests line splitting utility
* Handles Unix (\n) and Windows (\r\n) line endings
* Preserves empty lines
* Handles files without trailing newlines
* Handles empty content
- TestGenerateCodeSnippet: Tests code snippet generation
* Generates snippets with context lines
* Handles edge cases (start/end of file)
* Handles invalid line numbers
* Handles nonexistent files
Coverage Impact:
- Overall: 84.5% → 85.2% (+0.7%)
- cmd/scan.go: Improved coverage for container-related functions
- All new tests passing (15 test cases, 0 failures)
These tests ensure robustness of container scanning features including proper file detection, accurate line number reporting, and correct code snippet generation for security findings.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
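The behavior that TestExtractContainerFiles checks (multiple files, deduplication, empty results) can be pictured with the sketch below. The `Node` struct and the `extractContainerFiles` signature are assumptions for illustration; only the node type strings "dockerfile_instruction" and "compose_service" come from the earlier commit message.

```go
package main

import "fmt"

// Node is a hypothetical, reduced view of a CodeGraph node.
type Node struct {
	Type string
	File string
}

// extractContainerFiles collects the distinct files behind Dockerfile and
// Compose nodes. Both results are non-nil, so callers get empty slices rather
// than nil when no container files are present.
func extractContainerFiles(nodes []Node) (dockerfiles, composeFiles []string) {
	dockerfiles, composeFiles = []string{}, []string{}
	seen := make(map[string]bool)
	for _, n := range nodes {
		if seen[n.File] {
			continue // one file usually contributes many nodes
		}
		switch n.Type {
		case "dockerfile_instruction":
			dockerfiles = append(dockerfiles, n.File)
			seen[n.File] = true
		case "compose_service":
			composeFiles = append(composeFiles, n.File)
			seen[n.File] = true
		}
	}
	return dockerfiles, composeFiles
}

func main() {
	nodes := []Node{
		{Type: "dockerfile_instruction", File: "Dockerfile"},
		{Type: "dockerfile_instruction", File: "Dockerfile"},
		{Type: "compose_service", File: "docker-compose.yml"},
		{Type: "method_declaration", File: "Main.java"}, // hypothetical non-container node
	}
	d, c := extractContainerFiles(nodes)
	fmt.Println(d, c)
}
```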
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## docker/07-integration-rule-library #426 +/- ##
======================================================================
- Coverage 81.74% 80.94% -0.80%
======================================================================
Files 84 85 +1
Lines 8649 9132 +483
======================================================================
+ Hits 7070 7392 +322
- Misses 1310 1451 +141
- Partials 269 289 +20
☔ View full report in Codecov by Sentry.
Add comprehensive tests for runtime container rule compilation:
- LoadContainerRules: single file and directory loading
- isSandboxEnabled: environment variable handling
- ServiceGetLineNumber: line number tracking for compose properties
- convertYAMLNodeToInternal: YAML node conversion with line numbers
Coverage improvements:
- dsl/loader.go: LoadContainerRules 3.03% → 83.3%
- graph/parser_compose.go: ServiceGetLineNumber 0% → 88.9%
- graph/parser_yaml.go: convertYAMLNodeToInternal 72% → 86.2%
- Overall: dsl package 86.3%, graph package 91.3%
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
The LoadContainerRules tests were failing in CI because they created test rule files with if __name__ == "__main__" blocks that tried to compile rules themselves using container_ir.compile_all_rules(). This is incorrect - the loader's wrapper script (in loadContainerRulesFromFile) handles compilation. Rule files should only define decorated functions. Fixed by removing the if __name__ blocks from test rule files to match the structure of actual rule files (like last_user_is_root.py). This fixes PR #426 CI failure.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Fixes test failure caused by version mismatch between PyPI (1.0.0) and local development version (1.1.1). The test TestRuleLoader_LoadContainerRules was failing because CI was installing codepathfinder 1.0.0 from PyPI, but the local codebase is at 1.1.1 with fixes already committed (removal of if __name__ blocks).
This change ensures CI uses the local python-dsl package, eliminating the need to publish to PyPI before merging container rule changes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
b733803
into
docker/07-integration-rule-library
…oading (PR 8/8) (#426)
* feat: Integrate Docker/Compose parsing into unified CodeGraph
* test: Add comprehensive tests for Docker/Compose parser (91.5% coverage)
* feat: integrate ContainerRuleExecutor into scan command
* fix: resolve IR matcher errors and add output formatting for container findings
* fix: implement runtime container rule loading from --rules path
* fix: correct line numbers, rule count, and show code snippets for all severities
* fix: implement proper line number tracking for all container rules
* test: add comprehensive tests for container scanning helper functions
* chore: add coverage.out to .gitignore
* test: improve coverage for container rule loading
* fix(dsl): Remove if __name__ blocks from container rule tests
* chore(python-dsl): Bump version to 1.1.1
* ci: use local python-dsl instead of PyPI version
---------
Co-authored-by: Claude Sonnet 4.5 <[email protected]>

Complete Docker/Compose Security Scanning Integration
Final PR in the Docker/Compose integration series (8/8)
This PR completes the Docker and docker-compose security scanning integration by connecting the ContainerRuleExecutor to the main scan command with proper runtime rule loading, accurate line number tracking, and comprehensive output formatting.
What's Included
1. Container Scanning Integration (commit ff7c5c5, 3e30a95)
- … dockerfile_instruction and compose_service nodes
2. Runtime Rule Loading (commit 2da94fa) ✨ Key Feature
- … compiled_rules.json dependency
- … LoadContainerRules() for dynamic compilation from --rules path
- … container_ir
- … --rules flag
3. Output Formatting & Quality (commits 3f97a9e, 6721e1e)
- … all_of combinator line number preservation
4. Accurate Line Number Tracking (commit d9ce7e1) 🎯
YAML Parser Enhancements:
- … LineNumber field to YAMLNode structure
- … yaml.Unmarshal to yaml.Node to preserve source positions
ComposeGraph API:
- … ServiceGetLineNumber(serviceName, key)
Container Executor:
- … LineNumber: 1 assignments
- … (service_has, service_missing, contains_any, all_of, none_of) use actual source line numbers
5. Test Coverage (commit 3441fde)
Results
All 7 findings now show correct line numbers:
Example Output
Technical Architecture
Key Improvements Over Previous Approach
… compiled_rules.json
… --rules path
Quality Metrics
Dependencies
Breaking Changes
None - This is additive functionality.
Follow-up Work
Future enhancements (not in this PR):
Supersedes: PR #423 (old approach with compiled_rules.json)
Part of: Docker/Compose security scanning integration (PRs #416-#422, #426)