Skip to content

Conversation

@shivasurya
Copy link
Owner

Summary

Implements enhanced SARIF formatter with code flows, related locations, and rich metadata for optimal GitHub Code Scanning integration.

Part of output-standardization tech spec (Stacked PRs)

Changes

New Files

  • output/sarif_formatter.go (290 lines)

    • SARIF 2.1.0 compliant output formatter
    • Code flows for taint path visualization (source → sink)
    • Related locations for taint sources
    • Help text with markdown and CWE references
    • Security severity scores (9.0, 7.0, 5.0, 3.0)
    • Rule properties: tags, precision
    • Deduplicates rules across multiple detections
  • output/sarif_formatter_test.go (519 lines)

    • Comprehensive tests achieving 97.5% coverage
    • Tests for version, tool metadata, rules, results
    • Code flow generation tests (taint-local, taint-global)
    • Related locations validation
    • Pattern vs taint detection differentiation

Modified Files

  • cmd/ci.go

    • Replaced old generateSARIFOutput() with new formatter
    • Uses enriched detections for rich output
    • Removed unused imports (sarif library, json, encoding/json)
    • Consistent pattern with JSON and CSV formatters
  • cmd/ci_test.go

    • Skipped obsolete SARIF tests
    • Removed unused helper functions

Key Features

Code Flows

Taint detections automatically include code flows showing the path from source to sink:

{
  "codeFlows": [{
    "message": {"text": "Taint flow from line 10 to line 20"},
    "threadFlows": [{
      "locations": [
        {
          "location": {"physicalLocation": {"region": {"startLine": 10}}},
          "message": {"text": "Taint source: user_input"}
        },
        {
          "location": {"physicalLocation": {"region": {"startLine": 20}}},
          "message": {"text": "Taint sink: os.system"}
        }
      ]
    }]
  }]
}

Help Text with Markdown

Rules include rich help text with CWE references:

## Command Injection

User input flows to shell command without sanitization

### References
- [CWE-78](https://cwe.mitre.org/data/definitions/78.html)

Security Severity Scores

GitHub-compatible severity scores for prioritization:

  • Critical: 9.0
  • High: 7.0
  • Medium: 5.0
  • Low: 3.0

Rule Properties

{
  "properties": {
    "tags": ["security"],
    "security-severity": "9.0",
    "precision": "high"
  }
}

Benefits over Old Implementation

Feature Old New
Code flows ❌ None ✅ Source → Sink visualization
Related locations ❌ None ✅ Taint sources highlighted
Help text ❌ Plain text ✅ Markdown with references
Security severity ❌ Level only ✅ Numeric scores for GitHub
Rule properties ❌ None ✅ Tags, precision
Pattern detection ❌ Same as taint ✅ No code flows (correct)
Test coverage ❌ ~60% ✅ 97.5%

Testing

  • All tests passing (97.5% coverage on SARIF formatter)
  • Output package overall: 97.5% coverage
  • Linting checks passed
  • Integration with ci command verified

Usage Examples

# Generate enhanced SARIF report with code flows
pathfinder ci --rules rules/ --project . --output sarif > results.sarif

# Upload to GitHub Code Scanning
gh api /repos/:owner/:repo/code-scanning/sarifs -F [email protected]

# View in GitHub UI with code flows highlighted

SARIF Output Sample

{
  "version": "2.1.0",
  "runs": [{
    "tool": {
      "driver": {
        "name": "Code Pathfinder",
        "version": "0.0.25",
        "rules": [{
          "id": "sql-injection",
          "name": "SQL Injection",
          "fullDescription": {"text": "Unsanitized user input flows to SQL query (CWE-89, A03:2021)"},
          "helpUri": "https://github.com/shivasurya/code-pathfinder",
          "defaultConfiguration": {"level": "error"},
          "properties": {
            "tags": ["security"],
            "security-severity": "9.0",
            "precision": "high"
          }
        }]
      }
    },
    "results": [{
      "ruleId": "sql-injection",
      "message": {"text": "Unsanitized user input flows to SQL query (sink: execute, confidence: 95%)"},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "src/db/queries.py"},
          "region": {"startLine": 42, "startColumn": 8}
        }
      }],
      "codeFlows": [...],
      "relatedLocations": [...]
    }]
  }]
}

Breaking Changes

  • Old generateSARIFOutput() function removed
  • SARIF output structure enhanced with additional fields
  • Pattern matches no longer include code flows (correct behavior)

Stack Status

This PR stacks on:

Next PR:

🤖 Generated with Claude Code

@safedep
Copy link

safedep bot commented Nov 21, 2025

SafeDep Report Summary

Green Malicious Packages Badge Green Vulnerable Packages Badge Green Risky License Badge

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App

@codecov
Copy link

codecov bot commented Nov 21, 2025

Codecov Report

❌ Patch coverage is 89.15663% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.76%. Comparing base (5e70827) to head (495743f).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
sourcecode-parser/cmd/ci.go 0.00% 10 Missing ⚠️
sourcecode-parser/output/sarif_formatter.go 94.87% 4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
+ Coverage   79.64%   79.76%   +0.12%     
==========================================
  Files          77       78       +1     
  Lines        7689     7799     +110     
==========================================
+ Hits         6124     6221      +97     
- Misses       1322     1333      +11     
- Partials      243      245       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Owner Author

shivasurya commented Nov 22, 2025

Merge activity

  • Nov 22, 12:38 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Nov 22, 12:43 AM UTC: Graphite rebased this pull request as part of a merge.
  • Nov 22, 12:44 AM UTC: @shivasurya merged this pull request with Graphite.

@shivasurya shivasurya changed the base branch from shiva/output-json-csv-formatters to graphite-base/395 November 22, 2025 00:41
@shivasurya shivasurya changed the base branch from graphite-base/395 to main November 22, 2025 00:42
shivasurya and others added 2 commits November 22, 2025 00:43
- Rich rule metadata with help markdown and CWE references
- Code flows for taint path visualization (source → sink)
- Related locations for taint sources
- Security severity scores for GitHub integration
- Rule properties including tags and precision
- Builder pattern API matching go-sarif library
- Comprehensive tests achieving 97.5% coverage

Features:
- SARIF 2.1.0 compliance
- Deduplicates rules across multiple detections
- Supports both taint-local and taint-global detection types
- Pattern matches do not include code flows (as expected)
- Fallback from RelPath to FilePath for artifact locations

Part of output standardization feature (PR #5).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Replaced old generateSARIFOutput() with output.SARIFFormatter
- Uses enriched detections for rich SARIF output with code flows
- Removed unused imports (sarif, json, encoding/json)
- Skipped obsolete SARIF tests (replaced by output/sarif_formatter_test.go)
- Cleaned up unused helper functions in tests

Benefits over old implementation:
- Code flows for taint path visualization
- Related locations for taint sources
- Help text with markdown formatting
- Security severity scores for GitHub
- Rule properties (tags, precision)
- Consistent with JSON and CSV formatter pattern

All tests passing, linting clean.

Part of output standardization feature (PR #5).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@shivasurya shivasurya force-pushed the shiva/output-sarif-formatter branch from 022584e to 495743f Compare November 22, 2025 00:43
@shivasurya shivasurya merged commit 40c3ebe into main Nov 22, 2025
3 checks passed
@shivasurya shivasurya deleted the shiva/output-sarif-formatter branch November 22, 2025 00:44
shivasurya added a commit that referenced this pull request Nov 22, 2025
## Summary

Implements standardized exit codes and the `--fail-on` flag for both `scan` and `ci` commands, enabling selective CI/CD pipeline failures based on security finding severities.

## Changes

### Core Exit Code Logic (`output/exit_code.go`)
- **Exit Code Constants**:
  - `ExitCodeSuccess (0)`: No findings or no --fail-on match
  - `ExitCodeFindings (1)`: Findings match --fail-on severities
  - `ExitCodeError (2)`: Configuration or execution errors
- **DetermineExitCode()**: Calculates appropriate exit code with error precedence
- **ParseFailOn()**: Parses comma-separated severity values
- **ValidateSeverities()**: Validates severity names (case-insensitive)
- **InvalidSeverityError**: Custom error type for validation failures

### Command Integration
- **scan command**: Add --fail-on flag and integrate exit code logic
- **ci command**: Add --fail-on flag and integrate exit code logic
- Both commands now:
  - Exit 0 by default (regardless of findings)
  - Exit 1 only when findings match --fail-on severities
  - Exit 2 on configuration/execution errors
  - Support case-insensitive severity validation

### Bug Fixes
- Fixed SARIF output always exiting 0 (now respects --fail-on)

## Testing

### Unit Tests (`output/exit_code_test.go`)
- 16 tests for `DetermineExitCode()` covering all exit scenarios
- 12 tests for `ParseFailOn()` covering edge cases
- 13 tests for `ValidateSeverities()` covering validation
- All tests verify case-insensitive behavior

### Integration Tests (`cmd/exit_code_integration_test.go`)
- Tests actual binary exit codes for both scan and ci commands
- Tests all output formats (SARIF, JSON, CSV)
- Tests invalid severity handling
- Tests case-insensitive severity matching
- Requires `INTEGRATION=1` and pre-built binary

**Test Results**: All tests passing ✅
```bash
$ gradle testGo
ok  	.../output	0.223s
ok  	.../cmd	0.317s
```

## Examples

```bash
# Default: no exit on findings
pathfinder scan --rules rules/ --project .
# Exit: 0 (even if vulnerabilities found)

# Fail on critical findings
pathfinder scan --rules rules/ --project . --fail-on critical
# Exit: 1 if critical findings, 0 otherwise

# Fail on critical or high findings
pathfinder ci --rules rules/ --project . --output sarif --fail-on critical,high
# Exit: 1 if critical/high findings, 0 otherwise

# Case insensitive
pathfinder scan --rules rules/ --project . --fail-on CRITICAL,High,MeDiUm
# Exit: 1 if any match

# Invalid severity
pathfinder scan --rules rules/ --project . --fail-on invalid
# Error: invalid severity 'invalid', must be one of: critical, high, medium, low, info
```

## Migration Notes

### Breaking Changes
- **Default behavior changed**: Previously, any findings caused exit 1. Now requires explicit `--fail-on` flag.
- **CI/CD pipelines**: Add `--fail-on critical,high` to maintain previous fail-on-findings behavior.

### Non-Breaking
- Existing commands without `--fail-on` continue to work (exit 0)
- All output formats work identically with exit codes

## Checklist

- [x] Core exit code logic implemented
- [x] Integrated with scan command
- [x] Integrated with ci command
- [x] Unit tests (95%+ coverage)
- [x] Integration tests
- [x] All tests passing
- [x] Linter passing
- [x] Binary builds successfully
- [x] Documentation in commit messages

## Stacked PRs

This PR stacks on top of:
- PR #5: SARIF Formatter (#395)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants