diff --git a/README.md b/README.md
index aab6388b..16bacc56 100644
--- a/README.md
+++ b/README.md
@@ -94,30 +94,144 @@ Read the [official documentation](https://codepathfinder.dev/), or run `pathfind
 
 ## Usage
 
+### Scan Command (Interactive)
+
+```bash
+# Basic scan
+pathfinder scan --rules rules/ --project /path/to/project
+
+# With verbose output
+pathfinder scan --rules rules/ --project . --verbose
+
+# With debug output
+pathfinder scan --rules rules/ --project . --debug
+
+# Fail on specific severities
+pathfinder scan --rules rules/ --project . --fail-on=critical,high
+```
+
+### CI Command (Machine-Readable)
+
+```bash
+# JSON output
+pathfinder ci --rules rules/ --project . --output json > results.json
+
+# CSV output
+pathfinder ci --rules rules/ --project . --output csv > results.csv
+
+# SARIF output (GitHub Code Scanning)
+pathfinder ci --rules rules/ --project . --output sarif > results.sarif
+
+# With failure control
+pathfinder ci --rules rules/ --project . --output json --fail-on=critical
+```
+
+## Output Formats
+
+### Text Output (Default for scan)
+
+```
+Code Pathfinder Security Scan
+
+Results:
+
+Critical Issues (1):
+
+  [critical] [Taint-Local] command-injection: Command Injection
+    CWE-78 | A1:2017
+
+    auth/login.py:127
+      > 125 |     user_input = request.form['username']
+        126 |     # Process input
+      > 127 |     os.system(f"echo {user_input}")
+
+    Flow: user_input (line 125) -> os.system (line 127)
+    Confidence: High | Detection: Intra-procedural taint analysis
+
+Summary:
+  1 findings across 10 rules
+  1 critical
+```
+
+### JSON Output
+
+```json
+{
+  "tool": {
+    "name": "Code Pathfinder",
+    "version": "0.0.25"
+  },
+  "scan": {
+    "target": "/path/to/project",
+    "rules_executed": 10
+  },
+  "results": [
+    {
+      "rule_id": "command-injection",
+      "severity": "critical",
+      "location": {
+        "file": "auth/login.py",
+        "line": 127
+      },
+      "detection": {
+        "type": "taint-local",
+        "source": {"line": 125, "variable": "user_input"},
+        "sink": {"line": 127, "call": "os.system"}
+      }
+    }
+  ],
+  "summary": {
+    "total": 1,
+    "by_severity": {"critical": 1}
+  }
+}
+```
+
+### CSV Output
+
+```csv
+severity,confidence,rule_id,rule_name,cwe,owasp,file,line,column,function,message,detection_type,detection_scope,source_line,sink_line,tainted_var,sink_call
+critical,high,command-injection,Command Injection,CWE-78,A1:2017,auth/login.py,127,8,login,User input flows to shell,taint-local,local,125,127,user_input,os.system
+```
+
+### SARIF Output
+
+SARIF 2.1.0 compatible output for GitHub Code Scanning integration.
+
 ```bash
-$ cd sourcecode-parser
-
-$ gradle buildGo (or) npm install -g codepathfinder
-
-$ ./pathfinder query --project --stdin
-2024/06/30 21:35:29 Graph built successfully
-Path-Finder Query Console:
->FROM method_declaration AS md
-  WHERE md.getName() == "getPaneChanges"
-  SELECT md, "query for pane changes layout methods"
-Executing query: FROM method_declaration AS md WHERE md.getName() == "getPaneChanges"
-
-┌───┬──────────────────────────────────────────┬─────────────┬────────────────────┬────────────────┬──────────────────────────────────────────────────────────────┐
-│ # │ FILE                                     │ LINE NUMBER │ TYPE               │ NAME           │ CODE SNIPPET                                                 │
-├───┼──────────────────────────────────────────┼─────────────┼────────────────────┼────────────────┼──────────────────────────────────────────────────────────────┤
-│ 1 │ /Users/shiva/src/code-pathfinder/test-sr │ 148         │ method_declaration │ getPaneChanges │ protected void getPaneChanges() throws ClassCastException {  │
-│   │ c/android/app/src/main/java/com/ivb/udac │             │                    │                │         mTwoPane = findViewById(R.id.movie_detail_container) │
-│   │ ity/movieListActivity.java               │             │                    │                │                 != null;                                     │
-│   │                                          │             │                    │                │     }                                                        │
-└───┴──────────────────────────────────────────┴─────────────┴────────────────────┴────────────────┴──────────────────────────────────────────────────────────────┘
-Path-Finder Query Console:
->:quit
-Okay, Bye!
+# Upload to GitHub Code Scanning
+gh api /repos/:owner/:repo/code-scanning/sarifs -F sarif=@results.sarif
+```
+
+## Verbosity Levels
+
+| Flag | Output |
+|------|--------|
+| (default) | Clean results only |
+| `--verbose` | Results + progress + statistics |
+| `--debug` | All output + timestamps |
+
+## Exit Codes
+
+| Code | Meaning |
+|------|---------|
+| 0 | Success (no findings, or findings without --fail-on match) |
+| 1 | Findings match --fail-on severities |
+| 2 | Configuration or execution error |
+
+### Examples
+
+```bash
+# Default: always exit 0
+pathfinder scan --rules rules/ --project .
+echo $?  # 0 even with findings
+
+# Fail on critical or high
+pathfinder scan --rules rules/ --project . --fail-on=critical,high
+echo $?  # 1 if critical/high found, 0 otherwise
+
+# Fail on any finding
+pathfinder scan --rules rules/ --project . --fail-on=critical,high,medium,low
 ```
 
 ## Acknowledgements
diff --git a/sourcecode-parser/cli.md b/sourcecode-parser/cli.md
new file mode 100644
index 00000000..4821094a
--- /dev/null
+++ b/sourcecode-parser/cli.md
@@ -0,0 +1,163 @@
+# CLI Reference
+
+## Commands
+
+### scan
+
+Interactive security scanning with human-readable output.
+
+**Usage**:
+```bash
+pathfinder scan --rules --project [flags]
+```
+
+**Required Flags**:
+- `--rules, -r` - Path to rules file or directory
+- `--project, -p` - Path to project to scan
+
+**Optional Flags**:
+- `--verbose, -v` - Show progress and statistics
+- `--debug` - Show debug diagnostics with timestamps
+- `--fail-on` - Fail with exit code 1 if findings match severities
+
+**Examples**:
+```bash
+# Basic scan
+pathfinder scan --rules rules/security.py --project /app
+
+# Verbose scan
+pathfinder scan -r rules/ -p . -v
+
+# CI-style failure
+pathfinder scan -r rules/ -p . --fail-on=critical,high
+```
+
+---
+
+### ci
+
+CI/CD optimized scanning with machine-readable output.
+
+**Usage**:
+```bash
+pathfinder ci --rules --project --output [flags]
+```
+
+**Required Flags**:
+- `--rules, -r` - Path to rules file or directory
+- `--project, -p` - Path to project to scan
+- `--output, -o` - Output format: json, csv, sarif
+
+**Optional Flags**:
+- `--verbose, -v` - Show progress and statistics (to stderr)
+- `--debug` - Show debug diagnostics
+- `--fail-on` - Fail with exit code 1 if findings match severities
+
+**Examples**:
+```bash
+# JSON output
+pathfinder ci -r rules/ -p . -o json > results.json
+
+# SARIF for GitHub
+pathfinder ci -r rules/ -p . -o sarif > results.sarif
+
+# CSV with failure control
+pathfinder ci -r rules/ -p . -o csv --fail-on=critical > results.csv
+```
+
+---
+
+### diagnose
+
+Diagnostic mode for debugging rule behavior.
+
+**Usage**:
+```bash
+pathfinder diagnose --rules --project [flags]
+```
+
+---
+
+### version
+
+Display version information.
+
+**Usage**:
+```bash
+pathfinder version
+```
+
+---
+
+## Output Format Reference
+
+### JSON Schema
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `tool.name` | string | "Code Pathfinder" |
+| `tool.version` | string | Tool version |
+| `scan.target` | string | Project path |
+| `scan.rules_executed` | int | Number of rules |
+| `results[]` | array | Detection results |
+| `results[].rule_id` | string | Rule identifier |
+| `results[].severity` | string | critical/high/medium/low |
+| `results[].location.file` | string | File path |
+| `results[].location.line` | int | Line number |
+| `results[].detection.type` | string | pattern/taint-local/taint-global |
+| `summary.total` | int | Total findings |
+| `summary.by_severity` | object | Count by severity |
+
+### CSV Columns
+
+1. severity
+2. confidence
+3. rule_id
+4. rule_name
+5. cwe
+6. owasp
+7. file
+8. line
+9. column
+10. function
+11. message
+12. detection_type
+13. detection_scope
+14. source_line
+15. sink_line
+16. tainted_var
+17. sink_call
+
+### SARIF 2.1.0
+
+Compliant with [SARIF 2.1.0 specification](https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html).
+
+Features:
+- Rule metadata with help text
+- Code flows for taint analysis
+- Related locations for sources
+- Security severity scores
+- URI base ID for portable paths
+
+---
+
+## Exit Code Reference
+
+| Code | Constant | Description |
+|------|----------|-------------|
+| 0 | ExitCodeSuccess | No findings, or findings without --fail-on match |
+| 1 | ExitCodeFindings | Findings match at least one --fail-on severity |
+| 2 | ExitCodeError | Configuration or execution error |
+
+### --fail-on Syntax
+
+```bash
+--fail-on=[,...]
+```
+
+Valid severities: `critical`, `high`, `medium`, `low`, `info`
+
+Examples:
+- `--fail-on=critical` - Fail only on critical
+- `--fail-on=critical,high` - Fail on critical or high
+- `--fail-on=critical,high,medium,low` - Fail on any finding
diff --git a/sourcecode-parser/cmd/analyze.go b/sourcecode-parser/cmd/analyze.go
deleted file mode 100644
index 836261f9..00000000
--- a/sourcecode-parser/cmd/analyze.go
+++ /dev/null
@@ -1,114 +0,0 @@
-package cmd
-
-import (
-	"fmt"
-	"strings"
-
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/graph"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/graph/callgraph"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/output"
-	"github.com/spf13/cobra"
-)
-
-var analyzeCmd = &cobra.Command{
-	Use:   "analyze",
-	Short: "Analyze source code for security vulnerabilities using call graph",
-	Run: func(cmd *cobra.Command, _ []string) {
-		projectInput := cmd.Flag("project").Value.String()
-
-		if projectInput == "" {
-			fmt.Println("Error: --project flag is required")
-			return
-		}
-
-		fmt.Println("Building code graph...")
-		codeGraph := graph.Initialize(projectInput)
-
-		fmt.Println("Building call graph and analyzing security patterns...")
-		cg, registry, patternRegistry, err := callgraph.InitializeCallGraph(codeGraph, projectInput, output.NewLogger(output.VerbosityDefault))
-		if err != nil {
-			fmt.Println("Error building call graph:", err)
-			return
-		}
-
-		fmt.Printf("Call graph built successfully: %d functions indexed\n", len(cg.Functions))
-		fmt.Printf("Module registry: %d modules\n", len(registry.Modules))
-
-		// Run security analysis
-		matches := callgraph.AnalyzePatterns(cg, patternRegistry)
-
-		if len(matches) == 0 {
-			fmt.Println("\n✓ No security issues found!")
-			return
-		}
-
-		fmt.Printf("\n⚠ Found %d potential security issues:\n\n", len(matches))
-		for i, match := range matches {
-			fmt.Printf("%d. [%s] %s\n", i+1, match.Severity, match.PatternName)
-			fmt.Printf(" Description: %s\n", match.Description)
-			fmt.Printf(" CWE: %s, OWASP: %s\n\n", match.CWE, match.OWASP)
-
-			// Display source information
-			if match.SourceFQN != "" {
-				if match.SourceCall != "" {
-					fmt.Printf(" Source: %s() calls %s()\n", match.SourceFQN, match.SourceCall)
-				} else {
-					fmt.Printf(" Source: %s\n", match.SourceFQN)
-				}
-				if match.SourceFile != "" {
-					fmt.Printf(" at %s:%d\n", match.SourceFile, match.SourceLine)
-					if match.SourceCode != "" {
-						printCodeSnippet(match.SourceCode, int(match.SourceLine))
-					}
-				}
-				fmt.Println()
-			}
-
-			// Display sink information
-			if match.SinkFQN != "" {
-				if match.SinkCall != "" {
-					fmt.Printf(" Sink: %s() calls %s()\n", match.SinkFQN, match.SinkCall)
-				} else {
-					fmt.Printf(" Sink: %s\n", match.SinkFQN)
-				}
-				if match.SinkFile != "" {
-					fmt.Printf(" at %s:%d\n", match.SinkFile, match.SinkLine)
-					if match.SinkCode != "" {
-						printCodeSnippet(match.SinkCode, int(match.SinkLine))
-					}
-				}
-				fmt.Println()
-			}
-
-			// Display data flow path
-			if len(match.DataFlowPath) > 0 {
-				fmt.Printf(" Data flow path (%d steps):\n", len(match.DataFlowPath))
-				for j, step := range match.DataFlowPath {
-					if j == 0 {
-						fmt.Printf(" %s (source)\n", step)
-					} else if j == len(match.DataFlowPath)-1 {
-						fmt.Printf(" └─> %s (sink)\n", step)
-					} else {
-						fmt.Printf(" └─> %s\n", step)
-					}
-				}
-				fmt.Println()
-			}
-		}
-	},
-}
-
-func printCodeSnippet(code string, startLine int) {
-	lines := strings.Split(code, "\n")
-	for i, line := range lines {
-		if line != "" {
-			fmt.Printf(" %4d | %s\n", startLine+i, line)
-		}
-	}
-}
-
-func init() {
-	rootCmd.AddCommand(analyzeCmd)
-	analyzeCmd.Flags().StringP("project", "p", "", "Project directory to analyze (required)")
-	analyzeCmd.MarkFlagRequired("project") //nolint:all
-}
diff --git a/sourcecode-parser/cmd/query.go b/sourcecode-parser/cmd/query.go
deleted file mode 100644
index 16c3c9e6..00000000
--- a/sourcecode-parser/cmd/query.go
+++ /dev/null
@@ -1,104 +0,0 @@
-package cmd
-
-import (
-	"fmt"
-	"log"
-
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/dsl"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/graph"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/graph/callgraph/builder"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/graph/callgraph/core"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/graph/callgraph/registry"
-	"github.com/shivasurya/code-pathfinder/sourcecode-parser/output"
-	"github.com/spf13/cobra"
-)
-
-var queryCmd = &cobra.Command{
-	Use:   "query",
-	Short: "Query code using Python DSL rules",
-	Long: `Query codebase using Python DSL security rules.
-
-Similar to scan but designed for ad-hoc queries and exploration.
-
-Examples:
-  # Query with a single rule
-  pathfinder query --rules my_rule.py --project /path/to/project
-
-  # Query specific files
-  pathfinder query --rules rule.py --project /path/to/file.py`,
-	RunE: func(cmd *cobra.Command, args []string) error {
-		rulesPath, _ := cmd.Flags().GetString("rules")
-		projectPath, _ := cmd.Flags().GetString("project")
-
-		if rulesPath == "" {
-			return fmt.Errorf("--rules flag is required")
-		}
-
-		if projectPath == "" {
-			return fmt.Errorf("--project flag is required")
-		}
-
-		// Build code graph (AST)
-		log.Printf("Building code graph from %s...\n", projectPath)
-		codeGraph := graph.Initialize(projectPath)
-		if len(codeGraph.Nodes) == 0 {
-			return fmt.Errorf("no source files found in project")
-		}
-		log.Printf("Code graph built: %d nodes\n", len(codeGraph.Nodes))
-
-		// Build module registry
-		log.Printf("Building module registry...\n")
-		moduleRegistry, err := registry.BuildModuleRegistry(projectPath)
-		if err != nil {
-			log.Printf("Warning: failed to build module registry: %v\n", err)
-			moduleRegistry = core.NewModuleRegistry()
-		}
-
-		// Build callgraph
-		log.Printf("Building callgraph...\n")
-		cg, err := builder.BuildCallGraph(codeGraph, moduleRegistry, projectPath, output.NewLogger(output.VerbosityDefault))
-		if err != nil {
-			return fmt.Errorf("failed to build callgraph: %w", err)
-		}
-		log.Printf("Callgraph built: %d functions, %d call sites\n",
-			len(cg.Functions), countTotalCallSites(cg))
-
-		// Load Python DSL rules
-		log.Printf("Loading rules from %s...\n", rulesPath)
-		loader := dsl.NewRuleLoader(rulesPath)
-		rules, err := loader.LoadRules()
-		if err != nil {
-			return fmt.Errorf("failed to load rules: %w", err)
-		}
-		log.Printf("Loaded %d rules\n", len(rules))
-
-		// Execute rules against callgraph
-		log.Printf("\n=== Query Results ===\n")
-		totalDetections := 0
-		for _, rule := range rules {
-			detections, err := loader.ExecuteRule(&rule, cg)
-			if err != nil {
-				log.Printf("Error executing rule %s: %v\n", rule.Rule.ID, err)
-				continue
-			}
-
-			if len(detections) > 0 {
-				printDetections(rule, detections)
-				totalDetections += len(detections)
-			}
-		}
-
-		log.Printf("\n=== Query Complete ===\n")
-		log.Printf("Total matches: %d\n", totalDetections)
-
-		return nil
-	},
-}
-
-func init() {
-	rootCmd.AddCommand(queryCmd)
-	queryCmd.Flags().StringP("rules", "r", "", "Path to Python DSL rules file (required)")
-	queryCmd.Flags().StringP("project", "p", "", "Path to project directory to query (required)")
-	queryCmd.MarkFlagRequired("rules")
-	queryCmd.MarkFlagRequired("project")
-}
diff --git a/sourcecode-parser/cmd/query_test.go b/sourcecode-parser/cmd/query_test.go
deleted file mode 100644
index 3f243414..00000000
--- a/sourcecode-parser/cmd/query_test.go
+++ /dev/null
@@ -1,20 +0,0 @@
-package cmd
-
-import (
-	"testing"
-
-	"github.com/stretchr/testify/assert"
-)
-
-func TestQueryCommand(t *testing.T) {
-	cmd := queryCmd
-
-	assert.NotNil(t, cmd)
-	assert.Equal(t, "query", cmd.Use)
-	assert.Equal(t, "Query code using Python DSL rules", cmd.Short)
-
-	// Test execution returns error when required flags are missing
-	err := cmd.RunE(cmd, []string{})
-	assert.Error(t, err)
-	assert.Contains(t, err.Error(), "required")
-}
diff --git a/sourcecode-parser/main_test.go b/sourcecode-parser/main_test.go
index 73530597..d716ff0f 100644
--- a/sourcecode-parser/main_test.go
+++ b/sourcecode-parser/main_test.go
@@ -23,7 +23,7 @@ func TestExecute(t *testing.T) {
 		{
 			name:           "Successful execution",
 			mockExecuteErr: nil,
-			expectedOutput: "Code Pathfinder is designed for identifying vulnerabilities in source code.\n\nUsage:\n pathfinder [command]\n\nAvailable Commands:\n analyze Analyze source code for security vulnerabilities using call graph\n ci CI mode with SARIF, JSON, or CSV output for CI/CD integration\n completion Generate the autocompletion script for the specified shell\n diagnose Validate intra-procedural taint analysis against LLM ground truth\n help Help about any command\n query Query code using Python DSL rules\n resolution-report Generate a diagnostic report on call resolution statistics\n scan Scan code for security vulnerabilities using Python DSL rules\n version Print the version and commit information\n\nFlags:\n --disable-metrics Disable metrics collection\n -h, --help help for pathfinder\n --verbose Verbose output\n\nUse \"pathfinder [command] --help\" for more information about a command.\n",
+			expectedOutput: "Code Pathfinder is designed for identifying vulnerabilities in source code.\n\nUsage:\n pathfinder [command]\n\nAvailable Commands:\n ci CI mode with SARIF, JSON, or CSV output for CI/CD integration\n completion Generate the autocompletion script for the specified shell\n diagnose Validate intra-procedural taint analysis against LLM ground truth\n help Help about any command\n resolution-report Generate a diagnostic report on call resolution statistics\n scan Scan code for security vulnerabilities using Python DSL rules\n version Print the version and commit information\n\nFlags:\n --disable-metrics Disable metrics collection\n -h, --help help for pathfinder\n --verbose Verbose output\n\nUse \"pathfinder [command] --help\" for more information about a command.\n",
 			expectedExit:   0,
 		},
 	}
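
The exit-code rules this change documents in `cli.md` (0 for no match, 1 when a finding matches `--fail-on`, 2 on error) can be sketched roughly as below. This is a minimal illustration, not the project's implementation: the constant names come from the exit code table, while `parseFailOn`, `exitCodeFor`, and the case-insensitive matching are hypothetical choices made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Exit codes, named as in the cli.md exit code table.
const (
	ExitCodeSuccess  = 0
	ExitCodeFindings = 1
	ExitCodeError    = 2
)

// validSeverities mirrors the list documented under "--fail-on Syntax".
var validSeverities = map[string]bool{
	"critical": true, "high": true, "medium": true, "low": true, "info": true,
}

// parseFailOn is a hypothetical helper: it splits a comma-separated
// --fail-on value and rejects severities outside the documented set.
func parseFailOn(value string) (map[string]bool, error) {
	selected := make(map[string]bool)
	if strings.TrimSpace(value) == "" {
		return selected, nil // flag absent: findings never fail the run
	}
	for _, raw := range strings.Split(value, ",") {
		sev := strings.ToLower(strings.TrimSpace(raw)) // case folding is an assumption
		if !validSeverities[sev] {
			return nil, fmt.Errorf("invalid severity %q", raw)
		}
		selected[sev] = true
	}
	return selected, nil
}

// exitCodeFor (also hypothetical) returns ExitCodeFindings as soon as one
// finding's severity is in the selected set, and ExitCodeSuccess otherwise.
func exitCodeFor(findingSeverities []string, failOn map[string]bool) int {
	for _, sev := range findingSeverities {
		if failOn[strings.ToLower(sev)] {
			return ExitCodeFindings
		}
	}
	return ExitCodeSuccess
}

func main() {
	failOn, err := parseFailOn("critical,high")
	if err != nil {
		fmt.Println("exit code:", ExitCodeError)
		return
	}
	// One medium and one critical finding: the critical one matches.
	fmt.Println("exit code:", exitCodeFor([]string{"medium", "critical"}, failOn)) // prints "exit code: 1"
}
```

Under these assumptions, an empty `--fail-on` value yields an empty set, so the result is always 0 even with findings, which matches the "Default: always exit 0" example in the README hunk.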