feat(callgraph): Python Type Inference for Improved Call Resolution #334

shivasurya · 2025-10-31T01:04:47Z

Summary

Implements Phase 1 of type inference to improve Python callsite resolution rates (4.3% → 91.3% on the test project, 62.9% → 64.1% on label-studio). This PR adds the ability to track variable types through literal assignments and to resolve method calls on Python builtin types.

Changes

Type Inference Engine: Core data structures for tracking variable types and function scopes (type_inference.go)
Builtin Registry: Comprehensive registry of Python builtin types (str, list, dict, set, tuple, int, float, bool, bytes) with all their methods (builtin_registry.go)
Variable Extraction: AST traversal to extract variable assignments and infer types from literals (variable_extraction.go)
Integration: Modified call resolution in builder.go to use type inference for variable.method() patterns
Tests: 45 new test cases, with 100% coverage reported for the type inference engine and builtin registry

Performance Impact

Test Project Results

Resolution rate: 4.3% → 91.3% (+87 percentage points)
Resolved calls: 1 → 21 (+20 calls)
attribute_chain failures: 18 → 1 (-94.4%)
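
The test-project percentages are consistent with 23 call sites in total (1/23 ≈ 4.3%, 21/23 ≈ 91.3%). That total is not stated in this PR; it is inferred from the reported figures. A minimal sketch of the arithmetic, with a hypothetical `resolutionRate` helper:

```go
package main

import "fmt"

// resolutionRate returns resolved/total as a percentage.
func resolutionRate(resolved, total int) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(resolved) / float64(total)
}

func main() {
	// Assumption: 23 total call sites, inferred from the reported rates.
	const total = 23
	before := resolutionRate(1, total)
	after := resolutionRate(21, total)
	fmt.Printf("%.1f%% -> %.1f%% (%+.1f points)\n", before, after, after-before)
}
```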

Real-World Project (label-studio)

Resolution rate: 62.9% → 64.1% (+1.2 percentage points)
Resolved calls: 11,893 → 12,114 (+221 calls)
attribute_chain failures: 3,535 → 3,357 (-178 calls)
orm_pattern failures: 1,310 → 1,276 (-34 calls)

Technical Details

Literal type inference automatically detects types from strings, numbers, lists, dicts, sets, tuples
Function-scoped variable binding tracking
Confidence scores (0.0-1.0) for type inference quality
Backward compatible with existing resolution logic (legacy path preserved)

Testing

All existing tests pass. New test coverage:

type_inference_test.go: 13 test cases (100% coverage)
builtin_registry_test.go: 16 test cases (100% coverage)
variable_extraction_test.go: 11 test cases (83-93% coverage)
integration_type_inference_test.go: 5 integration tests

Future Work (Phase 2)

Return type inference from function definitions
Parameter type annotations support
Method chaining resolution
Class attribute type tracking
Cross-function type propagation

Add foundational data structures for type inference:
- TypeInfo: tracks type FQN, confidence, and source
- VariableBinding: tracks variable types within scopes
- FunctionScope: maintains type environment per function
- TypeInferenceEngine: coordinates type inference across codebase

Tests: 100% coverage with 13 test cases

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
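
For readers without the diff open, here is a rough sketch of how the four structures named in this commit could fit together. Only the type names come from the commit message; the field names, the `Bind` and `Lookup` methods, and the `builtins.str` FQN format are assumptions for illustration, not the code in type_inference.go.

```go
package main

import "fmt"

// TypeInfo records an inferred type. Fields are assumed from the commit
// message ("type FQN, confidence, and source").
type TypeInfo struct {
	TypeFQN    string  // e.g. "builtins.str" (assumed format)
	Confidence float64 // 0.0 (guess) to 1.0 (certain)
	Source     string  // how the type was inferred, e.g. "literal"
}

// VariableBinding ties a variable name to its inferred type.
type VariableBinding struct {
	VarName string
	Type    *TypeInfo
}

// FunctionScope is the type environment of a single function.
type FunctionScope struct {
	FunctionFQN string
	Variables   map[string]*VariableBinding
}

// TypeInferenceEngine holds one scope per function FQN.
type TypeInferenceEngine struct {
	Scopes map[string]*FunctionScope
}

func NewTypeInferenceEngine() *TypeInferenceEngine {
	return &TypeInferenceEngine{Scopes: make(map[string]*FunctionScope)}
}

// Bind records (or overwrites) a variable's type in the given function.
func (e *TypeInferenceEngine) Bind(fn, name string, t *TypeInfo) {
	scope, ok := e.Scopes[fn]
	if !ok {
		scope = &FunctionScope{FunctionFQN: fn, Variables: make(map[string]*VariableBinding)}
		e.Scopes[fn] = scope
	}
	scope.Variables[name] = &VariableBinding{VarName: name, Type: t}
}

// Lookup returns the binding for name in fn, or nil if unknown.
func (e *TypeInferenceEngine) Lookup(fn, name string) *VariableBinding {
	if scope, ok := e.Scopes[fn]; ok {
		return scope.Variables[name]
	}
	return nil
}

func main() {
	engine := NewTypeInferenceEngine()
	engine.Bind("app.main", "data", &TypeInfo{TypeFQN: "builtins.str", Confidence: 1.0, Source: "literal"})
	if b := engine.Lookup("app.main", "data"); b != nil {
		fmt.Println(b.VarName, b.Type.TypeFQN, b.Type.Confidence)
	}
}
```

Keying scopes by function FQN keeps a variable named `data` in one function from leaking its type into another function that reuses the name.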

Add comprehensive builtin type registry for Python types:
- BuiltinRegistry with all Python builtin types (str, list, dict, set, tuple, int, float, bool, bytes)
- Method definitions with return types for each builtin
- InferLiteralType for automatic type inference from literals
- Support for numeric literals (int, float, scientific notation, hex, octal, binary)
- Support for collection literals (list, dict, set, tuple)
- Support for string and bytes literals

Integrate builtin registry with TypeInferenceEngine

Tests: 100% coverage with 16 test cases covering all types and methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
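
A hedged sketch of the two ideas in this commit: a method table per builtin type, and literal classification from source text. The lowercase names (`builtinMethods`, `inferLiteralType`, `inferNumeric`, `resolveBuiltinMethod`) are hypothetical stand-ins, not the API in builtin_registry.go, and the table holds only a few entries. The dict/set and tuple checks are deliberately crude; an implementation working on the AST would use node kinds instead of string prefixes.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// builtinMethods maps a builtin type to method name -> return type.
// Only a handful of entries are shown for illustration.
var builtinMethods = map[string]map[string]string{
	"builtins.str":  {"upper": "builtins.str", "split": "builtins.list"},
	"builtins.list": {"append": "builtins.NoneType", "count": "builtins.int"},
	"builtins.dict": {"keys": "builtins.dict_keys", "values": "builtins.dict_values"},
}

// resolveBuiltinMethod reports the return type of a known builtin method.
func resolveBuiltinMethod(typeFQN, method string) (string, bool) {
	ret, ok := builtinMethods[typeFQN][method]
	return ret, ok
}

// inferLiteralType guesses a builtin type from a literal's source text.
func inferLiteralType(lit string) (string, bool) {
	lit = strings.TrimSpace(lit)
	if lit == "" {
		return "", false
	}
	switch {
	case lit == "True" || lit == "False":
		return "builtins.bool", true
	case lit == "None":
		return "builtins.NoneType", true
	case strings.HasPrefix(lit, `b"`) || strings.HasPrefix(lit, "b'"):
		return "builtins.bytes", true
	case lit[0] == '"' || lit[0] == '\'':
		return "builtins.str", true
	case lit[0] == '[':
		return "builtins.list", true
	case lit == "()" || (lit[0] == '(' && strings.Contains(lit, ",")):
		return "builtins.tuple", true
	case lit == "{}" || (lit[0] == '{' && strings.Contains(lit, ":")):
		return "builtins.dict", true // crude: a colon inside a set element would fool this
	case lit[0] == '{':
		return "builtins.set", true
	case lit[0] >= '0' && lit[0] <= '9' || lit[0] == '.':
		return inferNumeric(lit)
	}
	return "", false
}

// inferNumeric separates int from float literals.
func inferNumeric(lit string) (string, bool) {
	// Base 0 lets ParseInt accept 0x, 0o, 0b prefixes and underscores.
	// A range error still means the text was a syntactically valid integer.
	if _, err := strconv.ParseInt(lit, 0, 64); err == nil || errors.Is(err, strconv.ErrRange) {
		return "builtins.int", true
	}
	if _, err := strconv.ParseFloat(lit, 64); err == nil || errors.Is(err, strconv.ErrRange) {
		return "builtins.float", true
	}
	return "", false
}

func main() {
	for _, lit := range []string{`"hello"`, "0x1F", "1e3", "[1, 2]", `{"a": 1}`, "{1, 2}"} {
		t, _ := inferLiteralType(lit)
		fmt.Printf("%-10s -> %s\n", lit, t)
	}
	ret, ok := resolveBuiltinMethod("builtins.str", "upper")
	fmt.Println(ret, ok)
}
```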

Add AST traversal to extract variable assignments for type inference:
- ExtractVariableAssignments traverses Python AST to find assignments
- processAssignment extracts type from RHS expression
- inferTypeFromExpression handles literal type inference
- Supports string, numeric, collection, bool, and None literals
- Tracks variable bindings per function scope
- Records source locations for each assignment

Integration with type inference engine:
- Populates function scopes with variable bindings
- Handles nested function scopes
- Supports variable reassignment (last wins)

Tests: 83-93% coverage with 11 test cases covering all literal types

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
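
The scoping rules listed in this commit (per-function bindings, nested scopes, last assignment wins) can be illustrated with a small tree walk. This is a sketch under assumptions: the `Node` type is a hypothetical stand-in for the real Python AST, `RHSType` is assumed to be already inferred, and the dotted scope naming for nested functions is a guess rather than the scheme used in variable_extraction.go.

```go
package main

import "fmt"

// Node is a hypothetical, simplified AST node.
type Node struct {
	Kind     string // "module", "function", or "assign"
	Name     string // function name, or assignment target
	RHSType  string // for "assign": inferred type of the right-hand side, "" if unknown
	Line     int    // source line of the node
	Children []*Node
}

// Binding is the recorded type and location of one variable.
type Binding struct {
	Type string
	Line int
}

// extractAssignments walks the tree and records bindings per scope.
// Entering a function extends the scope name; a later assignment to the
// same variable overwrites the earlier one (last wins).
func extractAssignments(n *Node, scope string, out map[string]map[string]Binding) {
	switch n.Kind {
	case "function":
		scope = scope + "." + n.Name
	case "assign":
		if n.RHSType != "" {
			if out[scope] == nil {
				out[scope] = make(map[string]Binding)
			}
			out[scope][n.Name] = Binding{Type: n.RHSType, Line: n.Line}
		}
	}
	for _, c := range n.Children {
		extractAssignments(c, scope, out)
	}
}

// sampleTree models:
//   def process():
//       x = "a"
//       x = [1]
//       def helper():
//           y = 1
func sampleTree() *Node {
	helper := &Node{Kind: "function", Name: "helper", Line: 4, Children: []*Node{
		{Kind: "assign", Name: "y", RHSType: "builtins.int", Line: 5},
	}}
	process := &Node{Kind: "function", Name: "process", Line: 1, Children: []*Node{
		{Kind: "assign", Name: "x", RHSType: "builtins.str", Line: 2},
		{Kind: "assign", Name: "x", RHSType: "builtins.list", Line: 3},
		helper,
	}}
	return &Node{Kind: "module", Children: []*Node{process}}
}

func main() {
	out := make(map[string]map[string]Binding)
	extractAssignments(sampleTree(), "app", out)
	fmt.Println(out["app.process"]["x"])        // the second assignment wins
	fmt.Println(out["app.process.helper"]["y"]) // nested scope is kept separate
}
```

Last-wins is flow-insensitive: a call to `x.upper()` between the two assignments would be checked against the list type, which is one reason confidence scores are useful alongside the inferred type.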

Integrate type inference engine with call graph resolution:
- Initialize TypeInferenceEngine in BuildCallGraph
- Extract variable assignments during graph building
- Use type inference to resolve variable.method() calls
- Resolve builtin method calls (str.upper, list.append, dict.keys, etc.)
- Fallback to legacy resolution for backward compatibility

Type-aware resolution logic:
- Check variable bindings in function scope
- Resolve builtin methods via BuiltinRegistry
- Support user-defined type methods
- Maintains 100% backward compatibility with existing tests

Integration tests: 5 test cases covering string, list, dict methods and edge cases
- TestTypeInference_StringMethods: data.upper() resolution
- TestTypeInference_ListMethods: numbers.append(), numbers.count()
- TestTypeInference_DictMethods: config.keys(), config.values()
- TestTypeInference_MultipleVariables: mixed type resolution
- TestTypeInference_WithoutTypeInfo: graceful fallback

Backward compatibility:
- resolveCallTargetLegacy for tests without type engine
- All existing tests pass unchanged

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
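
The resolution order described in this commit (type-aware lookup first, legacy path as fallback) might look roughly like the sketch below. `resolveCallTargetLegacy` is named in the commit message, but its signature here is assumed, and its body is a stub. `Scopes`, `legacyKnown`, and the `builtins.str.upper` target format are hypothetical; this is not the code in builder.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Scopes maps function FQN -> variable name -> inferred type FQN.
type Scopes map[string]map[string]string

// builtinMethods lists known methods per builtin type (abbreviated).
var builtinMethods = map[string]map[string]bool{
	"builtins.str":  {"upper": true, "lower": true},
	"builtins.list": {"append": true, "count": true},
	"builtins.dict": {"keys": true, "values": true},
}

// legacyKnown stands in for whatever the pre-existing resolver can resolve.
var legacyKnown = map[string]string{"os.getcwd": "os.getcwd"}

// resolveCallTargetLegacy is a stub for the legacy resolution path.
func resolveCallTargetLegacy(target string) (string, bool) {
	fqn, ok := legacyKnown[target]
	return fqn, ok
}

// resolveCallTarget tries type-aware resolution of variable.method()
// and falls back to the legacy path when no type information applies.
func resolveCallTarget(target, callerFQN string, scopes Scopes) (string, bool) {
	if recv, method, found := strings.Cut(target, "."); found {
		// Indexing a nil or missing map is safe in Go and yields the zero value.
		if typeFQN, known := scopes[callerFQN][recv]; known {
			if builtinMethods[typeFQN][method] {
				return typeFQN + "." + method, true
			}
		}
	}
	return resolveCallTargetLegacy(target)
}

func main() {
	scopes := Scopes{"app.main": {"data": "builtins.str", "numbers": "builtins.list"}}
	for _, target := range []string{"data.upper", "numbers.append", "os.getcwd", "unknown.call"} {
		fqn, ok := resolveCallTarget(target, "app.main", scopes)
		fmt.Printf("%-15s -> %q resolved=%v\n", target, fqn, ok)
	}
}
```

Passing a nil `Scopes` makes every call take the legacy path, which mirrors the graceful-fallback case covered by TestTypeInference_WithoutTypeInfo.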

safedep · 2025-10-31T01:04:50Z

SafeDep Report Summary

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App

codecov · 2025-10-31T01:05:53Z

Codecov Report

❌ Patch coverage is 91.72794% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.03%. Comparing base (2a0be94) to head (344e7f7).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
...code-parser/graph/callgraph/variable_extraction.go	84.31%	14 Missing and 10 partials ⚠️
sourcecode-parser/graph/callgraph/builder.go	66.07%	14 Missing and 5 partials ⚠️
...rcecode-parser/graph/callgraph/builtin_registry.go	99.37%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #334      +/-   ##
==========================================
+ Coverage   73.87%   76.03%   +2.15%     
==========================================
  Files          35       38       +3     
  Lines        3560     4102     +542     
==========================================
+ Hits         2630     3119     +489     
- Misses        841      877      +36     
- Partials       89      106      +17

- Convert if-else chain to switch statement in isNumericLiteral
- Add period to TODO comment
- Add nolint directives for disabled test function

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

shivasurya and others added 4 commits October 30, 2025 07:56

shivasurya self-assigned this Oct 31, 2025

shivasurya added the enhancement and go labels Oct 31, 2025

shivasurya merged commit 8b53631 into main Oct 31, 2025
5 checks passed

shivasurya deleted the shiva/callgraph-type-inference branch October 31, 2025 01:17

shivasurya mentioned this pull request Oct 31, 2025

feat(callgraph): Phase 2 - Complete Type Inference with Inter-Procedural Propagation #335

Merged

5 tasks
