Skip to content

Conversation

@shivasurya
Copy link
Owner

Summary

Implements Phase 1 of type inference to dramatically improve Python callsite resolution rates. This PR adds the ability to track variable types through literal assignments and resolve method calls on Python builtin types.

Changes

  • Type Inference Engine: Core data structures for tracking variable types and function scopes (type_inference.go)
  • Builtin Registry: Comprehensive registry of Python builtin types (str, list, dict, set, tuple, int, float, bool, bytes) with all their methods (builtin_registry.go)
  • Variable Extraction: AST traversal to extract variable assignments and infer types from literals (variable_extraction.go)
  • Integration: Modified call resolution in builder.go to use type inference for variable.method() patterns
  • Tests: 45 new test cases with 100% coverage on core components

Performance Impact

Test Project Results

  • Resolution rate: 4.3% → 91.3% (+87 percentage points)
  • Resolved calls: 1 → 21 (+20 calls)
  • attribute_chain failures: 18 → 1 (-94.4%)

Real-World Project (label-studio)

  • Resolution rate: 62.9% → 64.1% (+1.2 percentage points)
  • Resolved calls: 11,893 → 12,114 (+221 calls)
  • attribute_chain failures: 3,535 → 3,357 (-178 calls)
  • orm_pattern failures: 1,310 → 1,276 (-34 calls)

Technical Details

  • Literal type inference automatically detects types from strings, numbers, lists, dicts, sets, tuples
  • Function-scoped variable binding tracking
  • Confidence scores (0.0-1.0) for type inference quality
  • Backward compatible with existing resolution logic (legacy path preserved)

Testing

All existing tests pass. New test coverage:

  • type_inference_test.go: 13 test cases (100% coverage)
  • builtin_registry_test.go: 16 test cases (100% coverage)
  • variable_extraction_test.go: 11 test cases (83-93% coverage)
  • integration_type_inference_test.go: 5 integration tests

Future Work (Phase 2)

  • Return type inference from function definitions
  • Parameter type annotations support
  • Method chaining resolution
  • Class attribute type tracking
  • Cross-function type propagation

shivasurya and others added 4 commits October 30, 2025 07:56
Add foundational data structures for type inference:
- TypeInfo: tracks type FQN, confidence, and source
- VariableBinding: tracks variable types within scopes
- FunctionScope: maintains type environment per function
- TypeInferenceEngine: coordinates type inference across codebase

Tests: 100% coverage with 13 test cases

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Add comprehensive builtin type registry for Python types:
- BuiltinRegistry with all Python builtin types (str, list, dict, set, tuple, int, float, bool, bytes)
- Method definitions with return types for each builtin
- InferLiteralType for automatic type inference from literals
- Support for numeric literals (int, float, scientific notation, hex, octal, binary)
- Support for collection literals (list, dict, set, tuple)
- Support for string and bytes literals

Integrate builtin registry with TypeInferenceEngine

Tests: 100% coverage with 16 test cases covering all types and methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Add AST traversal to extract variable assignments for type inference:
- ExtractVariableAssignments traverses Python AST to find assignments
- processAssignment extracts type from RHS expression
- inferTypeFromExpression handles literal type inference
- Supports string, numeric, collection, bool, and None literals
- Tracks variable bindings per function scope
- Records source locations for each assignment

Integration with type inference engine:
- Populates function scopes with variable bindings
- Handles nested function scopes
- Supports variable reassignment (last wins)

Tests: 83-93% coverage with 11 test cases covering all literal types

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Integrate type inference engine with call graph resolution:
- Initialize TypeInferenceEngine in BuildCallGraph
- Extract variable assignments during graph building
- Use type inference to resolve variable.method() calls
- Resolve builtin method calls (str.upper, list.append, dict.keys, etc.)
- Fallback to legacy resolution for backward compatibility

Type-aware resolution logic:
- Check variable bindings in function scope
- Resolve builtin methods via BuiltinRegistry
- Support user-defined type methods
- Maintains 100% backward compatibility with existing tests

Integration tests: 5 test cases covering string, list, dict methods and edge cases
- TestTypeInference_StringMethods: data.upper() resolution
- TestTypeInference_ListMethods: numbers.append(), numbers.count()
- TestTypeInference_DictMethods: config.keys(), config.values()
- TestTypeInference_MultipleVariables: mixed type resolution
- TestTypeInference_WithoutTypeInfo: graceful fallback

Backward compatibility:
- resolveCallTargetLegacy for tests without type engine
- All existing tests pass unchanged

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@safedep
Copy link

safedep bot commented Oct 31, 2025

SafeDep Report Summary

Green Malicious Packages Badge Green Vulnerable Packages Badge Green Risky License Badge

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App

@shivasurya shivasurya self-assigned this Oct 31, 2025
@shivasurya shivasurya added enhancement New feature or request go Pull requests that update go code labels Oct 31, 2025
@codecov
Copy link

codecov bot commented Oct 31, 2025

Codecov Report

❌ Patch coverage is 91.72794% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.03%. Comparing base (2a0be94) to head (344e7f7).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...code-parser/graph/callgraph/variable_extraction.go 84.31% 14 Missing and 10 partials ⚠️
sourcecode-parser/graph/callgraph/builder.go 66.07% 14 Missing and 5 partials ⚠️
...rcecode-parser/graph/callgraph/builtin_registry.go 99.37% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #334      +/-   ##
==========================================
+ Coverage   73.87%   76.03%   +2.15%     
==========================================
  Files          35       38       +3     
  Lines        3560     4102     +542     
==========================================
+ Hits         2630     3119     +489     
- Misses        841      877      +36     
- Partials       89      106      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

- Convert if-else chain to switch statement in isNumericLiteral
- Add period to TODO comment
- Add nolint directives for disabled test function

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@shivasurya shivasurya merged commit 8b53631 into main Oct 31, 2025
5 checks passed
@shivasurya shivasurya deleted the shiva/callgraph-type-inference branch October 31, 2025 01:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request go Pull requests that update go code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants