Inconsistent inference results unless jedi.Script is reinitialized per token #2072

@jatinarora96

Hi,

I'm working on extracting references from a Python codebase using Jedi. For each token in a file, I call jedi.Script.infer(line, column) to get inference results.

Here's a simplified version of my setup:

import logging

import jedi

logger = logging.getLogger(__name__)


class JediInferenceEngine:
    """Performs Jedi inference on a given Python file."""

    def __init__(self, file_path, project_root):
        self.file_path = file_path
        self.project_root = project_root
        try:
            raw_content = file_path.read_text(encoding="utf-8")
            self.source = raw_content.encode('ascii', 'ignore').decode('ascii')
            self.source_lines = self.source.splitlines()
            self.script = jedi.Script(
                code=self.source,
                path=str(file_path),
                project=jedi.Project(path=str(project_root))
            )
        except Exception as e:
            logger.error(f"Failed to initialize JediInferenceEngine for {file_path}: {e}")
            raise

    def safe_jedi_infer(self, line, col, token_str=None):
        try:
            if line < 1 or line > len(self.source_lines):
                return []
            line_content = self.source_lines[line - 1]
            col = max(0, min(col, len(line_content)))
            return self.script.infer(line=line, column=col)
        except Exception as e:
            logger.warning(f"Jedi inference failed at {self.file_path}:{line}:{col} - {e}")
            return []
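
One caveat about the setup above: the encode('ascii', 'ignore') step drops non-ASCII characters, so lines can get shorter. If token positions come from the unstripped file, the columns passed to infer may no longer point at the same token in self.source. The following is a minimal sketch of that effect; the sample line and the names `column_shift` and `compute` are hypothetical and only serve as illustration.

```python
def column_shift(line_text: str, token: str) -> int:
    """Return how far `token` moves left once non-ASCII characters are dropped."""
    # Same transformation as in JediInferenceEngine.__init__.
    stripped = line_text.encode("ascii", "ignore").decode("ascii")
    # Column of the token in the original line versus the stripped line.
    return line_text.index(token) - stripped.index(token)


# Hypothetical sample line containing one non-ASCII character before the token.
sample = 'name = "café"; value = compute()'
shift = column_shift(sample, "compute")
print(f"'compute' moves {shift} column(s) to the left after stripping")
```

Whether this matters here depends on where the token positions come from; if they are computed from self.source itself, the columns stay consistent.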

In my analysis loop, I initially reused the same JediInferenceEngine instance for all tokens in a file. However, I noticed that some references were missing. When I reinitialize the JediInferenceEngine for each token, the inferences are accurate and complete.

for line, col, token_str in tokens:  # assuming each token is a (line, col, text) tuple
    self.jedi_engine = JediInferenceEngine(self.file_path, self.project_root)
    results = self.jedi_engine.safe_jedi_infer(line, col, token_str)
    ...

This behavior seems unexpected. Is there any internal caching or state in jedi.Script that could cause inference results to vary unless reinitialized? Or am I missing something in how Jedi is intended to be used?
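
To help narrow this down, here is a rough sketch of how the two modes could be compared position by position. It is not tied to Jedi internals: `make_script` is a hypothetical zero-argument factory that returns an object with an `infer(line=..., column=...)` method, and `summarize` assumes the returned names expose `full_name`, `type`, `line` and `column`, as Jedi's name objects do. Lines are 1-based and columns 0-based, matching `Script.infer`.

```python
from typing import Callable, Iterable, List, Tuple

Position = Tuple[int, int]  # (1-based line, 0-based column)


def summarize(definitions) -> List[tuple]:
    """Reduce inference results to sorted, comparable tuples."""
    return sorted(
        (str(d.full_name), str(d.type), d.line or 0, d.column or 0)
        for d in definitions
    )


def find_divergent_positions(
    make_script: Callable[[], object],
    positions: Iterable[Position],
) -> List[tuple]:
    """Return positions where a reused script and a fresh script disagree."""
    shared = make_script()  # one instance reused for every position
    divergent = []
    for line, col in positions:
        reused = summarize(shared.infer(line=line, column=col))
        # A brand-new instance for this position only.
        fresh = summarize(make_script().infer(line=line, column=col))
        if reused != fresh:
            divergent.append(((line, col), reused, fresh))
    return divergent
```

With Jedi, `make_script` could be something like `lambda: jedi.Script(code=source, path=str(file_path), project=jedi.Project(path=str(project_root)))`. Any position this returns would be a candidate for a minimal reproduction.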

Any insights would be appreciated!
