Skip to content

Conversation

@tsanders-rh
Copy link
Contributor

@tsanders-rh tsanders-rh commented Nov 20, 2025

Overview

This PR implements a new import-based search algorithm for the nodejs.referenced capability, replacing the previous workspace/symbol approach which had significant limitations with TypeScript LSP servers.

Problem Statement

The nodejs.referenced capability was not working reliably for the following reasons:

  1. LSP Limitation: TypeScript LSP's workspace/symbol only returns top-level declarations in each file
  2. Export Structure: Exported components like PatternFly's Card, Button, etc. are not top-level declarations in their source packages - they're exported from re-export barrel files
  3. False Negatives: This caused imported symbols to not be detected, resulting in rules failing to match
  4. Inefficient Workaround: Previous solution used combo rules combining nodejs.referenced AND builtin.filecontent, which was:
    • Semantically incorrect (text pattern matching instead of semantic analysis)
    • Slower (required both LSP calls and regex scans)
    • Fragile (required maintaining regex patterns for each symbol)

Solution: Import-Based Search

This PR implements a three-step algorithm that leverages the fact that developers must import symbols before using them:

Algorithm

  1. FAST SCAN (no LSP required)

    • Scan all TypeScript/JavaScript files for import statements containing the pattern
    • Uses regex matching on normalized content
    • Handles multiline imports via preprocessing
    • Supports:
      • Named imports: import { Card } from "package"
      • Default imports: import Card from "package"
      • Multiple symbols: import { Card, CardBody } from "package"
    • Correctly identifies word boundaries to avoid false matches
  2. GET DEFINITIONS (LSP textDocument/definition)

    • For each import location found, use LSP to locate where the symbol is actually defined
    • Opens files in LSP as needed
    • Tracks processed definitions to avoid duplicates
  3. GET REFERENCES (LSP textDocument/references)

    • For each definition, use LSP to find all usage locations in the workspace
    • Filters results to workspace folders only
    • Excludes dependency folders
    • Deduplicates incidents

Why This Works Better

  • Accurate: Finds actual semantic usage, not just text patterns
  • Faster: Avoids expensive workspace/symbol calls; regex scan is much faster
  • Comprehensive: Correctly handles all import patterns including multiline
  • Maintainable: No need for per-symbol regex patterns

Implementation Details

New Code (nodejs/service_client.go)

Type Definitions:

// ImportLocation tracks where a symbol is imported
type ImportLocation struct {
    fileURI  string
    langID   string
    position protocol.Position
    line     string
}

// fileInfo tracks a source file
type fileInfo struct {
    path   string
    langID string
}

Key Methods:

  1. EvaluateReferenced() - Main entry point (~200 lines)

    • Comprehensive docstring explaining the algorithm
    • Scans for imports
    • Opens files in LSP
    • Gets definitions and references
    • Filters and deduplicates results
  2. findImportStatements() - Import detection (~95 lines)

    • Regex-based search for import statements
    • Handles both named and default imports
    • Correctly identifies word boundaries
    • Returns positions for LSP definition lookup
  3. normalizeMultilineImports() - Preprocessing (~90 lines)

    • Converts multiline imports to single lines
    • Preserves string literals
    • Handles brace depth tracking
    • Enables regex matching on formatted code
  4. Helper Functions:

    • isIdentifierChar() - Validates JavaScript identifier characters
    • min() - Utility function
  5. Method Overrides:

    • GetDependencies() - Returns empty map (nodejs doesn't use external dependency providers)
    • GetDependenciesDAG() - Returns empty map

Enhanced Logging (base_service_client.go)

  • Added structured logging for workspace/symbol calls
  • Logs query, result count, and error states
  • Removed debug fmt.Printf statements
  • Helps diagnose fallback behavior

Testing

Tested with PatternFly v5→v6 migration ruleset on tackle2-ui codebase:

  • 56 rules for React component migrations
  • Successfully detects imports and references for all PatternFly components
  • No longer requires builtin.filecontent workarounds
  • Verified with components: Card, Button, Chip, Badge, Label, etc.

Example Rule (Before vs After)

Before (Combo Rule):

when:
  and:
  - nodejs.referenced:
      pattern: Card
  - builtin.filecontent:
      pattern: import.*\{[^}]*\bCard\b[^}]*\}.*from ['"]@patternfly/react-core['"]
      filePattern: \.(j|t)sx?$

After (Clean):

when:
  nodejs.referenced:
    pattern: Card

Benefits

  1. Accurate Detection - Finds actual semantic usage regardless of export structure
  2. Faster Execution - Avoids expensive workspace/symbol calls
  3. Eliminates Workarounds - No need for combo rules with builtin.filecontent
  4. Better Semantics - True semantic analysis instead of text pattern matching
  5. Comprehensive Docs - Detailed docstrings for all major methods
  6. Maintainable - Clear algorithm, well-structured code

Files Changed

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go - Main implementation (~480 lines added)
  • lsp/base_service_client/base_service_client.go - Enhanced logging
  • lsp/base_service_client/cmd_dialer.go - Minor updates
  • external-providers/dotnet-external-provider/go.mod - Dependency updates
  • external-providers/dotnet-external-provider/go.sum - Dependency updates
  • external-providers/golang-dependency-provider/go.mod - Dependency updates
  • external-providers/golang-dependency-provider/go.sum - Dependency updates

Total: 7 files changed, 538 insertions(+), 104 deletions(-)

Impact on Existing Rules

This change enables existing nodejs.referenced rules to work correctly. Rules that were using combo patterns with builtin.filecontent can be simplified to use only nodejs.referenced.

No breaking changes - existing rules continue to work, but can now be simplified.

Next Steps

After this PR is merged:

  1. Update analyzer-rule-generator to stop creating combo rules (guide available)
  2. Simplify existing rulesets to remove builtin.filecontent workarounds
  3. Consider applying similar import-based approach to other language providers if needed

Related Issues

Fixes issues with PatternFly v5→v6 migration rules where components were not being detected.


Signed-off-by: Todd Sanders [email protected]

Summary by CodeRabbit

  • Bug Fixes

    • Prevented potential crash when closing process dialers by adding nil-safety check.
  • New Features

    • Added import-based evaluation for Node.js to improve cross-file reference discovery.
    • Exposed lightweight dependency query endpoints that currently return empty structures.
  • Improvements

    • Enhanced LSP/symbol resolution logging, retry/backoff, deduplication, and final summary reporting.
    • Ensured workspace-folder sync and resource cleanup after processing.
  • Tests

    • Added comprehensive tests for import scanning and normalization.
  • Data

    • Refined demo output to deduplicate and focus sample incident references.

✏️ Tip: You can customize this high-level summary in your review settings.

This commit implements a new import-based search algorithm for the
nodejs.referenced capability, replacing the previous workspace/symbol
approach which had significant limitations with TypeScript LSP servers.

Problem
-------
The nodejs.referenced capability was not working reliably because:
1. TypeScript LSP workspace/symbol only returns top-level declarations
2. Exported components like PatternFly's Card, Button, etc. are not
   top-level declarations in their source packages
3. This caused false negatives where imported symbols were not detected
4. Previous workaround used combo rules with builtin.filecontent, but
   this was inefficient and semantically incorrect

Solution
--------
Implement import-based search using a three-step algorithm:

1. FAST SCAN: Scan all TypeScript/JavaScript files for import statements
   containing the pattern using regex (no LSP required)
   - Handles multiline imports via normalization
   - Supports named imports: import { Card } from "package"
   - Supports default imports: import Card from "package"
   - Correctly identifies word boundaries to avoid false matches

2. GET DEFINITIONS: For each import found, use LSP textDocument/definition
   to locate where the symbol is actually defined

3. GET REFERENCES: For each definition, use LSP textDocument/references
   to find all usage locations in the workspace

This approach is both faster and more accurate than workspace/symbol
search, as it leverages the fact that developers must import symbols
before using them.

Implementation Details
---------------------
- Added ImportLocation and fileInfo types to track file metadata
- Implemented findImportStatements() for regex-based import detection
- Implemented normalizeMultilineImports() to handle formatted code
- Added helper functions for identifier validation
- Added GetDependencies/GetDependenciesDAG overrides to prevent errors
- Enhanced logging in base service client for better debugging
- Removed debug printf statements, using structured logging only

Files Changed
-------------
- nodejs/service_client.go: Main implementation (~480 lines added)
- base_service_client.go: Enhanced logging for workspace/symbol calls
- cmd_dialer.go: Minor updates
- go.mod/go.sum: Dependency updates

Testing
-------
Tested with PatternFly v5→v6 migration ruleset on tackle2-ui codebase:
- 56 rules for React component migrations
- Successfully detects imports and references for all PatternFly components
- No longer requires builtin.filecontent workarounds

Benefits
--------
1. Accurate detection of imported symbols regardless of export structure
2. Faster execution by avoiding expensive workspace/symbol calls
3. Eliminates need for combo rules with builtin.filecontent
4. Better semantic correctness - finds actual usage, not just text patterns
5. Comprehensive documentation with detailed docstrings

Signed-off-by: Todd Sanders <[email protected]>
Signed-off-by: tsanders <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Nov 20, 2025

Walkthrough

Adds an import-based evaluation path to NodeServiceClient that scans TS/JS files for imports, normalizes multiline imports, resolves definitions and references via LSP with retry/backoff, deduplicates and converts references to incidents, exposes testing helpers and two dependency APIs (stubs), improves base client logging and CmdDialer Close safety, and adds import tests and demo updates.

Changes

Cohort / File(s) Summary
NodeServiceClient — import-based evaluation
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
Adds ImportLocation (exported) and internal fileInfo; implements EvaluateReferenced with fast regex import scanning, multiline import normalization, LSP-based definition resolution with exponential backoff and retries, references gathering, deduplication, workspace filtering, incident conversion, detailed per-step logging, cleanup, and summary logging. Adds AnalysisMode field and GetDependencies/GetDependenciesDAG stubs returning empty maps; exposes testing wrappers.
Import utilities tests
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go
New comprehensive tests and benchmark for multiline import normalization, import-statement finding, identifier character checks, word-boundary correctness, edge cases, and performance.
Base service client — logging
lsp/base_service_client/base_service_client.go
Replaces plain prints with structured sc.Log.Error on workspace/symbol errors, logs symbol counts on success, and logs fallback when workspace/symbol is unsupported.
CmdDialer — process handling
lsp/base_service_client/cmd_dialer.go
Adds nil-check in CmdDialer.Close to avoid killing a nil/non-started process and handle close gracefully.
Demo output
demo-output.yaml
Removes test_a.ts references/duplicates, consolidates incidents to test_b.ts, and updates unmatched list to include node-sample-rule-002.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as NodeServiceClient.EvaluateReferenced
    participant Scanner as Import Scanner (regex)
    participant LSP as LSP Server
    participant Builder as Deduplicator / Incident Builder

    Caller->>Scanner: Scan workspace TS/JS files for imports
    activate Scanner
    Scanner->>Scanner: Normalize multiline imports
    Scanner-->>Caller: Return ImportLocations[]
    deactivate Scanner

    loop per ImportLocation
        Caller->>LSP: textDocument/didOpen (open file)
        Caller->>LSP: textDocument/definition (with backoff/retries)
        LSP-->>Caller: Definition locations
        Caller->>Caller: Track processed definitions (dedupe)
        Caller->>LSP: textDocument/references
        LSP-->>Caller: References[]
        Caller->>Caller: Filter to workspace & exclude deps
        Caller->>Builder: Convert references → incidents (dedupe)
    end

    Caller->>Caller: Close opened files & cleanup
    Caller->>Caller: Emit summary log (imports, defs, refs, incidents)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • service_client.go contains dense, new logic (regex scanning, multiline normalization, LSP interactions, exponential backoff, deduplication, logging) requiring careful review.
  • import_search_test.go should be checked for correct position expectations and edge-case coverage.
  • Verify LSP open/close lifecycle, retry/backoff implementation, and duplicate-tracking correctness.

Possibly related PRs

Suggested reviewers

  • pranavgaikwad
  • shawn-hurley

Poem

🐰
I hop through imports, quick and spry,
scanning lines where symbols lie.
I chase definitions, gather threads,
dedupe footprints, tuck them in beds.
Incidents found — a carrot for I. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main feature: implementing import-based search for the nodejs.referenced capability, which is the core objective of this PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tsanders-rh tsanders-rh changed the title Implement import-based search for nodejs.referenced capability ✨ Implement import-based search for nodejs.referenced capability Nov 20, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

183-218: Add guards for empty WorkspaceFolders and unify workspace root usage

Three unsafe slice accesses confirmed at lines 183, 343, and 411:

  • Line 183: sc.Config.WorkspaceFolders[0] accessed without bounds check
  • Lines 343 & 411: sc.BaseConfig.WorkspaceFolders[0] accessed without bounds check

These will panic if the slices are empty. Additionally, the code inconsistently references both sc.Config.WorkspaceFolders[0] (for scanning) and sc.BaseConfig.WorkspaceFolders[0] (for filtering), risking workspace misclassification.

Implement the suggested fix: determine workspace root once with proper guards, then reuse the same value in both scanning and filtering logic.

🧹 Nitpick comments (3)
lsp/base_service_client/cmd_dialer.go (1)

71-84: Close now handles nil Process, but consider also guarding nil Cmd

The new Process == nil guard is good to avoid panics when the command never started, but Close will still panic if rwc.Cmd itself is nil (e.g., a manually constructed CmdDialer). A small extra check would make this method fully robust:

func (rwc *CmdDialer) Close() error {
-	// Check if process was started before trying to kill it
-	if rwc.Cmd.Process == nil {
+	// Check if command/process were started before trying to kill it
+	if rwc.Cmd == nil || rwc.Cmd.Process == nil {
 		// Process was never started or already exited
 		return nil
 	}
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

258-283: Avoid time.Sleep per import location to keep runtime bounded

The fixed time.Sleep(100 * time.Millisecond) is currently executed for every import location, even when multiple imports come from the same file. On large projects this can easily add seconds of latency.

You only need to delay after opening a file in the LSP, so you can move the sleep into the !openedFiles[importLoc.fileURI] branch:

	for _, importLoc := range importLocations {
 		// Open the file if not already open
 		if !openedFiles[importLoc.fileURI] {
@@
-			err = didOpen(importLoc.fileURI, importLoc.langID, text)
+			err = didOpen(importLoc.fileURI, importLoc.langID, text)
 			if err != nil {
 				sc.Log.V(1).Info("Failed to open file in LSP", "file", importLoc.fileURI, "error", err)
 				continue
 			}
 			openedFiles[importLoc.fileURI] = true
+
+			// Small delay to let LSP index the newly opened file
+			time.Sleep(100 * time.Millisecond)
 		}
 
-		// Small delay to let LSP index the file
-		time.Sleep(100 * time.Millisecond)
-
 		// Get definition of the imported symbol using textDocument/definition

This keeps behavior the same while preventing per-import backoff.


475-571: Import scan works for common cases; consider documenting or extending edge cases

The findImportStatements + isIdentifierChar combo looks solid for the main patterns (named/default imports, multiline via normalization, word-boundary checks). A few behavioral details worth calling out:

  • The regex doesn’t cover some valid forms, e.g.:
    • import * as Foo from "pkg"
    • import type { Foo } from "pkg"
    • Default imports that include $ (e.g. $http) because (\w+) won’t match $.
  • You already do a stricter word-boundary check in the original source, which prevents false positives like Card matching CardBody.

If those import forms are relevant for your rulesets, you might want to either:

  • Explicitly document that they’re out of scope for now, or
  • Extend the regex and (\w+) group to handle * as and $ identifiers (e.g. [A-Za-z0-9_$]+), and optionally add a small unit test matrix around findImportStatements.

Given the current targets (PatternFly-style imports), this is acceptable but good to be explicit.

Also applies to: 693-704

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1377e5 and c4c672f.

⛔ Files ignored due to path filters (2)
  • external-providers/dotnet-external-provider/go.sum is excluded by !**/*.sum
  • external-providers/golang-dependency-provider/go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • external-providers/dotnet-external-provider/go.mod (2 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (7 hunks)
  • external-providers/golang-dependency-provider/go.mod (2 hunks)
  • lsp/base_service_client/base_service_client.go (1 hunks)
  • lsp/base_service_client/cmd_dialer.go (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-25T02:05:02.884Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 863
File: external-providers/generic-external-provider/go.mod:3-3
Timestamp: 2025-07-25T02:05:02.884Z
Learning: The `go` directive in go.mod only accepts major.minor (e.g., "go 1.23"); patch versions like "go 1.23.11" are invalid. To require a specific patch version, use the separate `toolchain` directive (introduced in Go 1.21), e.g., `toolchain go1.23.11`.

Applied to files:

  • external-providers/dotnet-external-provider/go.mod
📚 Learning: 2025-07-25T02:05:02.884Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 863
File: external-providers/generic-external-provider/go.mod:3-3
Timestamp: 2025-07-25T02:05:02.884Z
Learning: Since Go 1.21, the go directive in go.mod files supports patch versions (e.g., "go 1.23.11"), not just major.minor format. The syntax "go 1.23.11" is valid and will not cause build failures in modern Go tooling.

Applied to files:

  • external-providers/dotnet-external-provider/go.mod
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4498-4503)
  • TextDocumentIdentifier (4477-4480)
  • Range (3559-3564)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
🔇 Additional comments (12)
external-providers/dotnet-external-provider/go.mod (5)

24-24: OpenTelemetry versions consistently updated to v1.35.0.

The metric, core, SDK, and trace components are all aligned on the same version, which is appropriate.

Also applies to: 39-39, 41-41, 42-42


45-49: Verify gRPC v1.73.0 stability and compatibility.

Version v1.74.0 and v1.74.1 were retracted, and v1.73.0 changed behavior for grpc-timeout header handling. Ensure that this version is appropriate for your use case and that no downstream code depends on prior timeout handling.


25-25: Verify genproto and protobuf versions for gRPC compatibility.

The genproto date was updated to 2025-03 and protobuf to v1.36.6. Confirm these versions are compatible with gRPC v1.73.0 and that no breaking changes in the protobuf API affect your code.

Also applies to: 33-33


18-18: Confirm Microsoft/go-winio v0.6.2 is required.

This is a new indirect dependency. Verify that it's genuinely required by one of the direct dependencies and not an accidental transitive addition. If it's from Windows-specific code paths, ensure it doesn't introduce unexpected platform dependencies.

Also applies to: 12-12


53-53: Local replace directive for analyzer-lsp.

The replace directive points to the monorepo root. Ensure this is only for local development and remove it before merging to main if this is not already handled by CI/build tooling.

external-providers/golang-dependency-provider/go.mod (3)

8-8: Dependency versions align with dotnet-external-provider.

The Go and OpenTelemetry versions match across providers, ensuring consistency. This coordinated update approach is sound.

Also applies to: 25-25, 27-27, 28-28, 29-29


30-34: Foundation libraries updated to known-good versions.

golang.org/x/net v0.38.0 includes fixes for CVE-2024-45338, and the other stdlib extension versions are current. These are appropriate.


38-38: Match: Same replace directive as dotnet provider.

Consistent with the other provider's local development setup.

lsp/base_service_client/base_service_client.go (1)

361-374: Workspace/symbol logging and fallback behavior look solid

The new logging around workspace/symbol gives useful visibility on errors, result counts, and unsupported servers while keeping the existing fallback path intact. No functional regressions spotted here.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (3)

92-97: Initialization logging is helpful and low-risk

The added NodeJS LSP initialized log with root URI, workspace folders, server path, and name should be very useful for debugging misconfigurations, and it doesn’t change runtime behavior.


598-691: Multiline import normalization looks careful and safe

normalizeMultilineImports maintains a 1:1 character mapping (replacing newlines with spaces or newlines, but not inserting/removing characters), tracks brace depth, and skips over string literals. That keeps indices aligned with the original content so that FindAllStringSubmatchIndex results map correctly back to line/column.

The implementation is a bit involved but localized and well-commented; I don’t see correctness issues in the current logic.


706-717: NodeJS-specific dependency overrides correctly bypass base provider requirements

Overriding GetDependencies and GetDependenciesDAG to return empty maps avoids the base implementation’s "dependency provider path not set" error for NodeJS, which relies on LSP and import scanning instead of an external dependency provider. This matches the provider’s behavior and simplifies callers.

@tsanders-rh
Copy link
Contributor Author

Cleaned PatternFly Rules Demonstrating the Fix

I've prepared a cleaned version of the PatternFly v5→v6 migration ruleset that demonstrates how the import-based search eliminates the need for builtin.filecontent workarounds.

📦 Downloads

Cleaned Rules: patternfly-v5-to-v6-cleaned-rules.tar.gz (available at /tmp/patternfly-v5-to-v6-cleaned-rules.tar.gz for testing)

  • 21 YAML files (56 total rules)
  • 20 rules simplified (removed combo workarounds)
  • 8.4KB compressed

Detailed Comparison: https://gist.github.com/tsanders-rh/3a2b1b5e932cbd29c54f1f2fb0657c84

🎯 Key Improvements

Metric Before After Improvement
Lines per rule ~30 ~15 50% reduction
Conditions 2 (combo) 1 (semantic) Simpler
Performance LSP fail + regex Import scan + LSP Faster
Accuracy Text matching Semantic analysis More accurate

📋 Components Simplified (20 Total)

Badge, Button, Card, CardActions, CardBody, CardFooter, CardHeader, CardTitle, Checkbox, Chip, ChipGroup, Label, LabelGroup, Modal, ModalVariant, Popover, Radio, Select, TextInput, Toolbar

🧪 Testing

Tested against tackle2-ui codebase - all PatternFly components correctly detected with the cleaned rules.

📝 Example: Before vs After

BEFORE (Combo Rule - 28 lines):

- ruleID: patternfly-react-core-card-renamed
  when:
    and:
    - nodejs.referenced:
        pattern: Card
    - builtin.filecontent:
        pattern: import.*\{[^}]*\bCard\b[^}]*\}.*from ['"]@patternfly/react-core['"]
        filePattern: \.(j|t)sx?$
  message: |-
    The PatternFly Card component's import has changed.
    ...

AFTER (Clean - 14 lines):

- ruleID: patternfly-react-core-card-renamed
  when:
    nodejs.referenced:
      pattern: Card
  message: |-
    The PatternFly Card component's import has changed.
    ...

See the gist for full examples and detailed analysis.

Resolved conflict in nodejs/service_client.go by keeping the import-based
search implementation which supersedes the previous file batching optimization
from PR konveyor#970.

The import-based search provides better semantic accuracy by:
- Finding imports first (regex scan)
- Using LSP definitions to locate symbols
- Using LSP references to find all usages

This is fundamentally different from and more accurate than the workspace/symbol
approach that was being optimized in the upstream changes.

Signed-off-by: Todd Sanders <[email protected]>
Signed-off-by: tsanders <[email protected]>
@tsanders-rh
Copy link
Contributor Author

@pranavgaikwad - Decided to try an alternative approach. Runs the ruleset against tackle2-ui in 22 sec and catches the incidents without the combo builtin rules. Have a look and let me know what you think.

This commit adds test coverage for the nodejs.referenced import-based
search functionality introduced in the main PR.

Changes:
- Export ImportLocation struct fields (FileURI, LangID, Position, Line)
  to enable testing
- Add test helper functions to expose private methods for testing:
  - NormalizeMultilineImportsPublic()
  - FindImportStatementsPublic()
  - IsIdentifierCharPublic()
- Create comprehensive test suite (import_search_test.go) with 38 test
  cases covering:
  - Multiline import normalization (5 tests)
  - Import statement finding (8 tests)
  - Identifier character detection (12 tests)
  - Word boundary matching (6 tests)
  - Edge cases (7 tests)

All tests pass (100% success rate) and validate core import detection
logic including word boundaries, multiline imports, and various quote/
whitespace styles.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

719-748: Test helpers expose implementation details.

While the public test wrappers enable thorough unit testing (which is good), they expose private implementation details. This is a common Go testing pattern but couples tests to internals.

Consider alternative approaches for future test additions:

  • Move helper functions to a separate internal testing package
  • Use table-driven integration tests that test the public EvaluateReferenced API end-to-end
  • If these helpers are genuinely useful for external consumers, document them properly and remove the "Public" suffix

The current approach is acceptable for this PR, but keep this consideration in mind for future changes.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf83ae9 and fbf745f.

📒 Files selected for processing (2)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
  • NodeServiceClient (31-36)
  • FileInfo (722-725)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: benchmark (windows-latest, windows)
🔇 Additional comments (12)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (6)

92-97: LGTM: Helpful debug logging added.

The structured logging provides useful visibility into LSP initialization parameters.


133-145: LGTM: Clear struct definitions.

The ImportLocation and fileInfo structs are well-designed with appropriate fields and documentation.


147-400: Excellent algorithm design for import-based reference search.

The three-step approach (scan imports → resolve definitions → collect references) is well-architected and addresses the limitations of workspace/symbol search mentioned in the PR objectives. The deduplication logic via processedDefinitions correctly avoids redundant LSP calls, and the filtering ensures only workspace references are returned.


475-571: Solid import detection with appropriate word boundary checks.

The regex-based approach correctly handles named and default imports, and the word boundary validation (lines 546-549) prevents false matches on partial identifiers like "Card" matching "CardBody".

Note: TypeScript import type statements are not currently supported by this regex pattern. The test file acknowledges this limitation (lines 139-144 in import_search_test.go). This is acceptable if type-only imports are rare in target codebases, but consider documenting this limitation.

If type-only imports need to be supported in the future, consider extending the regex:

importRegex := regexp.MustCompile(
    `import\s+(?:type\s+)?(?:\{([^}]*)\}|(\w+))\s+from\s+['"]([^'"]+)['"]`,
)

694-704: LGTM: Clean utility functions.

Both isIdentifierChar and min are straightforward and correct.


706-717: LGTM: Appropriate override with clear documentation.

Returning empty dependency maps is correct for the nodejs provider, and the documentation clearly explains the rationale.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (6)

12-68: LGTM: Comprehensive normalization tests.

The test cases cover key scenarios including multiline imports, nested structures, string literals, and comments. Well-structured table-driven tests.


71-195: LGTM: Thorough import detection tests with position verification.

Excellent coverage of import patterns including named, default, multiline, and mixed imports. The position verification (lines 186-191) ensures precision. The test case for type imports (lines 139-144) appropriately documents the current limitation.


198-230: LGTM: Complete coverage of identifier character validation.

Tests both valid and invalid characters comprehensively.


233-315: LGTM: Critical word boundary tests prevent false positives.

These tests are essential for ensuring the import search doesn't produce false matches. The coverage of exact, prefix, and suffix scenarios (e.g., "Card" vs "CardBody") validates that the word boundary logic in findImportStatements works correctly.


318-399: LGTM: Thorough edge case coverage with documented limitations.

The tests appropriately cover edge cases including empty files, various quote styles, and whitespace variations.

The acknowledged limitations (lines 345, 351) where imports in comments and strings are matched are acceptable. These false positives will result in extra LSP calls that won't find real definitions, but won't cause incorrect results. If these become performance bottlenecks in practice, they can be addressed in a future enhancement.


402-455: LGTM: Realistic benchmark for performance tracking.

The benchmark uses realistic React/PatternFly code patterns, making it useful for detecting performance regressions as the codebase evolves.

@tsanders-rh tsanders-rh force-pushed the fix/nodejs-referenced-import-search branch 3 times, most recently from f66148c to 1ed790f Compare November 20, 2025 12:26
This commit updates the demo baseline for violations/incidents to reflect
improved precision in the import-based search implementation for the
nodejs.referenced condition.

Changes:
- demo-output.yaml: Removed incorrect references to quarkus-1-6-2-jar-exploded
  files that don't exist in fresh builds. One fewer nodejs incident due to
  improved word boundary detection. The rule node-sample-rule-001 no longer
  matches file:///examples/nodejs/test_a.ts at line 5 because the improved
  detection correctly identifies that 'greeter' is not the same as 'greet'.
- demo-dep-output.yaml: Updated to include java-project dependencies that
  are correctly generated by fresh builds (matches main branch baseline).

Signed-off-by: tsanders <[email protected]>
@tsanders-rh tsanders-rh force-pushed the fix/nodejs-referenced-import-search branch from 1ed790f to 6eb0d54 Compare November 20, 2025 12:38
Accessing sc.Config.WorkspaceFolders[0] without verifying the slice is non-empty could cause a panic. This adds a defensive check that returns a clear error message if no workspace folders are configured.

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
The 100ms sleep was a code smell relying on timing rather than proper
synchronization. This replaces it with retry logic that attempts the
textDocument/definition call up to 3 times with exponential backoff
(50ms, 100ms, 200ms). This approach is more robust across different
LSP server implementations and varying system loads.

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (3)

261-332: Hardcoded sleep per import is fragile and can be expensive

time.Sleep(100 * time.Millisecond) after each import location introduces a timing dependency on the LSP server and can add noticeable latency when there are many imports, especially since the sleep is outside the openedFiles guard and will execute even for files already opened earlier in the loop.

Consider tightening this in two ways:

  • Only sleep (or otherwise wait) when the file has just been opened, not for every import location.
  • Prefer a retry/backoff strategy keyed off definition results instead of a fixed delay (e.g., retry textDocument/definition a few times with small backoff until results appear or a timeout is hit).

This keeps the implementation robust across different LSP servers and project sizes.


343-381: Guard BaseConfig.WorkspaceFolders[0] to avoid potential panic and align with earlier checks

In the workspace‑filtering loop you index sc.BaseConfig.WorkspaceFolders[0] without a length check:

if !strings.Contains(ref.URI, sc.BaseConfig.WorkspaceFolders[0]) {
    continue
}

If BaseConfig.WorkspaceFolders is empty (e.g., misconfiguration, or divergence from Config.WorkspaceFolders), this will panic even though you already defensively validated sc.Config.WorkspaceFolders earlier in EvaluateReferenced.

Add a bounds check for BaseConfig.WorkspaceFolders before the loop and fail clearly if it's empty; you can mirror the earlier error message:

-	incidentsMap := make(map[string]provider.IncidentContext)
-	for _, ref := range allReferences {
+	incidentsMap := make(map[string]provider.IncidentContext)
+
+	if len(sc.BaseConfig.WorkspaceFolders) == 0 {
+		return resp{}, fmt.Errorf("no workspace folders configured in base config")
+	}
+
+	for _, ref := range allReferences {
 		// Only include references within the workspace
 		if !strings.Contains(ref.URI, sc.BaseConfig.WorkspaceFolders[0]) {
 			continue
 		}

It would also be worth checking whether Config.WorkspaceFolders and BaseConfig.WorkspaceFolders are guaranteed to stay in sync; if not, a short comment explaining why both are safe to use here would help future readers.


600-694: String escape and “from” detection are a bit brittle in import normalization

The overall approach in normalizeMultilineImports is sensible and the brace/string tracking is a nice balance between correctness and simplicity. There are a couple of edge cases to be aware of:

  • The escape check (content[i-1] != '\\') does not handle double‑escaped quotes like \\", so a quote preceded by an even number of backslashes may be incorrectly treated as closing the string. Counting consecutive backslashes and using parity would make this robust.
  • The check for "from" uses strings.Contains, which can match substrings inside identifiers or other tokens, potentially terminating an import when "from" appears in an unrelated context in the same region.
  • The “is this really an import keyword?” guard only inspects the character before "import"; it will still treat importX as an import statement if preceded by a non‑identifier character.

If you decide to tighten this, you could:

  • Replace the escape check with a short loop counting preceding \\ characters before deciding if the quote is escaped.
  • Use a small regex like \bfrom\s+['"] against the tail of recentContent to require from as a standalone keyword followed by a quote.
  • Optionally also check the character after "import" to ensure it’s not part of a longer identifier.

These changes would make the normalization more resilient without changing the overall design.

🧹 Nitpick comments (3)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (3)

182-215: Workspace scan and URI handling look good; consider cross‑platform URI construction

The workspace‑folder guard avoids panics and the filepath.Walk with node_modules skipping is solid. One subtle point: constructing URIs with "file://" + path (and trimming with strings.TrimPrefix) is fine on POSIX paths, but will produce non‑canonical URIs on Windows (e.g. file://C:\path). If Windows is a target, consider using uri.File(path) or filepath.ToSlash + proper file:/// formatting to construct URIs instead of manual string concatenation.


479-574: Import regex covers main cases but misses some common forms

The regex‑based findImportStatements implementation is clear and works well for the documented cases (named imports, default imports, multiline imports). A few limitations to be aware of:

  • It won’t match combined default + named imports like import React, { Button } from "..." or import Button, { ButtonProps } from "..." because the pattern only allows import {…} or import <ident> but not import <ident>, {…}.
  • It also skips type‑only imports such as import type { Button } from "...", which are increasingly common in TS.
  • The default‑import capture uses \w+, so identifiers containing $ or non‑ASCII letters won’t be matched as defaults.

Given the current PatternFly use case this is likely acceptable, but it would be good to either:

  • Extend the regex (or add additional regexes) to cover import Foo, { Bar } and import type forms, and/or
  • Add a brief comment/TODO noting that those forms are currently out of scope so future contributors aren’t surprised.

724-751: Test helper exports are fine; consider keeping them clearly test‑only

Exposing FileInfo, NormalizeMultilineImportsPublic, FindImportStatementsPublic, and IsIdentifierCharPublic purely for tests is a pragmatic approach, and the wrappers are thin. To avoid these becoming part of any broader public API surface over time, consider:

  • Adding a brief comment like “// Test‑only API – not for external use” above the block, or
  • If the package grows further, moving such helpers into a dedicated internal/testutil package.

Not a blocker, just a maintainability consideration.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6eb0d54 and 5b91e1a.

📒 Files selected for processing (1)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (ubuntu-latest, linux)
🔇 Additional comments (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

92-97: Init logging is useful for diagnosing LSP setup

The added log around LSP initialization (root URI, workspace folders, server path/name) will be very helpful when debugging misconfigurations. No issues from my side; just be aware this is at Info level so it will appear in normal logs.


709-720: Dependency overrides are appropriate for nodejs provider

Overriding GetDependencies and GetDependenciesDAG to return empty maps is a clean way to avoid the base implementation’s “dependency provider path not set” error for this provider. Given that NodeJS here doesn’t use external dependency providers, this is a sensible and explicit choice.

Add defensive checks before accessing sc.BaseConfig.WorkspaceFolders[0]
at lines 360 and 431 to prevent potential runtime panics. These checks
ensure the slice is non-empty before accessing the first element.

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Replace simple backslash check with proper escape counting. The previous
check (content[i-1] != '\\') didn't correctly handle double-escaped quotes
like \\". This fix counts preceding backslashes - if there's an even number
(including 0), the quote is not escaped. This correctly handles cases like:
- "test\"" (escaped, odd backslashes)
- "test\\"" (not escaped, even backslashes)

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Replace strings.Contains check with regex using word boundaries to avoid
false positives. The previous check matched "from" anywhere in the string,
including inside identifiers (e.g., "dataFromAPI") or string literals.

The new regex pattern \bfrom\s+["'] ensures:
- \b matches word boundary (not part of a larger identifier)
- from matches the literal keyword
- \s+ matches one or more whitespace characters
- ["'] matches a quote character

The regex is compiled once at package level for efficiency.

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

659-661: Escape sequence handling still needs improvement.

The escape check on line 659 (content[i-1] != '\\') doesn't correctly handle double-escaped quotes like \\". This edge case could cause incorrect string boundary detection.

Consider using a more robust escape detection approach:

 				} else if inString && ch == stringChar {
-					// Check for escape
-					if i > 0 && content[i-1] != '\\' {
+					// Count preceding backslashes to determine if quote is escaped
+					escapeCount := 0
+					for j := i - 1; j >= 0 && content[j] == '\\'; j-- {
+						escapeCount++
+					}
+					// If even number of backslashes (including 0), quote is not escaped
+					if escapeCount%2 == 0 {
 						inString = false
 					}

Based on past review comments.


686-688: "from" keyword detection could produce false positives.

The check for the "from" keyword (line 687) uses strings.Contains, which could match "from" inside identifiers (e.g., datafromAPI) or as part of larger words within the recent content.

Consider a more precise check using word boundaries:

 						if len(recentContent) > 10 {
 							last50 := recentContent[len(recentContent)-min(50, len(recentContent)):]
-							if strings.Contains(last50, "from") && (strings.Contains(last50, "\"") || strings.Contains(last50, "'")) {
+							// Match "from" as a standalone word followed by a quote
+							fromRegex := regexp.MustCompile(`\bfrom\s+['"]`)
+							if fromRegex.MatchString(last50) {
 								// Import statement is complete

Based on past review comments.

🧹 Nitpick comments (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

360-365: Consider hoisting the workspace folders validation.

While the bounds check correctly addresses the previous review feedback, checking len(sc.BaseConfig.WorkspaceFolders) == 0 inside the loop is inefficient. Consider validating once at the function start (similar to lines 183-185) and returning early if empty.

Apply this refactor to validate once at function start:

 	// Filter references to workspace only and convert to incidents
+	if len(sc.BaseConfig.WorkspaceFolders) == 0 {
+		sc.Log.V(1).Info("No workspace folders configured in BaseConfig, cannot filter references")
+		return resp{Matched: false}, nil
+	}
+	
 	incidentsMap := make(map[string]provider.IncidentContext)
 	for _, ref := range allReferences {
 		// Only include references within the workspace
-		if len(sc.BaseConfig.WorkspaceFolders) == 0 {
-			continue
-		}
 		if !strings.Contains(ref.URI, sc.BaseConfig.WorkspaceFolders[0]) {
 			continue
 		}

396-397: Consider a more efficient deduplication key.

Using json.Marshal to generate deduplication keys (lines 396-397) is functional but inefficient due to marshalling overhead. Consider constructing a simple string key from the incident's unique fields instead.

Apply this refactor for better performance:

 		lineNumber := int(ref.Range.Start.Line)
 		incident := provider.IncidentContext{
 			FileURI:    u,
 			LineNumber: &lineNumber,
 			Variables: map[string]interface{}{
 				"file": ref.URI,
 			},
 			CodeLocation: &provider.Location{
 				StartPosition: provider.Position{Line: float64(lineNumber)},
 				EndPosition:   provider.Position{Line: float64(lineNumber)},
 			},
 		}
-		b, _ := json.Marshal(incident)
-		incidentsMap[string(b)] = incident
+		// Use file URI and line number as deduplication key
+		dedupKey := fmt.Sprintf("%s:%d", ref.URI, lineNumber)
+		incidentsMap[dedupKey] = incident
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b91e1a and 9f811ef.

📒 Files selected for processing (1)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (8 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: test
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
🔇 Additional comments (11)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (11)

133-145: LGTM! Well-designed types for import tracking.

The ImportLocation struct provides all necessary context for tracking where symbols are imported, and the separation between public (ImportLocation) and internal (fileInfo) types is appropriate for the API surface.


92-97: LGTM! Helpful initialization logging.

These debug logs will aid in troubleshooting LSP configuration issues without exposing sensitive data.


183-185: LGTM! Bounds check properly addresses previous feedback.

The defensive check prevents panic when workspace folders are not configured.


298-311: LGTM! Retry logic with exponential backoff properly addresses previous feedback.

The retry mechanism (50ms, 100ms, 200ms) handles LSP indexing delays gracefully and includes a clear comment explaining the rationale.


188-221: LGTM! Efficient filesystem traversal.

The walk correctly skips node_modules and handles both TypeScript and JavaScript variants (.ts, .tsx, .js, .jsx) with appropriate language IDs.


569-570: LGTM! Correct word boundary verification.

The boundary checks safely handle edge cases at the start and end of the string, and the use of isIdentifierChar ensures accurate pattern matching without false positives from partial matches.


717-719: LGTM! Adequate identifier character check for import parsing.

The function correctly identifies ASCII identifier characters. Note that JavaScript identifiers can technically include Unicode characters per ECMAScript spec, but limiting to ASCII is reasonable for import statement parsing and avoids complexity.


722-727: LGTM! Simple and correct min implementation.

Avoiding math.Min (which works with floats) is appropriate for this integer use case.


732-740: LGTM! Appropriate stub implementations.

Returning empty maps is the correct approach for the nodejs provider, which handles dependencies internally rather than via external providers. The comments clearly document this design choice.


744-771: LGTM! Well-structured test helpers.

The public wrappers and type conversions correctly expose internal functionality for unit testing while maintaining appropriate encapsulation. The FileInfo struct provides a clean public API for test consumers.


501-503: The regex limitation review is unnecessary given the codebase usage and function purpose.

The function explicitly documents its intent: "Captures both named imports and default imports." Verification confirms the codebase contains neither namespace imports nor type-only imports. Since the function's purpose is LSP symbol definition lookup (not comprehensive import parsing), side-effect imports are not relevant—they don't import named symbols. The current regex correctly handles the import patterns that actually exist and are needed for this use case.

Likely an incorrect or invalid review comment.

Fix critical consistency bug where Config.WorkspaceFolders was initialized
at line 58 but BaseConfig.WorkspaceFolders was never synchronized. This
caused lines 363 and 434 to use an uninitialized BaseConfig.WorkspaceFolders,
leading to potential silent failures in reference filtering logic.

After initializing the base client, explicitly synchronize
BaseConfig.WorkspaceFolders from Config.WorkspaceFolders to ensure both
configurations remain consistent throughout the code.

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (4)

195-227: Use URI parsing instead of TrimPrefix("file://") and consider reusing the file list

Deriving folder with strings.TrimPrefix(sc.Config.WorkspaceFolders[0], "file://") is a bit brittle and may misbehave for Windows-style URIs (e.g. file:///C:/...). It’s safer to parse the URI and ask it for a filesystem path:

-	folder := strings.TrimPrefix(sc.Config.WorkspaceFolders[0], "file://")
+	workspaceURI := sc.Config.WorkspaceFolders[0]
+	u, err := uri.Parse(workspaceURI)
+	if err != nil {
+		return resp{}, fmt.Errorf("invalid workspace folder URI %q: %w", workspaceURI, err)
+	}
+	folder := u.Filename()

Also, filepath.Walk + re-reading all TS/JS files runs on every EvaluateReferenced call; if you end up with many nodejs.referenced rules per workspace, consider caching nodeFiles per workspace root (or per process) to avoid repeated full-tree walks.

Also applies to: 236-252


369-423: Workspace/dependency filtering currently assumes a single workspace root

The incident construction correctly filters out references not under WorkspaceFolders[0] and then excludes any path containing a configured dependency folder. This works, but it effectively ignores additional workspace roots if they are ever configured.

If you expect multi-root workspaces in future, consider iterating all workspace folders instead of only index 0:

-		if !strings.Contains(ref.URI, sc.BaseConfig.WorkspaceFolders[0]) {
-			continue
-		}
+		inWorkspace := false
+		for _, wf := range sc.BaseConfig.WorkspaceFolders {
+			if wf != "" && strings.Contains(ref.URI, wf) {
+				inWorkspace = true
+				break
+			}
+		}
+		if !inWorkspace {
+			continue
+		}

Same consideration applies to the earlier file scan, which also uses only WorkspaceFolders[0].
Based on learnings


512-607: Minor import-scanning refinements: precompile regex and broaden identifier support

The overall approach in findImportStatements is solid, but a couple of small tweaks would make it more robust and efficient:

  1. Precompile the import regex once at package scope (similar to fromKeywordRegex) instead of on every call:
-var (
-	// fromKeywordRegex matches "from" as a standalone keyword followed by a quote
-	fromKeywordRegex = regexp.MustCompile(`\bfrom\s+["']`)
-)
+var (
+	// fromKeywordRegex matches "from" as a standalone keyword followed by a quote
+	fromKeywordRegex = regexp.MustCompile(`\bfrom\s+["']`)
+
+	// importStatementRegex matches named/default imports: import { X } from "pkg" / import X from "pkg"
+	importStatementRegex = regexp.MustCompile(
+		`import\s+(?:\{([^}]*)\}|([A-Za-z0-9_$]+))\s+from\s+['"]([^'"]+)['"]`,
+	)
+)

and then:

-	importRegex := regexp.MustCompile(
-		`import\s+(?:\{([^}]*)\}|(\w+))\s+from\s+['"]([^'"]+)['"]`,
-	)
...
-	allMatches := importRegex.FindAllStringSubmatchIndex(normalized, -1)
+	allMatches := importStatementRegex.FindAllStringSubmatchIndex(normalized, -1)
  1. Default import identifiers: using [A-Za-z0-9_$]+ (or aligning with isIdentifierChar) ensures default imports that contain $ are matched as well.

These are non-breaking polish items; the existing logic already handles the PatternFly-style imports described in the PR.


763-790: Test helper exports are fine; consider a more specific FileInfo name

Exposing NormalizeMultilineImportsPublic, FindImportStatementsPublic, and IsIdentifierCharPublic for unit tests is a reasonable pattern, and the adapter from FileInfo to internal fileInfo keeps the production surface small.

One minor nit: naming this exported helper type FileInfo in the same package where os.FileInfo is in use can be slightly confusing at call sites. If you touch this again, consider a more specific name like ImportSearchFileInfo to make the purpose obvious.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f811ef and db79563.

📒 Files selected for processing (1)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (macos-latest, darwin, arm64)
🔇 Additional comments (4)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (4)

101-110: Syncing BaseConfig.WorkspaceFolders fixes earlier workspace-filtering bug

Copying sc.Config.WorkspaceFolders into sc.BaseConfig.WorkspaceFolders right after base initialization resolves the previous inconsistency where BaseConfig.WorkspaceFolders was never set but later used for filtering in EvaluateReferenced/EvaluateSymbols. The added init log is also useful for debugging startup issues.
Based on learnings


274-359: Definition/reference pipeline and file lifecycle look correct

The loop over importLocations cleanly avoids re-opening files (openedFiles), uses a bounded retry with backoff for textDocument/definition, deduplicates definitions via processedDefinitions, and aggregates references before closing all opened documents. The logging around failures is at a reasonable verbosity and the early-continue behavior keeps failures for one import from affecting others.

Also applies to: 361-367


444-446: Bounds check on BaseConfig.WorkspaceFolders removes panic risk

Adding the len(sc.BaseConfig.WorkspaceFolders) == 0 guard here brings EvaluateSymbols in line with the other call sites and prevents an index-out-of-range panic when no workspace folders are configured.
Based on learnings


748-759: Dependency overrides cleanly short-circuit unused dependency mechanisms

Overriding GetDependencies/GetDependenciesDAG to return empty maps for the NodeJS provider is a clear way to avoid the base implementation’s “dependency provider path not set” error and matches the provider’s actual needs. No issues here.

…erference

In normalizeMultilineImports, the "from" detection on newlines was using
result.String() which scans the entire accumulated file content. This could
match "from" tokens from previous imports or other code that falls within
the 50-character window, causing the current import to be prematurely
treated as complete.

Now restrict the check to content[importStart:i] to inspect only the
current import statement. This avoids spurious matches from earlier code
and eliminates the O(n²) behavior from repeatedly calling result.String().

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

510-512: Consider supporting combined default and named imports.

The current regex handles either named imports import { X } from "..." or default imports import X from "...", but does not capture combined patterns like import X, { Y } from "...". While this may be acceptable for the current use case, combined imports are valid JavaScript/TypeScript syntax.

If supporting combined imports is needed, consider updating the regex to:

-	importRegex := regexp.MustCompile(
-		`import\s+(?:\{([^}]*)\}|(\w+))\s+from\s+['"]([^'"]+)['"]`,
-	)
+	importRegex := regexp.MustCompile(
+		`import\s+(?:(\w+)\s*,\s*)?\{([^}]*)\}|(\w+)\s+from\s+['"]([^'"]+)['"]`,
+	)

Then adjust the group indices accordingly in the matching logic below.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db79563 and c777cfb.

📒 Files selected for processing (1)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: benchmark (windows-latest, windows)
🔇 Additional comments (8)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (8)

22-25: LGTM!

The package-level regex compilation is appropriate and efficient, avoiding repeated compilation during runtime.


97-106: LGTM!

The synchronization of BaseConfig.WorkspaceFolders with Config.WorkspaceFolders correctly addresses the consistency issue raised in previous reviews. The debug logging provides useful diagnostics.


142-154: LGTM!

The type definitions appropriately separate public API (ImportLocation) from internal implementation details (fileInfo).


307-320: LGTM!

The retry logic with exponential backoff (50ms, 100ms, 200ms) properly addresses the LSP indexing delay concern raised in previous reviews. This is a robust solution that avoids the fragility of a fixed sleep.


666-675: LGTM!

The escape sequence handling correctly counts consecutive backslashes to determine if a quote is escaped (odd count = escaped, even count = not escaped). This addresses the edge case of double-escaped quotes like \\" that was raised in previous reviews.


696-710: LGTM!

The "from" keyword detection is now properly scoped to the current import statement (using a snippet from importStart to current position) and uses a precise regex with word boundaries. This addresses the false positive concerns from previous reviews.


732-743: LGTM!

The utility functions are straightforward and correct. The isIdentifierChar function appropriately checks for valid JavaScript/TypeScript identifier characters including $ and _.


745-787: LGTM!

The stub methods correctly document that the nodejs provider doesn't use external dependency providers. The test helper pattern of exposing private methods through public wrappers is appropriate and maintains good encapsulation.

@tsanders-rh
Copy link
Contributor Author

new.nodejs.provider.mov

Copy link
Contributor

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Primary considerations that I think are important the rest of the review is nits.

  1. Handling the '*' import cases.
  2. I think the normalize function could use helper methods instead of iterating on characters.
  3. The look ahead in a loop, where we would already be looping through that I think can be simplified.
  4. Instead of using '/r' or '/n' directly use os.LineSeperator.
  5. Public methods vs private methods.

I would like to also consider, that this now removes the ability to search for things that are not imported in rules. I think that might be an issue for certain situations, and I think we should have a plan for adding that back if we do go down this path.

unmatched:
- file-002
- lang-ref-002
- node-sample-rule-002
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if all of the rule-example rules have this, but can we add a comment explaining why this rule should be unmatched? It helps when we come back later and wonder why it is unmatched and if that was a issue :)

@tsanders-rh
Copy link
Contributor Author

@shawn-hurley - Thank you for the super thorough and thoughtful review. Lots of really great feedback and things to consider. I'll start to address your important points one a a time and we can work through.

Extends import detection to handle all valid JavaScript/TypeScript patterns:
- Namespace imports: import * as Pattern from "package"
- Mixed default + named: import React, { useState } from "react"
- Mixed default + namespace: import React, * as All from "react"
- TypeScript type imports: import type { Card } from "package"

Updated regex pattern with 5 capture groups:
1. Default import (when mixed with named/namespace)
2. Named imports (braced list)
3. Default import (when standalone)
4. Namespace import
5. Package name

The regex now includes optional "type" keyword support for TypeScript
type-only imports, ensuring comprehensive coverage of modern JavaScript
and TypeScript import patterns.

Pattern matching logic updated to:
- Check all three import types (default, named, namespace)
- Use exact match for single identifiers (default, namespace)
- Use substring match for comma-separated lists (named imports)
- Added inline comments explaining matching strategy

Test coverage (25 test cases):
- Namespace imports (single line, multiline, multiple per file)
- Mixed default + named imports (both parts searchable, multiline)
- Mixed default + namespace imports (rare but valid)
- TypeScript type imports (named, default, namespace, multiline)
- Side-effect imports (correctly ignored)
- Word boundaries (avoid partial matches like "Card" vs "CardHelper")
- Special characters in package names (@scoped/packages)
- Stress test with 20 named imports in single statement
- Pattern not found scenarios

All 25 tests passing. Build successful.

Addresses code review feedback item #1.

Signed-off-by: tsanders <[email protected]>
Copy link
Contributor

@pranavgaikwad pranavgaikwad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns about textDocument/references call and running a text search twice:

  1. I am seeing that running a textDocument/references call for a location is very slow. In my work of adding Prepare() function, I improved this search by getting symbols only once using textDocument/documentSymbol per URI. That is a much faster way IMO.
  2. The base service client already runs a text search, then the nodejs does another text search just for import locations. I wonder if something like this should be coupled with Prepare() implementation where we run the text search only once per file and do all queries at once.

@tsanders-rh
Copy link
Contributor Author


I've addressed your feedback on import pattern handling.

Changes Made

Extended import detection to support all valid JavaScript/TypeScript patterns:

  1. Namespace imports:
    import * as PatternFly from '@patternfly/react-core';
  2. Mixed default + named imports:
    import React, { useState, useEffect } from 'react';
  3. Mixed default + namespace imports (rare but valid):
    import React, * as ReactAll from 'react';
  4. TypeScript type-only imports:
    import type { Card } from '@patternfly/react-core';
    import type * as Types from './types';

Implementation Details

Updated regex pattern:
import\s+(?:type\s+)?(?:(\w+)\s*,\s*)?(?:\{([^}]*)\}|(\w+)|\*\s+as\s+(\w+))\s+from\s+['"]([^'"]+)['"]

This now uses 5 capture groups to handle all import combinations:

  1. Default import (when mixed with named/namespace)
  2. Named imports (braced list)
  3. Default import (when standalone)
  4. Namespace import
  5. Package name

Pattern matching logic updated to:

  • Use exact match (==) for single identifiers (default, namespace)
  • Use substring match (Contains) for comma-separated lists (named imports)
  • Added inline comments explaining this distinction

Test Coverage

Added 25 comprehensive test cases covering:

  • ✅ Namespace imports (single line, multiline, multiple per file)
  • ✅ Mixed default + named (both parts searchable, multiline)
  • ✅ Mixed default + namespace
  • ✅ TypeScript type imports (all variants)
  • ✅ Side-effect imports (correctly ignored)
  • ✅ Word boundaries (avoid "Card" matching "CardHelper")
  • ✅ Special characters in package names (@scoped/packages)
  • ✅ Stress test with 20 named imports in one statement

All tests passing. Build successful.

Bonus: TypeScript Support

While investigating, I noticed we weren't handling TypeScript's type keyword for type-only imports. Since tackle2-ui and PatternFly are TypeScript
codebases, I added support for this pattern with minimal complexity (just (?:type\s+)? prefix in the regex).


Let me know if you'd like any additional test coverage or if there are other import patterns I should consider!

Commit: e85cc89

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)

70-371: findImportStatements tests are comprehensive but miss a subtle prefix-collision case.

The table covers default/named/multiline, namespace, mixed, type‑only, and side‑effect imports with line/column assertions, which is excellent. One edge case that would be useful to capture is a named list where the pattern is both a standalone symbol and a prefix of another symbol on the same line, e.g. import { CardFooter, Card } from '...' with pattern "Card"—this currently relies on the implementation correctly scanning past the first non‑word‑boundary hit.

Do you want to add a test like that once the underlying implementation is adjusted?

external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

191-230: Workspace scan logic is straightforward but walks the entire tree.

Using filepath.Walk with extension checks and skipping node_modules is clear and correct for TS/JS discovery. Just be aware that very large workspaces (e.g., including dist or build trees) could make EvaluateReferenced slower; if that becomes an issue later, you may want to consider reusing an existing source-file indexer or filtering by known source roots.


787-816: Testing helpers are pragmatic, though they slightly expand the public surface.

Exposing FileInfo plus NormalizeMultilineImportsPublic, FindImportStatementsPublic, and IsIdentifierCharPublic for tests is a reasonable tradeoff given the nodejs_test package. If you ever want to hide these again, you could move tests into the nodejs package and call the private helpers directly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c777cfb and e85cc89.

📒 Files selected for processing (2)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
  • NodeServiceClient (36-41)
  • FileInfo (790-793)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
🔇 Additional comments (14)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (5)

11-68: Good coverage for multiline import normalization.

The cases exercise single‑line, multiline, type imports, embedded newlines in strings, and comments; this is a solid regression suite for normalizeMultilineImports.


373-406: Identifier-char tests look correct and aligned with implementation.

The table nicely covers allowed vs disallowed characters for JavaScript identifiers (a, Z, digits, _, $ vs whitespace/punctuation). This matches the current isIdentifierChar helper behavior.


408-491: Word‑boundary tests correctly pin the intended semantics.

These cases ensure exact matches, reject prefix/suffix substrings, and validate behavior across multiple symbols in a list. This is a good contract for the search’s word‑boundary handling.


493-575: Edge‑case tests document current limitations clearly.

Using expectations of 1 for “import in comment/string” is a good way to encode the accepted limitation that the regex doesn’t filter comments/strings, while still checking other whitespace/quote variants.


577-631: Benchmark is well-structured and representative.

The benchmark uses a realistic TSX snippet, isolates setup from timing via b.ResetTimer(), and calls the search function in the loop without extra allocations. This is appropriate for tracking performance of the import scan.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9)

22-25: Precompiled fromKeywordRegex is a good improvement.

Hoisting the “from” detection regex into a package‑level var avoids repeated compilation inside the hot normalization path and makes the intent clear.


97-107: WorkspaceFolders synchronization and debug logging look correct.

Copying sc.Config.WorkspaceFolders into sc.BaseConfig.WorkspaceFolders here resolves the earlier inconsistency and ensures downstream filters use the same root. The added initialization log with root URI, workspace folders, and LSP server settings should be very helpful operationally.


142-154: New ImportLocation/fileInfo types are reasonable.

Separating the exported ImportLocation (URI, lang ID, position, line) from the internal fileInfo simplifies the import search pipeline and keeps the internal representation unexported.


232-355: Import → definition → references pipeline is well-structured.

The flow—fast import scan, per‑file LSP didOpen, definition lookup with bounded exponential backoff, deduped definition processing, and reference aggregation—is clear and robust. The retry loop (50/100/200ms) is a reasonable compromise for indexing delays, and the additional logging around definitions and references should make debugging much easier.


357-407: Reference filtering and incident construction are consistent with existing patterns.

The workspace and dependency filters reuse BaseConfig.WorkspaceFolders/DependencyFolders (now synchronized with Config), and incidents are deduped via JSON keys with line‑based locations, matching the style in EvaluateSymbols. This looks correct.


414-428: Final summary logging and Matched/Incidents behavior are appropriate.

The summary log (imports found, definitions processed, references, incidents) plus the Matched=false short path when no incidents give good observability without changing the external contract of nodejs.referenced.


634-759: Multiline-import normalization logic is solid and matches tests.

The scanner correctly:

  • Avoids matching import inside larger identifiers.
  • Tracks brace depth and strings (including escaped quotes with backslash parity).
  • Replaces newlines with spaces only until the current import is complete using fromKeywordRegex scoped to the current statement.

The edge cases encoded in tests (strings with \n, comments, long imports) are all addressed here.


761-772: isIdentifierChar/min helpers are simple and correct.

isIdentifierChar matches the JS identifier subset you rely on elsewhere, and min is a tiny local utility used only in normalization; both are straightforward.


774-785: GetDependencies/GetDependenciesDAG overrides correctly neutralize dependency-provider behavior.

Returning empty maps here avoids spurious “dependency provider path not set” errors for the NodeJS provider while keeping the signatures compatible with the base client.

Fixes a critical bug where import detection would miss valid matches
when the pattern appeared as a prefix of another symbol on the same line.

Example that was failing:
  import { CardFooter, Card } from '@patternfly/react-core';

When searching for "Card":
- Previous: Would find "Card" in "CardFooter", fail word boundary check,
  and stop searching - missing the standalone "Card" later on the line
- Fixed: Now searches all occurrences on the line until finding a match
  with valid word boundaries

Changes:
- Added inner loop to check multiple occurrences of pattern per line
- Added `found` flag to break from outer line-searching loop once matched
- Prevents false negatives when pattern is substring of other identifiers

Test coverage:
- Pattern as prefix: "CardFooter, Card" - now correctly finds "Card"
- Pattern in middle: "Button, Card, CardBody" - correctly finds "Card"

All 27 tests passing.

Addresses CodeRabbit review feedback.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

796-825: Public test helpers expand the API surface; consider moving tests into the same package.

FileInfo, NormalizeMultilineImportsPublic, FindImportStatementsPublic, and IsIdentifierCharPublic are exported only to support tests in package nodejs_test. That makes internal implementation details part of the public API, which is a Go anti‑pattern and creates long‑term compatibility obligations.

If you don’t strictly need black‑box tests, consider moving import_search_test.go into package nodejs and calling the unexported helpers (fileInfo, normalizeMultilineImports, findImportStatements, isIdentifierChar) directly, then removing these public wrappers and the exported FileInfo. This would keep the external API smaller and better aligned with the provider’s intended surface.

🧹 Nitpick comments (7)
test-patternfly-direct.sh (3)

70-73: Avoid ls | grep for RULE_COUNT (SC2010).

The RULE_COUNT pipeline (ls ... | grep -v ... | wc -l) is fragile and flagged by shellcheck. Prefer a glob or find-based approach that handles arbitrary filenames and avoids parsing ls output, e.g.:

RULE_COUNT=$(
  find "${RULESET_DIR}" -maxdepth 1 -type f -name '*.yaml' ! -name 'ruleset.yaml' 2>/dev/null |
    wc -l | tr -d ' '
)

This removes the subshell warning and is more robust with odd filenames.


23-29: Consider parameterizing local paths for wider reusability.

The script hard-codes user-specific paths (/Users/tsanders/...) and a Homebrew-specific typescript-language-server path, which limits reuse by other developers or CI.

Consider making these configurable via environment variables or script arguments (with sensible defaults), e.g., ANALYZER_LSP_DIR, TACKLE2_UI_DIR, and LSP_SERVER_PATH, so others can run the test without editing the script.

Also applies to: 90-107


146-151: Optional: Add a trap to always clean up the provider.

Right now the provider is killed on explicit failure paths and at the end of the script, but not if the script is interrupted (Ctrl‑C) or exits unexpectedly somewhere else.

Consider adding a simple trap on EXIT/INT that kills ${PROVIDER_PID} if set, to avoid leaving a stray provider listening on ${PROVIDER_PORT}.

Also applies to: 193-201, 210-211

CODE_REVIEW_RESPONSE_PLAN.md (1)

41-63: Keep the plan in sync with the implemented regex and test strategy.

The document still describes an import regex with extra namespace capture groups and helper-based normalization, and it proposes removing public test-only wrappers. The current implementation uses a slightly different 5‑group regex, inline helper logic in normalizeMultilineImports, and keeps the public wrappers (FileInfo, NormalizeMultilineImportsPublic, etc.).

To avoid confusion for future readers, either update this plan to reflect the final design (regex shape, helpers, and decision to use external tests with wrappers) or adjust the code to match the documented plan.

Also applies to: 348-385

NODEJS_REFERENCED_ANALYSIS.md (1)

398-406: Align RCA “Recommended Approach” with the new import-based implementation.

This RCA correctly diagnoses the workspace/symbol limitations and proposes an import‑based definition + references flow as a future solution, but the current code now implements that algorithm in EvaluateReferenced. It would help future readers if the “Recommended Approach” and comparison sections explicitly note that the import-based strategy is now implemented and describe when to still fall back to builtin.filecontent.

While you’re touching the doc, you may also want to add languages to fenced code blocks and de‑duplicate repeated headings per the markdownlint hints, but that’s optional.

Also applies to: 1009-1029

external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)

390-423: Clarify intent for imports inside comments/strings in future changes.

The edge‑case tests explicitly expect “imports” that appear in comments or string literals to be counted, with notes that this is an acceptable limitation. That’s fine for now, but if you later harden findImportStatements to ignore comments/strings, remember these tests will need to be updated to reflect the stricter behavior (and the inline comments can be revised accordingly).

Also applies to: 425-592

external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

191-245: Import-based EvaluateReferenced flow is sound and matches the intended design.

The new implementation cleanly separates concerns:

  • Walk workspace TS/JS files (skipping node_modules) into nodeFiles.
  • Use findImportStatements for a fast, non‑LSP import scan.
  • For each import, open the file once, resolve definitions via textDocument/definition with bounded exponential backoff, then collect references via GetAllReferences.
  • Deduplicate by definition (processedDefinitions) and by incident JSON key, and filter to workspace folders while excluding dependency folders.

This aligns well with the RCA and should avoid the previous workspace/symbol issues. The backoff window (50/100/200ms) is modest but reasonable; if you later see slow TS servers, you can make maxRetries/delays configurable.

Also applies to: 270-355, 357-420

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e85cc89 and dd4a62a.

📒 Files selected for processing (5)
  • CODE_REVIEW_RESPONSE_PLAN.md (1 hunks)
  • NODEJS_REFERENCED_ANALYSIS.md (1 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9 hunks)
  • test-patternfly-direct.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
📚 Learning: 2025-07-30T12:11:45.673Z
Learnt from: pranavgaikwad
Repo: konveyor/analyzer-lsp PR: 859
File: external-providers/java-external-provider/pkg/java_external_provider/dependency.go:694-694
Timestamp: 2025-07-30T12:11:45.673Z
Learning: In the Java external provider dependency walker (external-providers/java-external-provider/pkg/java_external_provider/dependency.go), errors from toDependency function calls should be ignored as they are not considered important by the maintainers.

Applied to files:

  • NODEJS_REFERENCED_ANALYSIS.md
🧬 Code graph analysis (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
  • NodeServiceClient (36-41)
  • FileInfo (799-802)
🪛 LanguageTool
CODE_REVIEW_RESPONSE_PLAN.md

[style] ~121-~121: To elevate your writing, try using a synonym here.
Context: ... statement end detection This makes it hard to read, test, and maintain. **Planned...

(HARD_TO)

NODEJS_REFERENCED_ANALYSIS.md

[grammar] ~569-~569: Use a hyphen to join words.
Context: .../tackle2-ui/client - Compare symbol finding results ### Investigation Phase...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.18.1)
NODEJS_REFERENCED_ANALYSIS.md

208-208: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


258-258: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


265-265: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


273-273: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


349-349: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


677-677: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


757-757: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


1032-1032: Multiple headings with the same content

(MD024, no-duplicate-heading)


1051-1051: Multiple headings with the same content

(MD024, no-duplicate-heading)

🪛 Shellcheck (0.11.0)
test-patternfly-direct.sh

[warning] 70-70: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: benchmark (macos-latest, mac)
🔇 Additional comments (8)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (3)

11-68: normalizeMultilineImports tests look comprehensive.

The test matrix here exercises single‑line, multiline, type‑only, string‑literal, and comment cases and matches the current normalization behavior. This gives good confidence that future changes to normalizeMultilineImports will be caught.


70-388: findImportStatements tests thoroughly capture edge cases.

The table‑driven tests cover named/default/mixed imports, namespace imports (including multiline), type‑only imports, word boundaries, side‑effect imports, and stress cases. They closely mirror the supported patterns in findImportStatements, which is exactly what we want for guarding the regex and boundary logic.


594-648: Benchmark is realistic and targeted.

The benchmark uses a representative React/PatternFly snippet and drives FindImportStatementsPublic in a tight loop, which should give a meaningful signal if future regex or normalization changes regress performance.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (5)

97-107: WorkspaceFolders sync and init logging look correct.

Copying Config.WorkspaceFolders into BaseConfig.WorkspaceFolders at init time fixes the earlier inconsistency and makes the later workspace‑based filters safe. The added init log with rootURI, workspace folders, and LSP path/name will also be very helpful when diagnosing misconfiguration.


365-407: Reference filtering and incident construction are consistent with existing patterns.

The workspace filter now guards against empty WorkspaceFolders and uses the configured dependency folders to drop incidents in dependency paths before constructing IncidentContext entries. The JSON‑keyed deduplication mirrors the existing EvaluateSymbols logic, so behavior should be predictable for downstream consumers.


486-641: Import scanning and position resolution look correct and robust.

findImportStatements:

  • Uses a single regex that handles named, default, namespace, mixed, and import type patterns.
  • Distinguishes default vs named vs namespace via capture groups and only proceeds when the pattern is actually present in one of them.
  • Normalizes multiline imports but preserves a 1:1 character mapping, then maps importStart back into original lines using a char counter.
  • Scans each candidate line for all occurrences of the pattern with identifier‑boundary checks, fixing the previously noted false‑negative case when the pattern is a prefix of another symbol on the same line.

Given the extensive tests in import_search_test.go, this looks solid.


643-781: normalizeMultilineImports handles strings, braces, and “from” detection correctly.

The normalizer:

  • Tracks string state with robust escape counting, avoiding premature termination on \\" sequences.
  • Tracks brace depth outside strings and only considers an import “complete” when braceDepth is zero and the current import snippet matches fromKeywordRegex.
  • Restricts “from” detection to the current import snippet instead of the whole file, which avoids cross‑statement interference.
  • Preserves overall character positions by replacing newlines within imports with spaces or a newline at the end of the statement.

This matches the documented goals and should support the regex‑based import scan well.


783-794: Empty dependency overrides are appropriate for this provider.

Overriding GetDependencies and GetDependenciesDAG to return empty maps is a clean way to satisfy the interface and avoid the “dependency provider path not set” error for NodeJS, which doesn’t use an external dependency provider.

Addresses CodeRabbit feedback:
- Remove unused LOG_FILE variable
- Replace with CONSOLE_LOG for consistency
- Update all references to use correct log file variables

The script now correctly references console.log throughout.

Signed-off-by: tsanders <[email protected]>
These files are local development artifacts:
- CODE_REVIEW_RESPONSE_PLAN.md - planning document
- NODEJS_REFERENCED_ANALYSIS.md - analysis notes
- test-patternfly-direct.sh - local test script

Keeping them local only.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
test-patternfly-direct.sh (3)

70-70: Use glob or for-loop instead of ls | grep to avoid shellcheck SC2010.

Piping ls to grep can fail with special filenames and is flagged by shellcheck. Use a glob-based approach instead:

-RULE_COUNT=$(ls -1 "${RULESET_DIR}"/*.yaml 2>/dev/null | grep -v ruleset.yaml | wc -l | tr -d ' ')
+RULE_COUNT=0
+for f in "${RULESET_DIR}"/*.yaml; do
+  [[ "$(basename "$f")" != ruleset.yaml ]] && ((RULE_COUNT++))
+done

129-129: Quote the PID variable for consistency and safety.

-        kill -9 ${PID} 2>/dev/null || true
+        kill -9 "${PID}" 2>/dev/null || true

148-148: Quote PROVIDER_PORT for consistency with the rest of the script.

-"${PROVIDER_BIN}" --port ${PROVIDER_PORT} > "${PROVIDER_LOG}" 2>&1 &
+"${PROVIDER_BIN}" --port "${PROVIDER_PORT}" > "${PROVIDER_LOG}" 2>&1 &
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd4a62a and 9219054.

📒 Files selected for processing (1)
  • test-patternfly-direct.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
test-patternfly-direct.sh

[warning] 70-70: Don't use ls | grep. Use a glob or a for loop with a condition to allow non-alphanumeric filenames.

(SC2010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (windows-latest, windows)
🔇 Additional comments (1)
test-patternfly-direct.sh (1)

1-259: Overall script structure and error handling look solid.

The script is well-organized with clear pre-flight checks, error handling, user feedback via color output, and proper cleanup. The past review's LOG_FILE issue appears to be resolved (now using CONSOLE_LOG consistently). The hardcoded provider log patterns at line 241 fail gracefully if absent, which is acceptable for optional performance metrics.

Please confirm that the hardcoded log patterns at line 241 (e.g., "import-based search complete") are actually emitted by the provider logs so that users see meaningful performance metrics; if these patterns are unreliable or rarely present, consider simplifying this section or adding a fallback message.

Extracts complex logic into well-documented helper functions for better
maintainability and testability.

**New helper functions:**
- `isStringDelimiter()` - Checks for quotes and backticks
- `isEscapedQuote()` - Counts preceding backslashes to detect escaped quotes
- `hasCompletedImportStatement()` - Detects "from" keyword to identify complete imports
- `normalizeImportStatement()` - Processes individual import statement normalization

**Main function improvements:**
- `normalizeMultilineImports()` now delegates to helpers
- Added `\r\n` → `\n` preprocessing for cross-platform compatibility
- Simplified switch statement in import processing
- Removed deeply nested "from" detection logic

**Benefits:**
- Each helper is independently testable
- Main function logic is clearer (26 lines vs 100 lines)
- Better separation of concerns
- Improved code documentation

**Cross-platform line ending handling:**
- Normalizes Windows `\r\n` to Unix `\n` at start
- Simplifies processing to only check `\n` in switch
- Added comments explaining line ending approach

**Test coverage:**
Added 5 new test cases:
- Windows line endings (CRLF) normalization
- Mixed line endings normalization
- Escaped quotes in import strings
- Double backslash before quote
- Template literals with backticks

All 10 normalization tests passing + 27 import tests passing.

Addresses code review feedback items #2, konveyor#3, and konveyor#4.

Signed-off-by: tsanders <[email protected]>
@tsanders-rh
Copy link
Contributor Author

Thanks for the feedback on improving the normalizeMultilineImports function! I've refactored it with helper methods and addressed the line
separator concerns.

Changes Made

  1. Extracted Helper Functions

The ~100-line function is now split into focused, testable helpers:

// String handling helpers
isStringDelimiter(ch byte) bool
isEscapedQuote(content string, pos int) bool

// Import detection helper
hasCompletedImportStatement(content, importStart, currentPos int) bool

// Import processing helper
normalizeImportStatement(result, content, pos, importStart int) int

  1. Simplified Main Function

normalizeMultilineImports is now 26 lines (down from 100), focusing on:

  • Preprocessing: \r\n → \n normalization for cross-platform compatibility
  • Finding "import" keywords
  • Delegating to normalizeImportStatement helper
  1. Cleaner Import Processing

The deeply nested "from" detection logic (lines 694-710) is now extracted to hasCompletedImportStatement:

Before:
if braceDepth == 0 && i > importStart+6 {
snippet := content[importStart:min(i+1, len(content))]
if len(snippet) > 10 {
start := len(snippet) - min(50, len(snippet))
last50 := snippet[start:]
if fromKeywordRegex.MatchString(last50) {
// ...
}
}
}

After:
if braceDepth == 0 && sc.hasCompletedImportStatement(content, importStart, i) {
result.WriteByte('\n')
i++
return i
}

  1. Cross-Platform Line Ending Handling

Instead of checking both \r and \n throughout, we now:

  • Normalize \r\n → \n at the start of processing
  • Only check \n in the switch statement
  • Added comment explaining the approach

func (sc *NodeServiceClient) normalizeMultilineImports(content string) string {
// Normalize Windows line endings to Unix for consistent processing
content = strings.ReplaceAll(content, "\r\n", "\n")
// ... rest of function only handles \n
}

Benefits

  • Testability: Each helper can be tested independently
  • Readability: Main function flow is clear at a glance
  • Maintainability: Changes to escape handling or "from" detection are isolated
  • Documentation: Each function has clear purpose and comments

Test Coverage

Added 5 new test cases specifically for the helper improvements:

  • ✅ Windows line endings (CRLF) normalization
  • ✅ Mixed line endings normalization
  • ✅ Escaped quotes in import strings (")
  • ✅ Double backslash before quote (\")
  • ✅ Template literals with backticks

All 10 normalization tests + 27 import tests passing.

Code Stats

  • Main function: 100 lines → 26 lines (74% reduction)
  • Total added: 4 well-documented helper functions
  • Test coverage: 5 → 10 test cases for normalization

Commit: e139b38

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (4)

191-229: Harden workspace path handling and multi-root filtering

The overall flow (scan workspace → resolve imports → references → filter to workspace/deps) is solid, but there are two robustness points worth tightening:

  1. Path conversion for filepath.Walk

    • folder := strings.TrimPrefix(sc.Config.WorkspaceFolders[0], "file://") assumes a file:// URI and does a raw string trim.
    • For robustness (URI encoding, Windows paths, future non-file schemes) it would be safer to parse as a URI and derive the filesystem path, e.g. via uri.Parse + u.Filename() before calling filepath.Walk.
  2. Workspace filtering on references

    • The filter uses only WorkspaceFolders[0] and strings.Contains(ref.URI, sc.BaseConfig.WorkspaceFolders[0]). In multi-root workspaces or when one root is a substring of another path, this can misclassify references.
    • Consider iterating all configured workspace folders and using a stricter check (e.g., URI-normalized HasPrefix or parsing to uri.URI and comparing by path root) to avoid false positives/negatives.

These changes would make the algorithm more resilient without affecting the happy path behavior you validated.

Also applies to: 365-387


274-355: Consider making LSP retry/backoff configurable and context-aware

The definition resolution loop with exponential backoff (50/100/200ms) is a big improvement over a fixed sleep, but it’s still fully hardcoded:

  • Different language servers or project sizes may need longer or shorter backoff.
  • The time.Sleep calls don’t consult ctx, so cancellation is only honored at the next RPC.

If you want to future‑proof this path, consider:

  • Surfacing retry count and initial backoff via config (or at least constants), and
  • Using a select on ctx.Done() vs time.After(delay) to avoid sleeping after cancellation.

Not blocking for this PR, but it would make this hot path more tunable and responsive.


507-641: Import search and normalization are well thought out; a couple of small refinements

The import scanning + normalization logic is nicely structured and covers a lot of real‑world cases (multiline, mixed default/named/namespace, type imports, word boundaries). Two minor improvement opportunities:

  1. Reuse compiled import regex
    importRegex := regexp.MustCompile(...) is allocated on every findImportStatements call. Since the pattern is static, moving it to a package‑level var similar to fromKeywordRegex would avoid repeated compilation in large workspaces:
-var (
-	// fromKeywordRegex matches "from" as a standalone keyword followed by a quote
-	fromKeywordRegex = regexp.MustCompile(`\bfrom\s+["']`)
-)
+var (
+	// fromKeywordRegex matches "from" as a standalone keyword followed by a quote
+	fromKeywordRegex = regexp.MustCompile(`\bfrom\s+["']`)
+
+	// importRegex matches ES module import statements including type-only, default,
+	// named, and namespace forms.
+	importRegex = regexp.MustCompile(
+		`import\s+(?:type\s+)?(?:(\w+)\s*,\s*)?(?:\{([^}]*)\}|(\w+)|\*\s+as\s+(\w+))\s+from\s+['"]([^'"]+)['"]`,
+	)
+)
  1. Optional: skip scanning far past the import when mapping back to lines
    When mapping from normalized indices back to original lines, you currently search up to 20 lines ahead. That’s safe and simple, but if you ever want to tighten it, you could track how far the import actually spans (e.g., via brace depth or hasCompletedImportStatement) and limit the search window accordingly.

The current behavior is correct and well covered by tests; these are small efficiency/maintainability tweaks.

Also applies to: 668-780, 810-836


838-880: Public testing helpers slightly expand the API surface

FileInfo, NormalizeMultilineImportsPublic, FindImportStatementsPublic, and IsIdentifierCharPublic are clearly intended only for tests, but they now become part of the exported surface of the nodejs package.

Two possible alternatives:

  • Move the tests into the same package (package nodejs) and call the unexported helpers directly; or
  • Keep the public wrappers but gate them with a comment like “// For tests only; not part of the supported API” and avoid using them from production code.

Not required for this PR, but tightening the exposed surface will make future refactors of the import logic easier.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (2)

95-373: Import search tests are comprehensive and accurately target past edge cases

The TestFindImportStatements, TestImportWordBoundaries, and TestImportEdgeCases suites collectively:

  • Cover named/default/namespace/mixed/type-only imports, multiline cases, and stress scenarios.
  • Assert precise line/column positions, including the previously tricky “pattern as prefix of another symbol on the same line”.
  • Explicitly document current limitations (e.g., treating imports in comments/strings as matches).

This is excellent coverage for the new import-based algorithm and will prevent regressions. If you later decide to ignore commented/string “imports”, you can just relax the expectations in TestImportEdgeCases; the tests already call out that behavior as an implementation detail.

Also applies to: 450-533, 535-617, 619-673


415-448: Clarify subtest names in TestIsIdentifierChar

Using t.Run(string(tt.char), ...) works but produces subtest names with control characters like newline and tab, which can make test output harder to read.

You might consider giving explicit names or hex representations instead:

-	for _, tt := range tests {
-		t.Run(string(tt.char), func(t *testing.T) {
+	for _, tt := range tests {
+		name := fmt.Sprintf("char_%q_0x%X", tt.char, tt.char)
+		t.Run(name, func(t *testing.T) {
 			result := sc.IsIdentifierCharPublic(tt.char)
 			if result != tt.expected {
 				t.Errorf("isIdentifierChar(%q) = %v, want %v", tt.char, result, tt.expected)
 			}
 		})
 	}

Not critical, but it will make failing cases easier to interpret.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9219054 and e139b38.

📒 Files selected for processing (2)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
  • NodeServiceClient (36-41)
  • FileInfo (854-857)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (windows-latest, windows)
🔇 Additional comments (3)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

97-106: BaseConfig workspace sync and init logging look good

Synchronizing BaseConfig.WorkspaceFolders with Config.WorkspaceFolders here should resolve the earlier inconsistency in reference filtering, and the structured init log is useful for diagnosing LSP setup issues. No changes needed.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (2)

11-93: Normalization tests give good confidence in multiline handling

The TestNormalizeMultilineImports table nicely exercises single-line, multiline, CRLF/mixed endings, comments, and tricky string/escape cases. This should make refactoring normalizeMultilineImports and its helpers safe. No changes needed.


619-673: Benchmark is well-structured and representative

BenchmarkFindImportStatements uses a realistic TSX snippet with multiple imports and resets the timer before the loop, so it should give a useful signal if the import search logic regresses performance-wise. No changes needed.

Follow Go testing conventions by accessing private methods directly
from tests in the same package. This eliminates unnecessary public
wrappers that were only used for testing.

Changes:
- Changed import_search_test.go package from nodejs_test to nodejs
- Removed FileInfo exported struct
- Removed NormalizeMultilineImportsPublic wrapper
- Removed FindImportStatementsPublic wrapper
- Removed IsIdentifierCharPublic wrapper
- Updated all test calls to use private methods directly

All import search tests pass (37 test cases across 5 test functions).

Addresses code review feedback on public vs private method exposure.

Signed-off-by: tsanders <[email protected]>
@tsanders-rh
Copy link
Contributor Author

Changes in commit 101422c:

Removed unnecessary public test wrapper methods and followed Go testing conventions by accessing private methods directly from tests in the same
package.

What changed:

  1. Changed import_search_test.go package declaration from package nodejs_test to package nodejs
  2. Removed 4 public wrapper methods that existed only for testing:
    - NormalizeMultilineImportsPublic → direct call to normalizeMultilineImports
    - FindImportStatementsPublic → direct call to findImportStatements
    - IsIdentifierCharPublic → direct call to isIdentifierChar
    - FileInfo exported struct → use private fileInfo struct
  3. Updated all test method calls to use private methods directly

Why this is better:

  • Follows Go conventions: Tests in the same package can access unexported (private) identifiers directly - no wrappers needed
  • Cleaner API surface: Removes 35 lines of code that served no purpose beyond testing
  • No functionality change: All 37 test cases still pass - this is purely organizational

Test coverage:

  • ✅ 10 normalization tests
  • ✅ 27 import finding tests
  • ✅ 12 identifier character tests
  • ✅ 6 word boundary tests
  • ✅ 7 edge case tests

This addresses the code review feedback about exposing implementation details through public methods.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (5)

191-224: Consider tightening or making file scan exclusions configurable

The recursive filepath.Walk over the entire workspace skips only node_modules. In large monorepos this could get expensive (e.g., dist, .git, build artifacts). Not urgent, but consider either:

  • Reusing any existing file filters in the base client, or
  • Adding configurable/externalized exclusion patterns beyond node_modules.

232-355: LSP open/definition/references flow is generally solid; minor robustness tweaks possible

The flow of:

  • lazily didOpen per source file,
  • retrying textDocument/definition with short exponential backoff, and
  • using processedDefinitions to dedupe

looks correct and addresses the earlier fixed-sleep concern. Two small robustness nits you could consider:

  • Check ctx.Err() before sleeping in the retry loop so a cancelled context doesn’t wait the extra backoff.
  • Optionally log the “no definitions but no error” case at V(1) to aid debugging silent misses.

These are polish items; the current behavior is acceptable.


365-407: Workspace/dependency filtering assumes first workspace folder; consider multi-root support

The filtering:

if len(sc.BaseConfig.WorkspaceFolders) == 0 { continue }
if !strings.Contains(ref.URI, sc.BaseConfig.WorkspaceFolders[0]) { continue }

plus dependency-folder exclusion matches the existing EvaluateSymbols behavior, but it effectively ignores any additional workspace folders. If multi-root workspaces matter for this provider, consider iterating over all configured workspace folders instead of only index 0.


393-407: Incident construction consistent with existing code, but character positions are dropped

The incident mapping mirrors EvaluateSymbols (same LineNumber and Location shape), which is good for consistency. Note that StartPosition/EndPosition set only Line and always leave Character at 0. If downstream consumers ever want column-granular UX, you may eventually want to plumb ref.Range.Start.Character through as well.


507-641: findImportStatements: coverage is strong; watch identifier/charset limitations

The regex + normalization + word-boundary search covers a wide variety of import forms (default, named, namespace, mixed, and type imports) and the multiple-occurrence scan fixes the earlier Card/CardFooter false negative. Two non-blocking observations:

  • \w+ in the regex excludes $ and non-ASCII identifier characters, even though both are valid in JS/TS identifiers. If you need to support imports like import $ from 'jquery'; or Unicode identifiers, you may want to relax this (e.g., custom character class aligned with isIdentifierChar).
  • strings.Contains(namedImports, pattern) can still set patternFound when the only occurrences are substrings (e.g., CardBody). The subsequent word-boundary search safely eliminates false positives, but you could tighten patternFound to check boundaries as well to avoid some unnecessary scans.

Given current use cases, this is fine; consider adjustments only if you see real-world misses.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (2)

94-411: findImportStatements tests are thorough; consider asserting more than the first match

The matrix of cases (named/default/multiline, namespace, mixed imports, type imports, word-boundary scenarios, and the Card/CardFooter regression) is excellent. One possible enhancement: in scenarios where multiple valid import locations could exist, extending assertions to cover all returned locations (not just locations[0]) would make the tests resilient to future changes in ordering.


531-613: Edge-case tests document known limitations around comments/strings

The expectations for imports inside comments and strings being treated as matches (with explanatory comments) are useful and align with the current implementation. It might be worth calling out in provider-level docs that these pseudo-import matches are further filtered by LSP (i.e., usually yield no definitions), so they don’t create false-positive incidents, just minor extra work.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e139b38 and 101422c.

📒 Files selected for processing (2)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1 hunks)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)
  • NodeServiceClient (36-41)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (4)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
🔇 Additional comments (10)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (6)

97-107: WorkspaceFolders synchronization and init logging look correct

Syncing sc.BaseConfig.WorkspaceFolders from sc.Config.WorkspaceFolders after base client construction fixes the earlier inconsistency and makes the later workspace-based filtering coherent with the scan root. The added init log is also useful for diagnosing misconfigured roots.


431-484: WorkspaceFolders bounds check prevents panics

The added len(sc.BaseConfig.WorkspaceFolders) == 0 guard before indexing [0] in EvaluateSymbols fixes the prior panic risk while preserving existing behavior for non-empty configs.


667-780: normalizeMultilineImports / normalizeImportStatement handle edge cases well

The refactored normalization:

  • Normalizes CRLF → LF,
  • Skips “import” inside larger identifiers,
  • Preserves string literals via isStringDelimiter/isEscapedQuote,
  • Uses hasCompletedImportStatement + fromKeywordRegex to decide when to stop collapsing newlines.

This matches the documented behavior and should be robust for typical TS/JS imports, including multiline and type imports. The separation into helpers also makes future tweaks (e.g., additional termination conditions) straightforward.


782-804: String escape handling is now correct for sequences like \\"

The new isEscapedQuote that counts preceding backslashes and uses parity to determine escaping correctly handles both single and double backslashes before quotes. This addresses the previous edge-case concern around sequences like \\".


806-828: fromKeywordRegex + scoped snippet nicely avoid cross-statement “from” interference

hasCompletedImportStatement now:

  • Restricts inspection to the current import slice (content[importStart:currentPos]),
  • Uses fromKeywordRegex to match from as a standalone word followed by a quote.

This avoids matching from in other parts of the file or inside identifiers/strings and fixes the earlier false-positive risk in multi-import files.


838-849: Dependency overrides are appropriate for this provider

Overriding GetDependencies and GetDependenciesDAG to return empty maps is a clean way to avoid the base “dependency provider path not set” error for nodejs, and keeps the interface surface consistent for callers.

external-providers/generic-external-provider/pkg/server_configurations/nodejs/import_search_test.go (4)

10-91: normalizeMultilineImports tests give good coverage of tricky cases

The table-driven tests exercise single-line, multiline, CRLF/mixed endings, comments, escaped quotes (including \" and \\\"), and backticks. This gives strong confidence that the normalization logic won’t regress on common TS/JS formatting patterns.


414-444: isIdentifierChar tests align with implementation

The positive/negative character set in TestIsIdentifierChar matches the helper’s current ASCII-focused definition and should catch most accidental regressions in identifier-boundary logic.


447-529: Word-boundary tests clearly lock in the intended matching semantics

TestImportWordBoundaries nicely captures the desired behavior for avoiding substring matches (prefix/suffix) while still matching first/middle/last positions in a named import list. This is a good safeguard around the search loop behavior in findImportStatements.


615-669: Benchmark is realistic and appropriate

The benchmark uses a representative TSX snippet with multiple imports and JSX, which should give a meaningful signal about findImportStatements performance over time.

Make workspace filtering conditional based on the analysis mode setting.
This allows users to control whether they want comprehensive analysis
or source-only analysis.

Behavior:
- source-only mode: Filter references to workspace folders only,
  excluding dependency folders (existing behavior preserved)
- full mode: Include all references regardless of location

Changes:
- Added AnalysisMode field to NodeServiceClient
- Store analysis mode from InitConfig during initialization
- Apply workspace filtering only when AnalysisMode is source-only
- Log analysis mode in initialization output

This addresses the code review feedback about making workspace
filtering configurable based on the user's analysis intent.

Addresses shawn-hurley's feedback on workspace filtering.

Signed-off-by: tsanders <[email protected]>
Refactor to follow the same pattern as builtin provider, building results
incrementally during filesystem traversal rather than collecting all file
paths first then searching them.

Changes:
- Extract findImportStatementsInFile to search a single file
- Call findImportStatementsInFile during filepath.Walk for each file
- Accumulate results incrementally as files are processed
- Simplify findImportStatements to delegate to findImportStatementsInFile
  for backward compatibility with tests
- Improve logging to show files processed and imports found

Benefits:
- Single pass through filesystem (more efficient)
- Follows established pattern in builtin provider
- More memory efficient (don't hold all file paths in memory)
- Cleaner separation: one function per file, one function for batches

No functional changes - all 37 import search tests pass.

Addresses shawn-hurley's feedback about following builtin provider pattern.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

523-526: Consider compiling the import regex once at package level.

The import regex is compiled on every call to findImportStatementsInFile, which can impact performance when scanning hundreds of files. Similar to the fromKeywordRegex at line 24, this regex could be compiled once as a package-level variable to eliminate repeated compilation overhead.

Apply this diff to move the regex to package level:

 var (
 	// fromKeywordRegex matches "from" as a standalone keyword followed by a quote
 	fromKeywordRegex = regexp.MustCompile(`\bfrom\s+["']`)
+	// importRegex matches all valid import statement patterns
+	importRegex = regexp.MustCompile(`import\s+(?:type\s+)?(?:(\w+)\s*,\s*)?(?:\{([^}]*)\}|(\w+)|\*\s+as\s+(\w+))\s+from\s+['"]([^'"]+)['"]`)
 )

Then at line 524-526, replace the regexp.MustCompile call with:

-	// Regex to match all valid import statement patterns
-	importRegex := regexp.MustCompile(
-		`import\s+(?:type\s+)?(?:(\w+)\s*,\s*)?(?:\{([^}]*)\}|(\w+)|\*\s+as\s+(\w+))\s+from\s+['"]([^'"]+)['"]`,
-	)

586-586: Consider documenting the 20-line search window limit.

The hardcoded 20-line search window is a reasonable default for typical import statements, but exceptionally large multiline imports might be missed. Consider adding a brief comment explaining this tradeoff, or making it configurable if you encounter real-world cases where imports span more than 20 lines.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f862a2 and b68e524.

📒 Files selected for processing (1)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (10 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T16:25:40.903Z
Learnt from: shawn-hurley
Repo: konveyor/analyzer-lsp PR: 860
File: lsp/base_service_client/base_service_client.go:212-214
Timestamp: 2025-10-21T16:25:40.903Z
Learning: In `lsp/base_service_client/base_service_client.go`, when an RPC connection is provided via `c.RPC` (not nil), it is treated as already initialized and no handler setup is performed. This is known technical debt that needs to be addressed in the future to ensure LSP notifications like `textDocument/publishDiagnostics` are properly processed.

Applied to files:

  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
provider/provider.go (6)
  • AnalysisMode (116-116)
  • Position (324-342)
  • Location (319-322)
  • IncidentContext (309-317)
  • SourceOnlyAnalysisMode (120-120)
  • Dep (482-482)
lsp/protocol/tsprotocol.go (4)
  • DefinitionParams (830-834)
  • TextDocumentPositionParams (4499-4504)
  • TextDocumentIdentifier (4478-4481)
  • Range (3559-3564)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: test
  • GitHub Check: test (windows-latest, windows, amd64)
  • GitHub Check: test (macos-latest, darwin, amd64)
  • GitHub Check: test (macos-latest, darwin, arm64)
  • GitHub Check: test (ubuntu-latest, linux, arm64)
  • GitHub Check: test (ubuntu-latest, linux, amd64)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: benchmark (ubuntu-latest, linux)
🔇 Additional comments (14)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (14)

22-25: LGTM: Global regex compilation is efficient and correct.

Compiling the regex once at package initialization avoids repeated compilation overhead. The pattern \bfrom\s+["'] correctly matches "from" as a standalone word followed by a quote, addressing the false-positive concerns from earlier reviews.


41-41: LGTM: AnalysisMode field enables flexible filtering behavior.

This field correctly stores the analysis mode from initialization and enables conditional workspace filtering at lines 384-404, allowing full-mode analysis to include all references while source-only mode filters to workspace folders.


98-111: LGTM: Initialization properly synchronizes configurations and adds useful logging.

The synchronization of BaseConfig.WorkspaceFolders from Config.WorkspaceFolders (lines 98-100) correctly addresses the critical consistency issue identified in past reviews where lines 360-438 relied on BaseConfig but it was never initialized. The AnalysisMode storage and debug logging are valuable additions for runtime behavior control and troubleshooting.


147-159: LGTM: Type definitions are well-structured and appropriately scoped.

ImportLocation is correctly exported for testing purposes with all necessary fields (FileURI, LangID, Position, Line) to track import statement locations. The unexported fileInfo struct appropriately encapsulates internal file tracking. Field types align well with LSP protocol types.


184-260: LGTM: File scanning setup is efficient and well-structured.

The approach of using filepath.Walk with filtering for TypeScript/JavaScript files and skipping node_modules is appropriate. Early validation of workspace folders and immediate return when no imports are found avoids unnecessary LSP operations. The logging provides good observability into the scanning process.


262-368: LGTM: LSP integration with retry logic is robust and well-designed.

The exponential backoff retry strategy (lines 320-333) correctly addresses the past review concern about hardcoded delays, giving the LSP server time to index newly opened files while failing gracefully after 3 attempts. The use of tracking maps for deduplication (processedDefinitions, openedFiles) prevents redundant LSP calls and ensures proper cleanup. Error handling appropriately logs issues and continues processing remaining imports rather than failing the entire operation.


370-447: LGTM: Reference filtering and incident creation are correctly implemented.

The mode-aware filtering (lines 384-404) properly distinguishes between source-only and full analysis modes, with appropriate bounds checking before accessing WorkspaceFolders[0] (line 386). The JSON-based deduplication strategy (line 423-424) effectively handles duplicate references. File cleanup (lines 370-376) ensures LSP resources are released, and the comprehensive summary logging (lines 432-437) provides excellent observability into the operation's results.


458-460: LGTM: Bounds check properly guards workspace folder access.

The addition of the bounds check before accessing WorkspaceFolders[0] prevents potential panics and aligns with the defensive programming practices applied elsewhere in this file.


638-647: LGTM: Helper method provides clean testing interface.

This delegation method simplifies testing by batching multiple file searches. The implementation correctly aggregates results from individual file searches.


673-705: LGTM: Multiline import normalization is well-refactored and efficient.

The CRLF-to-LF normalization (line 675) ensures consistent processing across platforms. The word-boundary check (line 686) correctly prevents false matches within identifiers. The refactoring into smaller helper functions (as noted in past reviews) has resulted in clear, maintainable code with good performance characteristics (builder preallocation at line 678).


724-786: LGTM: Import statement normalization handles edge cases correctly.

The state machine correctly handles string literals (lines 734-750), brace depth (lines 754-761), and statement boundaries. The use of isEscapedQuote (line 740) properly handles escape sequences including double-backslashes, addressing past review concerns. The scoped completion check (line 765 calling hasCompletedImportStatement) prevents cross-statement interference as resolved in earlier reviews.


788-842: LGTM: Helper functions are correct and well-tested.

All helper functions are properly implemented:

  • isIdentifierChar correctly identifies JavaScript identifier characters including $ and _
  • isStringDelimiter handles all string types including template literals
  • isEscapedQuote correctly counts preceding backslashes to determine escape status (addressing past review feedback)
  • hasCompletedImportStatement uses proper scoping and the compiled regex to detect completion
  • min is a standard utility implementation

These utilities are well-factored and support the main parsing logic effectively.


844-855: LGTM: Dependency stub methods appropriately return empty results.

These stubs correctly override the base implementation to return empty maps instead of errors, which is the appropriate behavior when the nodejs provider doesn't use external dependency analysis. The documentation clearly explains the rationale.


161-855: Excellent implementation of import-based search with thorough edge-case handling.

This implementation successfully addresses the TypeScript LSP workspace/symbol limitations by introducing a three-phase approach (import scanning → definition resolution → reference gathering). The code demonstrates several strengths:

Algorithm design:

  • Fast regex-based import scanning avoids expensive LSP operations on all files
  • Exponential backoff retry handles LSP indexing delays gracefully
  • Deduplication at multiple levels (definitions, references, incidents) prevents redundant work
  • Mode-aware filtering supports both source-only and full analysis

Code quality:

  • Comprehensive edge-case handling: multiline imports, escape sequences, word boundaries, mixed import types
  • Robust error handling that logs issues but continues processing
  • Well-factored helper functions improve testability and maintainability
  • Excellent observability through structured logging at appropriate verbosity levels

Past review integration:

  • All critical issues from past reviews have been properly addressed
  • Code reflects iterative improvements based on feedback

The implementation is production-ready. The only suggestions are optional performance optimizations (moving regex compilation to package level) and documentation enhancements (noting the 20-line search window).

@tsanders-rh
Copy link
Contributor Author

@pranavgaikwad - As we chatted in slack, you're absolutely right that the current per-rule approach is inefficient for multiple rules. Maybe there is an opportunity for us to combine the best from both of our approaches. Let's continue forward, and then decide where we stand.

@tsanders-rh
Copy link
Contributor Author

tsanders-rh commented Nov 20, 2025

@shawn-hurley - With regards to losing the non-import search capability. Right now I am focused on unlocking the Patternfly migration use case. For PatternFly migration rules specifically:

The vast majority of PatternFly usage follows standard ES6 import patterns, so the import-based strategy should catch 99%+ of cases. The edge cases are:

  • Re-exports: If the application has a pattern of re-exporting PatternFly components through wrapper files.
  • CommonJS requires: Unlikely in a modern React/TypeScript codebase.
  • Dynamic imports: Very rare for components.

My recommendation:

For the PatternFly migration rules, the import-based strategy is sufficient. The edge cases are so rare that:

  1. They likely don't exist in a modern React/TS app using standard imports.
  2. Even if they did exist, they'd represent bad practices worth refactoring anyway.
  3. We can always fall back to builtin.filecontent for specific edge cases if needed.

@tsanders-rh tsanders-rh marked this pull request as draft November 21, 2025 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants