
Conversation


@tsanders-rh tsanders-rh commented Nov 17, 2025

Summary

Eliminates dual bottlenecks in TypeScript/JavaScript file analysis that caused 20-25 minute delays on large codebases.

Fixes #969

Problem

The NodeJS provider had two major bottlenecks:

  1. Sleep bottleneck: 2-second sleep before EACH file

    • 582 files × 2s = 1,164 seconds (~19 minutes) of pure waiting
  2. Query bottleneck: GetAllDeclarations called 18 times

    • Once per 32-file batch instead of once total
    • Each call queries the entire workspace for symbols (expensive operation)

Solution

Modified external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go:

Before:

  • Batch processing (32 files at a time)
  • 2-second sleep before each didOpen notification
  • GetAllDeclarations called after each batch (18 times total)

After:

  • Open all files in single loop (no per-file sleep)
  • Single 500ms sleep after all didOpen notifications sent
  • Call GetAllDeclarations once after all files are indexed
  • Simplified didClose loop

The single sleep after all files prevents the race condition (symbol queries happening before LSP server finishes indexing) without the massive overhead of sleeping before each file.

Performance Results

Test environment: tackle2-ui (582 TypeScript files, 56 PatternFly rules)

| Metric | Before | After | Improvement |
|---|---|---|---|
| Total time | 20-25 minutes | 1m 43s | 91% faster |
| Actual time | 1200-1500s | 103 seconds | 12x speedup |
| Sleep overhead | 1,164s | 0.5s | 99.96% reduction |
| Workspace queries | 18 calls | 1 call | 94% reduction |

Testing

  • ✅ Output file generated successfully (213 KB, 2,982 lines)
  • ✅ Violations detected correctly (PatternFly migration issues)
  • ✅ No errors or warnings in execution
  • ✅ Tested on both small and large TypeScript codebases

Additional Notes

This fix is even better than the original proposal because it eliminates both the sleep bottleneck AND the redundant workspace queries.

Summary by CodeRabbit

  • Bug Fixes
    • Optimized file processing flow for improved performance and reliability by consolidating file operations into a streamlined single-pass approach with coordinated resource management.

Eliminates dual bottlenecks in TypeScript/JavaScript file analysis:

**Problem:**
1. 2-second sleep before EACH file (582 files × 2s = 1,164s wasted)
2. GetAllDeclarations called 18 times (once per 32-file batch)

**Solution:**
- Open all files in single loop (no per-file sleep)
- Sleep 500ms once after all didOpen notifications
- Call GetAllDeclarations once after all files indexed
- Simplified didClose loop

**Performance Results:**
Test: tackle2-ui (582 TypeScript files, 56 rules)
- Before: 20-25 minutes (1200-1500s)
- After: 1m 43s (103s)
- Improvement: 91% faster (12x speedup)

The single sleep after all files prevents race condition where symbol
queries happen before LSP server finishes indexing, without the massive
overhead of sleeping before each file.

Signed-off-by: tsanders <[email protected]>

coderabbitai bot commented Nov 17, 2025

Walkthrough

The NodeJS provider's file processing workflow is refactored to eliminate per-file 2-second sleeps. Instead of batching files incrementally with sleep-per-file, the system now opens all files in a single pass, applies one 500ms delay, queries symbols once, and closes all files in one consolidated pass.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| NodeJS Service Client Control Flow<br>`external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go` | Consolidated file processing and symbol querying: removed the per-file 2-second sleep before each didOpen notification, replaced it with a single 500ms sleep after all files are opened, collapsed the symbol query into one operation, and unified file cleanup into a single pass |

Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant OldFlow as Old Flow (Per-File)
    participant NewFlow as New Flow (Batch)

    rect rgba(200, 100, 100, 0.2)
        Note over OldFlow: ❌ Incremental Batching
        loop For each file in batch
            OldFlow->>OldFlow: Sleep 2s
            OldFlow->>OldFlow: didOpen(file)
            OldFlow->>OldFlow: Query symbols
            OldFlow->>OldFlow: Close file
        end
        Note over OldFlow: 582 files × 2s = ~19 min
    end

    rect rgba(100, 200, 100, 0.2)
        Note over NewFlow: ✅ Batch Processing
        NewFlow->>NewFlow: Open all files
        NewFlow->>NewFlow: Sleep 500ms (once)
        NewFlow->>NewFlow: Query all symbols
        NewFlow->>NewFlow: Evaluate
        NewFlow->>NewFlow: Close all files
        Note over NewFlow: ~0.5s total sleep overhead
    end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Control flow correctness: Verify that batching logic preserves the order of operations and doesn't introduce race conditions
  • Sleep duration validation: Confirm 500ms is sufficient for LSP server processing without missing incident detection
  • Symbol query timing: Ensure single symbol query after batch completes captures all declarations correctly
  • File lifecycle: Validate that consolidated cleanup properly closes all opened file handles

Poem

🐰 Once files slept two seconds each, oh what a costly spree,
But now we batch them smartly, and sleep just 500ms, you see!
From twenty minutes drowsy to two minutes bright and spry,
The nodejs provider hops with glee as performance soars up high! ✨

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: optimizing NodeJS provider file processing by eliminating sleep bottlenecks, and references the linked issue #969. |
| Linked Issues check | ✅ Passed | The PR addresses all coding requirements from issue #969: it eliminates the per-file 2-second sleep, consolidates file opening into a single loop, implements a single 500ms post-indexing sleep, and reduces GetAllDeclarations calls from 18 to 1, achieving a ~91% performance improvement matching the issue's goals. |
| Out of Scope Changes check | ✅ Passed | All changes focus on consolidating file processing flow in service_client.go with no alterations to exported entities or unrelated modifications; changes are directly scoped to resolving the identified sleep bottleneck issue. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage; check skipped. |

@tsanders-rh tsanders-rh changed the title perf: Optimize NodeJS provider file processing (fixes #969) 🐛 perf: Optimize NodeJS provider file processing (fixes #969) Nov 17, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)

218-225: Make the post-open delay ctx-aware and less “magic”

The single 500ms sleep is a big improvement over 2s per file but is still a hard-coded magic value and ignores ctx cancellation. You could keep behavior while improving robustness by using a ctx-aware wait and a named constant:

```diff
-	// Sleep once after all files are opened to allow LSP server to process
-	// all didOpen notifications before querying for symbols.
-	// This prevents the race condition without requiring sleep before each file.
-	time.Sleep(500 * time.Millisecond)
+	const indexWait = 500 * time.Millisecond
+	// Sleep once after all files are opened to allow the LSP server to process
+	// didOpen notifications before querying for symbols.
+	select {
+	case <-time.After(indexWait):
+	case <-ctx.Done():
+		return resp{}, ctx.Err()
+	}
```

Optionally, indexWait could later be made configurable if you see variability across environments.


231-235: Close-all loop is correct; consider future tracking of successfully opened files

Closing all URIs in one pass is simpler and should be fine given the preceding logic. If you ever hit partial failures in the open loop, a small enhancement would be to track which files actually succeeded in didOpen and only close those, but that’s not blocking for this PR.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d44057 and 4290514.

📒 Files selected for processing (1)
  • external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)
provider/provider.go (1)
  • ProviderEvaluateResponse (303-307)
🔇 Additional comments (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)

203-215: Open-all didOpen loop achieves the intended perf improvement

Sequentially reading each file and issuing didOpen once per entry is straightforward and removes the per-file sleep bottleneck while preserving existing error handling. This looks correct and aligned with the PR’s goals.

@shawn-hurley
Contributor

Have we tested this with the language server that is being shipped with the generic provider?

Just to be clear, I am not opposed to merging; I just want to make sure that the kantra/hub results are not impacted.

@shawn-hurley
Contributor

Looks like the demo test does in fact have nodejs references that are still being found, and the test passes. This gives me some comfort that it is not fully breaking those results, so merging now.

@shawn-hurley shawn-hurley merged commit 8c000cb into main Nov 17, 2025
24 of 25 checks passed
tsanders-rh added a commit to tsanders-rh/analyzer-lsp that referenced this pull request Nov 20, 2025
Resolved conflict in nodejs/service_client.go by keeping the import-based
search implementation which supersedes the previous file batching optimization
from PR konveyor#970.

The import-based search provides better semantic accuracy by:
- Finding imports first (regex scan)
- Using LSP definitions to locate symbols
- Using LSP references to find all usages

This is fundamentally different from and more accurate than the workspace/symbol
approach that was being optimized in the upstream changes.

Signed-off-by: Todd Sanders <[email protected]>
Signed-off-by: tsanders <[email protected]>


Development

Successfully merging this pull request may close these issues.

perf: NodeJS provider 2-second sleep causes 20+ minute delays on large codebases

4 participants