🐛 perf: Optimize NodeJS provider file processing (fixes #969) #970
Conversation
Eliminates dual bottlenecks in TypeScript/JavaScript file analysis.

**Problem:**
1. 2-second sleep before EACH file (582 files × 2s = 1,164s wasted)
2. `GetAllDeclarations` called 18 times (once per 32-file batch)

**Solution:**
- Open all files in a single loop (no per-file sleep)
- Sleep 500ms once after all didOpen notifications
- Call `GetAllDeclarations` once after all files are indexed
- Simplified didClose loop

**Performance Results:** Test: tackle2-ui (582 TypeScript files, 56 rules)
- Before: 20-25 minutes (1200-1500s)
- After: 1m 43s (103s)
- Improvement: 91% faster (12x speedup)

The single sleep after all files prevents the race condition where symbol queries happen before the LSP server finishes indexing, without the massive overhead of sleeping before each file.

Signed-off-by: tsanders <[email protected]>
Walkthrough

The NodeJS provider's file processing workflow is refactored to eliminate per-file 2-second sleeps. Instead of batching files incrementally with sleep-per-file, the system now opens all files in a single pass, applies one 500ms delay, queries symbols once, and closes all files in one consolidated pass.
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant OldFlow as Old Flow (Per-File)
    participant NewFlow as New Flow (Batch)
    rect rgba(200, 100, 100, 0.2)
        Note over OldFlow: ❌ Incremental Batching
        loop For each file in batch
            OldFlow->>OldFlow: Sleep 2s
            OldFlow->>OldFlow: didOpen(file)
            OldFlow->>OldFlow: Query symbols
            OldFlow->>OldFlow: Close file
        end
        Note over OldFlow: 582 files × 2s = 19 min
    end
    rect rgba(100, 200, 100, 0.2)
        Note over NewFlow: ✅ Batch Processing
        NewFlow->>NewFlow: Open all files
        NewFlow->>NewFlow: Sleep 500ms (once)
        NewFlow->>NewFlow: Query all symbols
        NewFlow->>NewFlow: Evaluate
        NewFlow->>NewFlow: Close all files
        Note over NewFlow: ~4s total sleep overhead
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (5 passed)
Actionable comments posted: 0
🧹 Nitpick comments (2)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (2)
**218-225: Make the post-open delay ctx-aware and less "magic"**

The single 500ms sleep is a big improvement over 2s per file but is still a hard-coded magic value and ignores `ctx` cancellation. You could keep behavior while improving robustness by using a ctx-aware wait and a named constant:

```diff
-	// Sleep once after all files are opened to allow LSP server to process
-	// all didOpen notifications before querying for symbols.
-	// This prevents the race condition without requiring sleep before each file.
-	time.Sleep(500 * time.Millisecond)
+	const indexWait = 500 * time.Millisecond
+	// Sleep once after all files are opened to allow the LSP server to process
+	// didOpen notifications before querying for symbols.
+	select {
+	case <-time.After(indexWait):
+	case <-ctx.Done():
+		return resp{}, ctx.Err()
+	}
```

Optionally, `indexWait` could later be made configurable if you see variability across environments.
**231-235: Close-all loop is correct; consider future tracking of successfully opened files**

Closing all URIs in one pass is simpler and should be fine given the preceding logic. If you ever hit partial failures in the open loop, a small enhancement would be to track which files actually succeeded in `didOpen` and only close those, but that's not blocking for this PR.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
`external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)
provider/provider.go (1)
`ProviderEvaluateResponse` (303-307)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test
- GitHub Check: test (windows-latest, windows, amd64)
- GitHub Check: test (macos-latest, darwin, arm64)
- GitHub Check: test (macos-latest, darwin, amd64)
- GitHub Check: test (ubuntu-latest, linux, amd64)
- GitHub Check: test (ubuntu-latest, linux, arm64)
- GitHub Check: benchmark (macos-latest, mac)
- GitHub Check: benchmark (windows-latest, windows)
- GitHub Check: benchmark (ubuntu-latest, linux)
- GitHub Check: test (windows-latest, windows, amd64)
- GitHub Check: test (ubuntu-latest, linux, arm64)
- GitHub Check: test (ubuntu-latest, linux, amd64)
- GitHub Check: test (macos-latest, darwin, amd64)
- GitHub Check: test (macos-latest, darwin, arm64)
🔇 Additional comments (1)
external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go (1)
**203-215: Open-all didOpen loop achieves the intended perf improvement**

Sequentially reading each file and issuing `didOpen` once per entry is straightforward and removes the per-file sleep bottleneck while preserving existing error handling. This looks correct and aligned with the PR's goals.
Have we tested this with the language server that is being shipped with the generic provider? Just to be clear, I am not opposed to merging; I just want to make sure that the kantra/hub results are not impacted.
Looks like the demo test does in fact have nodejs references that are still being found and passing the test. This gives me some comfort that it is not fully breaking those results, so merging now.
Resolved conflict in nodejs/service_client.go by keeping the import-based search implementation, which supersedes the previous file batching optimization from PR konveyor#970. The import-based search provides better semantic accuracy by:

- Finding imports first (regex scan)
- Using LSP definitions to locate symbols
- Using LSP references to find all usages

This is fundamentally different from and more accurate than the workspace/symbol approach that was being optimized in the upstream changes.

Signed-off-by: Todd Sanders <[email protected]>
Signed-off-by: tsanders <[email protected]>
Summary
Eliminates dual bottlenecks in TypeScript/JavaScript file analysis that caused 20-25 minute delays on large codebases.
Fixes #969
Problem
The NodeJS provider had two major bottlenecks:
1. **Sleep bottleneck:** a 2-second sleep before EACH file
2. **Query bottleneck:** `GetAllDeclarations` called 18 times

Solution

Modified `external-providers/generic-external-provider/pkg/server_configurations/nodejs/service_client.go`:

**Before:**
- 2-second sleep before each `didOpen` notification
- `GetAllDeclarations` called after each batch (18 times total)

**After:**
- All `didOpen` notifications sent in a single loop (no per-file sleep)
- Single 500ms sleep once after all notifications
- `GetAllDeclarations` called once after all files are indexed
- Simplified `didClose` loop

The single sleep after all files prevents the race condition (symbol queries happening before the LSP server finishes indexing) without the massive overhead of sleeping before each file.
Performance Results
Test environment: tackle2-ui (582 TypeScript files, 56 PatternFly rules)
Testing
Additional Notes
This fix is even better than the original proposal because it eliminates both the sleep bottleneck AND the redundant workspace queries.