-
Notifications
You must be signed in to change notification settings - Fork 53
🐛 Cache Maven Search Err & Fix artifactID #893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Emily McMullan <[email protected]>
WalkthroughUpdates in util.go adjust artifact path derivation during JAR explode and add Maven search error caching. constructArtifactFromSHA now short-circuits on cached errors, handles non-200 responses explicitly, and caches 5xx errors. explode now derives artifactPath from the exploded JAR’s base filename rather than splitting dep.ArtifactId. Changes
Sequence Diagram(s)sequenceDiagram
participant C as Caller
participant U as util.constructArtifactFromSHA
participant Cache as mavenSearchErrorCache
participant M as Maven Search API
C->>U: constructArtifactFromSHA(sha)
U->>Cache: Check cached error
alt Cached error present
U-->>C: Return cached error
else No cached error
U->>M: HTTP GET /search by SHA
alt 200 OK
U-->>C: Parse response, return artifact
else 5xx Server Error
U->>Cache: Store error
U-->>C: Return error (server unavailable)
else Non-200 (e.g., 4xx)
U-->>C: Return error (status != 200)
end
end
Note over U: explode() now derives artifactPath from JAR base filename
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
external-providers/java-external-provider/pkg/java_external_provider/util.go (2)
409-418: Fix m2 layout: artifact directory should use dep.ArtifactId, not the JAR filename.Using the exploded JAR’s base name (minus “.jar”) as
artifactPathproduces a non-standard Maven local-repo layout (e.g., artifact dir becomesfoo-1.2.3instead offoo). This breaks downstream resolution and can cause duplicate directories. Keep the dest filename as-is if you don’t know classifier, but the artifact directory must bedep.ArtifactId.Apply this diff:
- artifactPath, _ := strings.CutSuffix(filepath.Base(explodedFilePath), ".jar") + artifactPath := dep.ArtifactId destPath := filepath.Join(m2Repo, groupPath, artifactPath, dep.Version, filepath.Base(explodedFilePath))
556-580: Make the Maven-search error cache concurrency-safe and time-bound.
- Data race:
mavenSearchErrorCacheis a package-level var read/written from potentially concurrent goroutines (multiple explode/decompile paths). Guard it with a mutex.- Poisoning: a single transient 5xx permanently disables lookups for the lifetime of the process. Add a short TTL (e.g., 5 minutes).
- Backoff: also treat 429 (rate limit) as cacheable with TTL.
Proposed patch:
- var mavenSearchErrorCache error + var ( + mavenSearchErrorMu sync.RWMutex + mavenSearchErrorCache error + mavenSearchErrorExpiry time.Time + ) @@ - // if maven search is down, we do not want to keep trying on each dep - if mavenSearchErrorCache != nil { - log.Info("maven search is down, returning cached error", "error", mavenSearchErrorCache) - return dep, mavenSearchErrorCache - } + // if maven search is temporarily down, avoid repeated attempts within TTL + mavenSearchErrorMu.RLock() + cachedErr := mavenSearchErrorCache + expiry := mavenSearchErrorExpiry + mavenSearchErrorMu.RUnlock() + if cachedErr != nil && time.Now().Before(expiry) { + log.Info("maven search temporarily unavailable; returning cached error", "error", cachedErr, "expiresAt", expiry) + return dep, cachedErr + } @@ - if resp.StatusCode != http.StatusOK { - statusErr := fmt.Errorf("Maven search is unavailable: %s", resp.Status) - // cache the server errors - if resp.StatusCode >= 500 { - mavenSearchErrorCache = statusErr - } - return dep, statusErr - } + if resp.StatusCode != http.StatusOK { + statusErr := fmt.Errorf("Maven search is unavailable: %s", resp.Status) + // cache transient errors for a short period + if resp.StatusCode >= 500 || resp.StatusCode == http.StatusTooManyRequests { + mavenSearchErrorMu.Lock() + mavenSearchErrorCache = statusErr + mavenSearchErrorExpiry = time.Now().Add(5 * time.Minute) + mavenSearchErrorMu.Unlock() + } + return dep, statusErr + }Note: This change stays consistent with the maintainers’ preference to continue even when Maven lookup fails;
toDependencyalready falls back to POM/structure, and we’re only avoiding repeated remote calls.Also applies to: 589-597
🧹 Nitpick comments (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go (1)
581-587: Use an http.Client with timeout (and consider context) to avoid hanging on network calls.
http.Gethas no timeout and isn’t cancellable via your outer context. Create a package-levelhttp.Clientwith a sane timeout (e.g., 10s) and use it here. If you’re open to a small signature change later, threading acontext.ContextintoconstructArtifactFromSHAis ideal.Minimal tweak:
+var httpClient = &http.Client{Timeout: 10 * time.Second} @@ - resp, err := http.Get(searchURL) + resp, err := httpClient.Get(searchURL)I can also follow up with a refactor that threads context and makes the base URL injectable for unit tests (httptest).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-30T12:11:45.673Z
Learnt from: pranavgaikwad
PR: konveyor/analyzer-lsp#859
File: external-providers/java-external-provider/pkg/java_external_provider/dependency.go:694-694
Timestamp: 2025-07-30T12:11:45.673Z
Learning: In the Java external provider dependency walker (external-providers/java-external-provider/pkg/java_external_provider/dependency.go), errors from toDependency function calls should be ignored as they are not considered important by the maintainers.
Applied to files:
external-providers/java-external-provider/pkg/java_external_provider/util.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: test (windows-latest)
- GitHub Check: benchmark (macos-latest, mac)
- GitHub Check: benchmark (ubuntu-latest, linux)
- GitHub Check: benchmark (windows-latest, windows)
- GitHub Check: test
external-providers/java-external-provider/pkg/java_external_provider/util.go
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can understand this looks little risky, but makes sense IMO. When there is a 5xx HTTP failure on given analysis run (which could be also 504 gateway timeout), maven central will be set to be skipped.
| return dep, err | ||
| } | ||
|
|
||
| var mavenSearchErrorCache error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this as something to take care of in a rewrite. I think this will be fine for now but I kind of agree with Code Rabbit and that we might want to do something different in the future
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Bug Fixes - Corrected artifact path resolution for exploded JARs, ensuring proper local repository layout and copy destinations. - Reliability - More robust handling of Maven search outages and non-OK responses with clearer errors, reducing noisy failures and stabilizing dependency retrieval. - Performance - Caches server-side Maven search errors to avoid repeated lookups, reducing unnecessary network calls and speeding up repeated operations. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Emily McMullan <[email protected]> Signed-off-by: Cherry Picker <[email protected]>
Summary by CodeRabbit