Conversation

@eemcmullan
Contributor

@eemcmullan eemcmullan commented Aug 26, 2025

Summary by CodeRabbit

  • Bug Fixes
    • Corrected artifact path resolution for exploded JARs, ensuring proper local repository layout and copy destinations.
  • Reliability
    • More robust handling of Maven search outages and non-OK responses with clearer errors, reducing noisy failures and stabilizing dependency retrieval.
  • Performance
    • Caches server-side Maven search errors to avoid repeated lookups, reducing unnecessary network calls and speeding up repeated operations.

Signed-off-by: Emily McMullan <[email protected]>
@coderabbitai

coderabbitai bot commented Aug 26, 2025

Walkthrough

Updates in util.go adjust artifact path derivation during JAR explode and add Maven search error caching. constructArtifactFromSHA now short-circuits on cached errors, handles non-200 responses explicitly, and caches 5xx errors. explode now derives artifactPath from the exploded JAR’s base filename rather than splitting dep.ArtifactId.

Changes

Cohort: Maven search robustness and artifact path derivation
File(s): external-providers/java-external-provider/pkg/java_external_provider/util.go
Summary of Changes:
- explode: artifactPath is now derived from the base filename of the exploded JAR (without ".jar"), not from dep.ArtifactId segments
- New package-level mavenSearchErrorCache stores the last Maven 5xx error
- constructArtifactFromSHA: returns the cached error if set; on a non-200 HTTP response, returns an explicit error; caches 5xx errors to avoid repeated lookups
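The status handling summarized above can be sketched in isolation. This is a minimal illustration, not the code in util.go: the names searchErrCache, checkStatus, and lookup are hypothetical, and the remote call is replaced by a function that returns a status code so that the example runs without network access.

```go
package main

import (
	"fmt"
	"net/http"
)

// searchErrCache stands in for the package-level mavenSearchErrorCache
// described above (hypothetical name). It holds the last 5xx error seen.
var searchErrCache error

// checkStatus returns nil for 200 OK and an explicit error otherwise.
// Only server-side (5xx) errors are remembered in searchErrCache.
func checkStatus(code int) error {
	if code == http.StatusOK {
		return nil
	}
	err := fmt.Errorf("Maven search is unavailable: %d %s", code, http.StatusText(code))
	if code >= 500 {
		searchErrCache = err
	}
	return err
}

// lookup simulates one search call. It short-circuits when an error is
// cached and only then falls through to the (simulated) remote call.
func lookup(fetch func() int) error {
	if searchErrCache != nil {
		return searchErrCache
	}
	return checkStatus(fetch())
}

func main() {
	calls := 0
	statuses := []int{http.StatusNotFound, http.StatusServiceUnavailable, http.StatusOK}
	fetch := func() int {
		s := statuses[calls]
		calls++
		return s
	}
	for i := 0; i < 4; i++ {
		fmt.Println(lookup(fetch))
	}
	// The 404 is returned but not cached; the 503 is cached, so the last
	// two lookups never reach the remote side.
	fmt.Println("remote calls:", calls) // remote calls: 2
}
```

Note that in this sketch, as in the summary above, nothing ever clears the cached error, so the first 5xx disables lookups for the rest of the process.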

Sequence Diagram(s)

sequenceDiagram
  participant C as Caller
  participant U as util.constructArtifactFromSHA
  participant Cache as mavenSearchErrorCache
  participant M as Maven Search API

  C->>U: constructArtifactFromSHA(sha)
  U->>Cache: Check cached error
  alt Cached error present
    U-->>C: Return cached error
  else No cached error
    U->>M: HTTP GET /search by SHA
    alt 200 OK
      U-->>C: Parse response, return artifact
    else 5xx Server Error
      U->>Cache: Store error
      U-->>C: Return error (server unavailable)
    else Non-200 (e.g., 4xx)
      U-->>C: Return error (status != 200)
    end
  end

  Note over U: explode() now derives artifactPath from JAR base filename

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I thump my paws on jars that bloom,
A filename path dispels the gloom.
If Maven storms with 5xx rain,
I cache the clouds to skip the pain.
Hop, retry later—clear and bright—
Artifacts aligned, the build takes flight. 🐇✨

@eemcmullan eemcmullan requested a review from jmle August 26, 2025 19:54
@eemcmullan eemcmullan added the cherry-pick/release-0.7 label (This PR should be cherry-picked to release-0.7 branch) Aug 26, 2025
@eemcmullan eemcmullan changed the title from "Cache Maven Search Err & Fix artifactID" to "🐛 Cache Maven Search Err & Fix artifactID" Aug 26, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
external-providers/java-external-provider/pkg/java_external_provider/util.go (2)

409-418: Fix m2 layout: artifact directory should use dep.ArtifactId, not the JAR filename.

Using the exploded JAR’s base name (minus “.jar”) as artifactPath produces a non-standard Maven local-repo layout (e.g., artifact dir becomes foo-1.2.3 instead of foo). This breaks downstream resolution and can cause duplicate directories. Keep the dest filename as-is if you don’t know classifier, but the artifact directory must be dep.ArtifactId.

Apply this diff:

-				artifactPath, _ := strings.CutSuffix(filepath.Base(explodedFilePath), ".jar")
+				artifactPath := dep.ArtifactId
 				destPath := filepath.Join(m2Repo, groupPath, artifactPath,
 					dep.Version, filepath.Base(explodedFilePath))

556-580: Make the Maven-search error cache concurrency-safe and time-bound.

  • Data race: mavenSearchErrorCache is a package-level var read/written from potentially concurrent goroutines (multiple explode/decompile paths). Guard it with a mutex.
  • Poisoning: a single transient 5xx permanently disables lookups for the lifetime of the process. Add a short TTL (e.g., 5 minutes).
  • Backoff: also treat 429 (rate limit) as cacheable with TTL.

Proposed patch:

- var mavenSearchErrorCache error
+ var (
+ 	mavenSearchErrorMu     sync.RWMutex
+ 	mavenSearchErrorCache  error
+ 	mavenSearchErrorExpiry time.Time
+ )

@@
-	// if maven search is down, we do not want to keep trying on each dep
-	if mavenSearchErrorCache != nil {
-		log.Info("maven search is down, returning cached error", "error", mavenSearchErrorCache)
-		return dep, mavenSearchErrorCache
-	}
+	// if maven search is temporarily down, avoid repeated attempts within TTL
+	mavenSearchErrorMu.RLock()
+	cachedErr := mavenSearchErrorCache
+	expiry := mavenSearchErrorExpiry
+	mavenSearchErrorMu.RUnlock()
+	if cachedErr != nil && time.Now().Before(expiry) {
+		log.Info("maven search temporarily unavailable; returning cached error", "error", cachedErr, "expiresAt", expiry)
+		return dep, cachedErr
+	}

@@
-	if resp.StatusCode != http.StatusOK {
-		statusErr := fmt.Errorf("Maven search is unavailable: %s", resp.Status)
-		// cache the server errors
-		if resp.StatusCode >= 500 {
-			mavenSearchErrorCache = statusErr
-		}
-		return dep, statusErr
-	}
+	if resp.StatusCode != http.StatusOK {
+		statusErr := fmt.Errorf("Maven search is unavailable: %s", resp.Status)
+		// cache transient errors for a short period
+		if resp.StatusCode >= 500 || resp.StatusCode == http.StatusTooManyRequests {
+			mavenSearchErrorMu.Lock()
+			mavenSearchErrorCache = statusErr
+			mavenSearchErrorExpiry = time.Now().Add(5 * time.Minute)
+			mavenSearchErrorMu.Unlock()
+		}
+		return dep, statusErr
+	}

Note: This change stays consistent with the maintainers’ preference to continue even when Maven lookup fails; toDependency already falls back to POM/structure, and we’re only avoiding repeated remote calls.

Also applies to: 589-597

🧹 Nitpick comments (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go (1)

581-587: Use an http.Client with timeout (and consider context) to avoid hanging on network calls.

http.Get has no timeout and isn’t cancellable via your outer context. Create a package-level http.Client with a sane timeout (e.g., 10s) and use it here. If you’re open to a small signature change later, threading a context.Context into constructArtifactFromSHA is ideal.

Minimal tweak:

+var httpClient = &http.Client{Timeout: 10 * time.Second}
@@
-	resp, err := http.Get(searchURL)
+	resp, err := httpClient.Get(searchURL)

I can also follow up with a refactor that threads context and makes the base URL injectable for unit tests (httptest).

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between feb8b99 and 9705675.

📒 Files selected for processing (1)
  • external-providers/java-external-provider/pkg/java_external_provider/util.go (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-30T12:11:45.673Z
Learnt from: pranavgaikwad
PR: konveyor/analyzer-lsp#859
File: external-providers/java-external-provider/pkg/java_external_provider/dependency.go:694-694
Timestamp: 2025-07-30T12:11:45.673Z
Learning: In the Java external provider dependency walker (external-providers/java-external-provider/pkg/java_external_provider/dependency.go), errors from toDependency function calls should be ignored as they are not considered important by the maintainers.

Applied to files:

  • external-providers/java-external-provider/pkg/java_external_provider/util.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test (windows-latest)
  • GitHub Check: benchmark (macos-latest, mac)
  • GitHub Check: benchmark (ubuntu-latest, linux)
  • GitHub Check: benchmark (windows-latest, windows)
  • GitHub Check: test

Member

@aufi aufi left a comment


I can understand that this looks a little risky, but it makes sense IMO. When there is a 5xx HTTP failure during a given analysis run (which could also be a 504 Gateway Timeout), Maven Central lookups will be skipped.

return dep, err
}

var mavenSearchErrorCache error
Contributor


Add this as something to take care of in a rewrite. I think this will be fine for now, but I kind of agree with CodeRabbit that we might want to do something different in the future.

@eemcmullan eemcmullan merged commit 8adf035 into konveyor:main Aug 27, 2025
19 of 22 checks passed
github-actions bot pushed a commit that referenced this pull request Aug 27, 2025

Signed-off-by: Emily McMullan <[email protected]>
Signed-off-by: Cherry Picker <[email protected]>