-
Notifications
You must be signed in to change notification settings - Fork 53
🐛 Set foundOnline also when pom is found in dependency #880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We were setting the foundOnline property of the struct only when building the artifact from inferring the structure of the JAR, but not when doing so through the pom, therefore setting a lot of libraries as internal, even though later in the execution they were tagged as open source in a different part of the code Signed-off-by: Juan Manuel Leflet Estrada <[email protected]>
WalkthroughPre-marks dependencies as online when a GroupId wildcard key exists in depToLabels and propagates that into addDepLabels; enriches artifacts discovered from jar structure with GroupId, ArtifactId, Version="Unknown" and marks them foundOnline; adds a log message when Maven search is disabled; stores disableMavenSearch on the javaServiceClient. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant constructArtifactFromPom as constructArtifactFromPom
participant addDepLabels as addDepLabels
Caller->>constructArtifactFromPom: process dependency (dep, depToLabels)
constructArtifactFromPom->>constructArtifactFromPom: compute depName and groupIdRegex (GroupId.*)
alt depToLabels contains groupIdRegex
constructArtifactFromPom->>constructArtifactFromPom: dep.foundOnline = true
else
constructArtifactFromPom->>constructArtifactFromPom: dep.foundOnline remains false
end
constructArtifactFromPom->>addDepLabels: addDepLabels(depToLabels, depName, dep.foundOnline)
addDepLabels-->>constructArtifactFromPom: labels
constructArtifactFromPom-->>Caller: artifact with labels
sequenceDiagram
participant Caller
participant constructArtifactFromStructure as constructArtifactFromStructure
Caller->>constructArtifactFromStructure: inspect jar filesystem/structure
constructArtifactFromStructure->>constructArtifactFromStructure: detect public groupId via regex
alt public groupId matched
constructArtifactFromStructure->>constructArtifactFromStructure: set artifact.GroupId, ArtifactId (basename), Version="Unknown"
constructArtifactFromStructure->>constructArtifactFromStructure: artifact.foundOnline = true
constructArtifactFromStructure-->>Caller: return populated artifact
else
constructArtifactFromStructure-->>Caller: continue existing discovery flow
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go (1)
654-667: Handle wildcard/prefix keys and avoid leaking foundOnline=true on internal labels
- Exact key check depToLabels[dep.GroupId] will miss wildcard entries like "com.foo.*" (see Lines 686-701 where wildcard/prefix matching is used). This weakens the intended fix for many groups.
- If an internal label is returned by addDepLabels, foundOnline should be explicitly reset to false; otherwise the pre-set true may persist.
Apply this diff to make the pre-check robust and to ensure internal labels override the pre-set value. Also improves readability by avoiding the l variable shadowing.
- if depToLabels[dep.GroupId] != nil { - dep.foundOnline = true - } - l := addDepLabels(depToLabels, depName, dep.foundOnline) - for _, l := range l { - if l == labels.AsString(provider.DepSourceLabel, javaDepSourceOpenSource) { - // Setting here to make things easier. - dep.foundOnline = true - break - } - if l == labels.AsString(provider.DepSourceLabel, javaDepSourceInternal) { - break - } - } + if isPublicGroup(depToLabels, dep.GroupId) { + dep.foundOnline = true + } + labelsList := addDepLabels(depToLabels, depName, dep.foundOnline) + for _, lbl := range labelsList { + if lbl == labels.AsString(provider.DepSourceLabel, javaDepSourceOpenSource) { + dep.foundOnline = true + break + } + if lbl == labels.AsString(provider.DepSourceLabel, javaDepSourceInternal) { + // Ensure internal label negates any optimistic pre-set + dep.foundOnline = false + break + } + }Add this helper outside the function (e.g., near related utilities). It mirrors the prefix/wildcard lookup used in constructArtifactFromStructure:
// isPublicGroup returns true if depToLabels contains an entry that matches the groupId, // supporting exact match, "groupId.*", and progressive prefix reductions. func isPublicGroup(depToLabels map[string]*depLabelItem, groupId string) bool { if groupId == "" || depToLabels == nil { return false } if depToLabels[groupId] != nil { return true } sgmts := strings.Split(groupId, ".") for len(sgmts) > 0 { candidate := strings.Join(append(sgmts, "*"), ".") if depToLabels[candidate] != nil { return true } sgmts = sgmts[:len(sgmts)-1] } return false }
🧹 Nitpick comments (2)
external-providers/java-external-provider/pkg/java_external_provider/util.go (2)
641-651: Check scanner.Err() after scanning pom.propertiesScanner errors (e.g., truncated ZIP entry) are dropped. Add an error check to avoid silently returning partial coordinates.
for scanner.Scan() { line := scanner.Text() if strings.HasPrefix(line, "version=") { dep.Version = strings.TrimSpace(strings.TrimPrefix(line, "version=")) } else if strings.HasPrefix(line, "artifactId=") { dep.ArtifactId = strings.TrimSpace(strings.TrimPrefix(line, "artifactId=")) } else if strings.HasPrefix(line, "groupId=") { dep.GroupId = strings.TrimSpace(strings.TrimPrefix(line, "groupId=")) } } + if err := scanner.Err(); err != nil { + return dep, err + }
702-703: Fix logging message formattingfmt.Sprintf is passed jarFile without a %s placeholder; the file name never appears in the log message.
- log.V(10).Info(fmt.Sprintf("could not find groupId for in our public listing of group id's", jarFile)) + log.V(10).Info(fmt.Sprintf("could not find groupId in our public listing of group ids: %s", jarFile))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: test
- GitHub Check: benchmark (ubuntu-latest, linux)
- GitHub Check: benchmark (windows-latest, windows)
- GitHub Check: benchmark (macos-latest, mac)
- GitHub Check: test (windows-latest)
Signed-off-by: Juan Manuel Leflet Estrada <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go (1)
659-659: Avoid shadowing by reusing the name “l” for both the slice and loop variableMinor readability nit:
l := addDepLabels(...)followed byfor _, l := range lshadows the slice variable with the loop variable. Rename one of them (e.g.,labels := addDepLabels(...)andfor _, lbl := range labels).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: test
- GitHub Check: benchmark (macos-latest, mac)
- GitHub Check: benchmark (windows-latest, windows)
- GitHub Check: test (windows-latest)
- GitHub Check: test (macos-latest)
Signed-off-by: Juan Manuel Leflet Estrada <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go (1)
654-658: Guard against empty groupId and use comma-ok map check (prevents ".*" wildcard false positives)If groupId is empty, the current code produces ".*" and may incorrectly set foundOnline when depToLabels has a catch‑all. Also prefer the map presence check.
Apply this diff:
- groupIdRegex := strings.Join([]string{dep.GroupId, "*"}, ".") - if depToLabels[groupIdRegex] != nil { - dep.foundOnline = true - } + if dep.GroupId != "" { + groupIdRegex := strings.Join([]string{dep.GroupId, "*"}, ".") + if _, ok := depToLabels[groupIdRegex]; ok { + dep.foundOnline = true + } + } l := addDepLabels(depToLabels, depName, dep.foundOnline)
🧹 Nitpick comments (2)
external-providers/java-external-provider/pkg/java_external_provider/util.go (2)
658-669: Avoid variable shadowing for readability in label iterationThe slice variable l is immediately shadowed by the range variable l. Rename for clarity.
- l := addDepLabels(depToLabels, depName, dep.foundOnline) - for _, l := range l { - if l == labels.AsString(provider.DepSourceLabel, javaDepSourceOpenSource) { + labelsList := addDepLabels(depToLabels, depName, dep.foundOnline) + for _, lab := range labelsList { + if lab == labels.AsString(provider.DepSourceLabel, javaDepSourceOpenSource) { // Setting here to make things easier. dep.foundOnline = true break } - if l == labels.AsString(provider.DepSourceLabel, javaDepSourceInternal) { + if lab == labels.AsString(provider.DepSourceLabel, javaDepSourceInternal) { break } }
705-706: Fix log message formattingjarFile is ignored because the format string lacks a placeholder.
Replace with:
log.V(10).Info(fmt.Sprintf("could not find groupId for %s in our public listing of group id's", jarFile))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test
- GitHub Check: benchmark (windows-latest, windows)
- GitHub Check: benchmark (macos-latest, mac)
- GitHub Check: test (windows-latest)
🔇 Additional comments (1)
external-providers/java-external-provider/pkg/java_external_provider/util.go (1)
689-703: Guard empty groupId to avoid ".*" wildcard matches; simplify control flowDon't build a "*. " pattern when groupId is empty and prefer a map-presence check; move the segment-truncation out of the else to simplify flow.
- File: external-providers/java-external-provider/pkg/java_external_provider/util.go — lines 689–703
- // check against depToLabels. add * - groupIdRegex := strings.Join([]string{groupId, "*"}, ".") - if depToLabels[groupIdRegex] != nil { - log.V(10).Info(fmt.Sprintf("%s is a public dependency with a group id of: %s", jarFile, groupId)) - // do a best effort to set some dependency data - artifact.GroupId = groupId - artifact.ArtifactId = strings.TrimSuffix(filepath.Base(jarFile), ".jar") - // Adding this back to make some things easier. - artifact.foundOnline = true - return artifact, nil - } else { - // lets try to remove one segment from the end - sgmts = sgmts[:len(sgmts)-1] - groupId = strings.Join(sgmts, ".") - groupIdRegex = strings.Join([]string{groupId, "*"}, ".") - } + // check against depToLabels. add * + groupIdRegex := strings.Join([]string{groupId, "*"}, ".") + if groupId != "" { + if _, ok := depToLabels[groupIdRegex]; ok { + log.V(10).Info(fmt.Sprintf("%s is a public dependency with a group id of: %s", jarFile, groupId)) + // do a best effort to set some dependency data + artifact.GroupId = groupId + artifact.ArtifactId = strings.TrimSuffix(filepath.Base(jarFile), ".jar") + // Adding this back to make some things easier. + artifact.foundOnline = true + return artifact, nil + } + } + // lets try to remove one segment from the end + sgmts = sgmts[:len(sgmts)-1] + groupId = strings.Join(sgmts, ".") + groupIdRegex = strings.Join([]string{groupId, "*"}, ".")Note: artifact.Version remains empty here, so isValid() may be false and explode() could drop foundOnline. If the intent is to treat this as a "public but incomplete" detection (e.g., to skip decompilation or mark as public), confirm and either populate Version or adjust explode() to respect foundOnline (or add an explicit "public-but-incomplete" flag).
| if depToLabels[groupIdRegex] != nil { | ||
| dep.foundOnline = true | ||
| } | ||
| l := addDepLabels(depToLabels, depName, dep.foundOnline) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Confirm addDepLabels(bool) semantics; fix to avoid returning both internal and open-source labels
Passing dep.foundOnline into addDepLabels changes label resolution. Today, addDepLabels can build its return slice before applying internal/open-source reconciliation, which may return both labels. Ensure it updates the matched-label map first, then builds the slice so source labels are mutually exclusive.
Run this to inspect the function and call sites:
Suggested fix for addDepLabels (in dependency.go): apply source-label decision on the map first, then build the slice; append language label last.
// Pseudocode outline
func addDepLabels(depToLabels map[string]*depLabelItem, depName string, openSource bool) []string {
m := matchLabels(depToLabels, depName) // existing logic that builds a map[string]struct{} of matched labels
open := labels.AsString(provider.DepSourceLabel, javaDepSourceOpenSource)
internal := labels.AsString(provider.DepSourceLabel, javaDepSourceInternal)
// Reconcile source labels first based on openSource flag
if openSource {
m[open] = struct{}{}
delete(m, internal)
} else {
// prefer internal unless caller already marked as open explicitly elsewhere
if _, hasOpen := m[open]; !hasOpen {
m[internal] = struct{}{}
}
}
// Now construct the slice from m to avoid returning both
s := make([]string, 0, len(m)+1)
for k := range m {
s = append(s, k)
}
// Append language label last (existing behavior)
s = append(s, labels.AsString(provider.LanguageLabel, "java"))
return s
}If you prefer, I can submit the exact patch once I see the current addDepLabels body.
🏁 Script executed:
#!/bin/bash
# Show definition and callers
rg -n --no-heading 'func\s+addDepLabels\s*\(' -A 80
rg -n --no-heading '\baddDepLabels\s*\(' -A 2Length of output: 12163
Fix addDepLabels to reconcile source labels before building the result slice
addDepLabels currently builds the slice of labels from the matched-label map before reconciling open-source vs internal source labels, so a dependency can end up with both labels. Reconcile the source labels on the map first, then build the slice (language label last).
Files to update:
- external-providers/java-external-provider/pkg/java_external_provider/dependency.go — func addDepLabels (around line 872)
Call sites to be aware of (no change required there, just for context): - external-providers/java-external-provider/pkg/java_external_provider/util.go:658
- external-providers/java-external-provider/pkg/java_external_provider/dependency.go:700, 733, 818
Suggested replacement for addDepLabels:
func addDepLabels(depToLabels map[string]*depLabelItem, depName string, openSource bool) []string {
m := map[string]interface{}{}
for _, d := range depToLabels {
if d.r.Match([]byte(depName)) {
for label := range d.labels {
m[label] = nil
}
}
}
openLabel := labels.AsString(provider.DepSourceLabel, javaDepSourceOpenSource)
internalLabel := labels.AsString(provider.DepSourceLabel, javaDepSourceInternal)
_, openSourceLabelFound := m[openLabel]
_, internalSourceLabelFound := m[internalLabel]
// Reconcile source labels on the map first so they remain mutually exclusive
if openSourceLabelFound || openSource {
m[openLabel] = nil
delete(m, internalLabel)
} else {
if !internalSourceLabelFound {
m[internalLabel] = nil
}
}
// Now build the slice from the reconciled map
s := make([]string, 0, len(m)+1)
for k := range m {
s = append(s, k)
}
// Append language label last (existing behavior)
s = append(s, labels.AsString(provider.DepLanguageLabel, "java"))
return s
}🤖 Prompt for AI Agents
In
external-providers/java-external-provider/pkg/java_external_provider/dependency.go
around line 872, addDepLabels currently builds the result slice from the
matched-label map before reconciling open-source vs internal source labels
causing a dependency to sometimes have both source labels; change the function
to first collect matching labels into a temporary map, check for the open and
internal source labels and reconcile them so they are mutually exclusive (prefer
open if present or if openSource is true, otherwise ensure internal is present),
then build the slice from the reconciled map and append the language label
("java") last.
Signed-off-by: Juan Manuel Leflet Estrada <[email protected]>
Signed-off-by: Emily McMullan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (2)
external-providers/java-external-provider/pkg/java_external_provider/provider.go (2)
233-241: Initialize disableMavenSearch for RPC-based clients as wellIn the RPC-initialization path, the javaServiceClient is created without disableMavenSearch set. If util.go/other logic relies on s.disableMavenSearch for behavior/logging, RPC-based clients will silently behave as if the flag is false even when configured.
Proposed change:
- if config.RPC != nil { - return &javaServiceClient{ - rpc: config.RPC, - config: config, - log: log, - depsLocationCache: make(map[string]int), - includedPaths: provider.GetIncludedPathsFromConfig(config, false), - }, provider.InitConfig{}, nil - } + if config.RPC != nil { + disableMavenSearch, _ := config.ProviderSpecificConfig[DISABLE_MAVEN_SEARCH].(bool) + return &javaServiceClient{ + rpc: config.RPC, + config: config, + log: log, + depsLocationCache: make(map[string]int), + includedPaths: provider.GetIncludedPathsFromConfig(config, false), + disableMavenSearch: disableMavenSearch, + }, provider.InitConfig{}, nil + }
646-649: Empty error message when JAVA8_HOME is unsetReturning fmt.Errorf("") loses valuable context and makes troubleshooting difficult.
Suggested fix:
- if java8home == "" { - return fmt.Errorf("") - } + if java8home == "" { + return fmt.Errorf("JAVA8_HOME is not set; Gradle wrapper requires Java 8 to resolve sources") + }
🧹 Nitpick comments (2)
external-providers/java-external-provider/pkg/java_external_provider/provider.go (2)
153-153: Typo in log message"goinag" -> "going".
- p.Log.Error(err, "this is not goinag to be cool if it fails") + p.Log.Error(err, "this is not going to be cool if it fails")
497-501: Typo in comment"implment" -> "implement".
- // We need the read to be at the same level of selection to fully implment graceful shutdown + // We need the read to be at the same level of selection to fully implement graceful shutdown
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (2)
external-providers/java-external-provider/pkg/java_external_provider/provider.go(1 hunks)external-providers/java-external-provider/pkg/java_external_provider/util.go(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- external-providers/java-external-provider/pkg/java_external_provider/util.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
external-providers/java-external-provider/pkg/java_external_provider/provider.go (1)
provider/lib.go (1)
GetIncludedPathsFromConfig(380-403)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test
- GitHub Check: benchmark (windows-latest, windows)
- GitHub Check: test (windows-latest)
- GitHub Check: test (macos-latest)
🔇 Additional comments (1)
external-providers/java-external-provider/pkg/java_external_provider/provider.go (1)
515-533: Propagating disableMavenSearch in javaServiceClient: LGTMStoring disableMavenSearch on the service client makes the flag available to downstream logic without recomputation. This aligns with the PR objective and the related updates in util.go.
We were setting the
foundOnlineproperty of the struct only when building the artifact from inferring the structure of the JAR, but not when doing so through the pom, therefore setting a lot of libraries as internal, even though later in the execution they were tagged as open source in a different part of the code.Summary by CodeRabbit
Bug Fixes
Chores