Skip to content

Conversation

@tsanders-rh
Copy link
Contributor

@tsanders-rh tsanders-rh commented Nov 8, 2025

Summary

Enhances kantra analysis with visual progress feedback and startup performance optimizations for both containerless and hybrid modes.

Depends on: #592 (Hybrid analysis mode)

Changes

Progress Bar Implementation

  • Containerless mode progress bar: Real-time visual feedback showing percentage, completion bar, and current rule being processed
  • Hybrid mode progress bar: Matches containerless UX with clean status messages and visual progress
  • Binary analysis support: Progress tracking for JAR/WAR/EAR file analysis
  • --no-progress flag: Disable progress bar for scripting/CI environments

Performance Optimizations

  • Parallel rule loading: Load multiple rulesets concurrently instead of sequentially
  • Parallel startup tasks: Run config validation, volume creation, and ruleset extraction concurrently
  • Significantly reduces startup time when using multiple rulesets

User Experience Improvements

  • Clean initialization messages: "Running source/binary analysis...", "✓ Initialized providers", "✓ Loaded N rules"
  • Cursor hiding/showing for cleaner progress rendering
  • Suppressed container command output (volume/network creation)
  • Fixed INFO message bleed that interfered with progress bar
  • "Analysis complete!" message with results section

Bug Fixes

  • Fixed timeout and race condition in binary analysis
  • Improved health check result handling

Example Output

Before:

INFO[0000] Running source code analysis
... verbose log output ...

After (with progress bar):

Running source code analysis...
✓ Initialized providers: java, builtin
✓ Loaded 45 rules

Progress: [████████████████████████████████] 100% (45/45) | Processing: quarkus-springboot
Analysis complete!

Results: 127 incidents across 8 files

Testing

  • ✅ Tested with Java source code analysis
  • ✅ Tested with binary (.war) analysis
  • ✅ Tested with multiple rulesets (parallel loading)
  • ✅ Tested --no-progress flag for scripting
  • ✅ Verified both containerless and hybrid modes

Dependencies

Summary by CodeRabbit

  • New Features

    • Live progress UI for provider init, decompilation, rule loading/execution and report generation; opt-out via --no-progress.
    • Hybrid in-process analysis with improved host/container path handling and parallel startup.
    • Binary input auto-detection with tailored progress/messages.
    • Option to publish container ports and capture command output for richer diagnostics.
  • Bug Fixes

    • Stronger provider health/startup checks, better error propagation and cleanup.
    • Wait-for-target/decompilation waits now support timeouts and clearer status reporting.
  • Docs

    • Updated hybrid health-check/readiness guidance.
  • Chores

    • Dependency bumps and logging improvements.

Document strategy for running analyzer-lsp on host with containerized
providers to eliminate macOS volume mount performance overhead.

Key changes:
- Run analyzer engine on host (direct file I/O)
- Keep providers in containers (isolation)
- Use network communication (localhost:PORT)
- Target: 3.7x performance improvement on macOS

Expected impact:
- Current containerized: 127s
- Target hybrid: ~40s (similar to containerless 34s)

Signed-off-by: tsanders <[email protected]>
Investigated analyzer-lsp provider architecture and confirmed that
network-based provider communication is already supported via the
provider.Config.Address field.

Key findings:
- Providers support both in-process and network (gRPC) modes
- Setting Address="localhost:PORT" enables network mode
- lib.GetProviderClient() automatically creates appropriate client
- No analyzer-lsp modifications needed

Implementation approach:
1. Start providers with --network=host and --port=9001
2. Create configs with Address="localhost:9001" (no BinaryPath)
3. Run analyzer on host (direct file I/O)
4. Analyzer connects to providers via gRPC

Expected performance: ~40s (vs current 127s)

Signed-off-by: tsanders <[email protected]>
Created proof-of-concept to test if providers can run with host network
and be accessed from the host process.

Results:
- Provider containers start successfully with --network=host
- Providers listen on localhost:PORT as expected
- lib.GetProviderClient() hangs when using Address field

Next step: Extract konveyor-analyzer binary from container and test
if it can connect to localhost providers.

Signed-off-by: tsanders <[email protected]>
This introduces a hybrid architecture that provides significant performance
improvements while maintaining the isolation benefits of containerized providers.

Key Changes:
- Add hybrid mode execution path when --run-local=false
- Analyzer binary runs natively on host for direct file I/O
- Providers run in containers with port publishing for isolation
- Auto-extract default rulesets from container on first run
- Support all 6 provider types (Java, Go, Python, NodeJS, Dotnet, Builtin)
- Remove old containerized mode (RunAnalysis, RunProviders functions)

Implementation:
- cmd/analyze.go: Add extractDefaultRulesets(), createHybridProviderSettings(),
  RunProvidersHostNetwork(), and RunAnalysisHybrid() functions
- pkg/container/container.go: Add port publishing support with WithPortPublish()
- cmd/command-context.go: Fix multi-provider bug in setProviders()

Testing:
- Add 8 comprehensive tests for hybrid provider settings generation
- Add 7 tests for multi-provider bug fix validation
- All existing tests pass

Performance:
- 2.84x faster than containerless mode (4.82s vs 13.67s)
- 26.3x faster than old containerized mode (127s vs 4.82s)
- No breaking changes - all existing commands work identically

Signed-off-by: tsanders <[email protected]>
Add comprehensive documentation for all hybrid mode functions with:
- Detailed descriptions of functionality
- Parameter and return value documentation
- Architecture and implementation details
- Execution flow for RunAnalysisHybrid
- Performance metrics and benefits
- Code examples and configuration details

Enhanced functions:
- extractDefaultRulesets: Document caching behavior and extraction process
- createHybridProviderSettings: List all 6 supported providers with configs
- RunProvidersHostNetwork: Explain port publishing for macOS Podman VM
- RunAnalysisHybrid: Complete execution flow with 9 detailed steps

These enhanced docstrings improve code maintainability and make the hybrid
architecture more understandable for contributors.

Signed-off-by: tsanders <[email protected]>
Key changes:
- New file: cmd/analyze-hybrid.go with network-based provider setup
  - setupJavaProviderHybrid(): Creates network client to containerized Java provider
  - setupBuiltinProviderHybrid(): Creates in-process builtin provider
  - RunAnalysisHybridInProcess(): Runs analyzer in-process like containerless

- Modified: cmd/analyze.go
  - Switch from RunAnalysisHybrid() to RunAnalysisHybridInProcess()
  - Uses in-process analyzer execution instead of external binary

Architecture:
- Analyzer runs in-process as Go library (like containerless mode)
- Providers run in containers with port publishing
- Network communication via localhost:PORT
- Clean console output with direct logging control

Benefits:
- Same clean output as containerless mode (no filteringWriter needed)
- Direct control over logging and progress reporting
- Maintains provider isolation in containers
- Reuses containerless code paths for analyzer execution

Discovery:
- java.NewJavaProvider() already supports network mode via config
- No wrapper class needed - just set Address and empty BinaryPath
- lib.GetProviderClient() creates builtin provider (public API)

Documentation:
- Added HYBRID_INPROCESS_IMPLEMENTATION.md with architecture details

Signed-off-by: tsanders <[email protected]>
Add comprehensive documentation explaining the hybrid mode architecture:

- HYBRID_ARCHITECTURE_PLAN.md: Documents the performance problem on macOS
  (3.7x degradation due to VM volume mounts) and the proposed hybrid
  solution (analyzer on host, providers in containers).

- HYBRID_MODE_STRATEGY.md: Explains the three-mode architecture strategy
  and when to use each mode:
  - Containerless: Maximum performance, requires local dependencies
  - Hybrid: Default mode, balances performance and isolation
  - Containerized: Maximum portability, no local dependencies needed

These documents provide important context for understanding why hybrid
mode was implemented and how it fits into kantra's overall architecture.

Signed-off-by: tsanders <[email protected]>
…s implementation

Remove the initial hybrid implementation that used an external analyzer binary
and replace it with cleaner in-process approach.

Removed:
- RunAnalysisHybrid() - external binary approach
- createHybridProviderSettings() - config generator for external binary
- analyze_hybrid_test.go - tests for removed functions
- unused 'time' import

Added:
- analyze_hybrid_inprocess_test.go - basic tests for new implementation
  - Tests for setupJavaProviderHybrid configuration
  - Tests for setupBuiltinProviderHybrid with/without proxy

Rationale:
The external binary approach was our first iteration. The in-process approach
(RunAnalysisHybridInProcess in analyze-hybrid.go) is superior because:
- Runs analyzer as Go library for direct control
- Same clean output as containerless mode
- No need for external binary or output filtering
- Network-based provider clients work seamlessly

This simplifies the codebase to a single, better hybrid implementation.

Signed-off-by: tsanders <[email protected]>
- Rename setupJavaProviderHybrid() to setupNetworkProvider()
- Add generic provider support (Java, Go, Python, NodeJS, Dotnet)
- Add provider-specific LSP configuration for each provider type
- Update RunAnalysisHybridInProcess() to loop through all providers
- Java uses java.NewJavaProvider(), others use lib.GetProviderClient()

This allows hybrid mode to support any combination of providers,
not just Java. All providers run in containers with network
communication to the in-process analyzer.

Signed-off-by: tsanders <[email protected]>
- Add HYBRID_MODE.md: Complete documentation covering architecture,
  usage, troubleshooting, performance, and production readiness
- Add HYBRID_MODE_QUICKSTART.md: Quick reference guide with common
  commands, examples, and troubleshooting tips
- Update README.md: Add analysis modes section explaining hybrid,
  containerless, and container modes with recommendations
- Fix --run-local flag description to clarify hybrid mode default

These docs provide users with:
- Clear explanation of when to use each mode
- Comprehensive troubleshooting guides
- Performance characteristics and benchmarks
- Production readiness checklist
- Quick reference for common tasks

Signed-off-by: tsanders <[email protected]>
This commit implements three critical production readiness features:

1. Provider Health Checks (replaces 4-second sleep)
   - Add waitForProvider() function with TCP port polling
   - Exponential backoff (100ms to 2s) for retry delays
   - 30-second timeout with context cancellation support
   - Detailed troubleshooting error messages on timeout

2. Improved Error Messages
   - Add context-specific error messages for all failure points
   - Include troubleshooting steps in error output
   - Specify exact commands to debug issues (podman ps, logs, lsof)
   - Differentiate between configuration, network, and container errors

3. Configuration Validation (before container start)
   - Validate Maven settings file exists if specified
   - Validate input path exists
   - Check provider ports are available (not already in use)
   - Early failure prevents wasting time on doomed container starts

Benefits:
- Faster startup (no waiting if provider is ready quickly)
- More reliable (detects when provider fails to start)
- Better user experience (clear error messages with solutions)
- Prevents common issues (port conflicts, missing files)

Related documentation: HYBRID_MODE.md Production Readiness section

Signed-off-by: tsanders <[email protected]>
Add automated benchmark script and results documentation:

Benchmark Results:
- Containerless Mode: Average 12,826ms (12.8 seconds)
- Hybrid Mode: Average 248ms (0.25 seconds)
- Performance: Hybrid is 51.71x FASTER on macOS

Key Findings:
- Podman VM I/O boundary is the major bottleneck in containerless
- In-process analyzer with network providers eliminates VM overhead
- Health check optimizations contribute to fast startup
- Scalability: Larger codebases show even greater speedup

Benchmark Script Features:
- Automated testing of both modes
- Multiple iterations for statistical validity
- Nanosecond precision timing
- Clean separation of display and data
- Automatic cleanup of containers and outputs

This validates that hybrid mode is the recommended default for
macOS users, with massive performance improvements over the
previous containerless mode.

Files:
- benchmark-modes.sh: Automated benchmark script
- BENCHMARK_RESULTS.md: Detailed analysis and findings

Signed-off-by: tsanders <[email protected]>
Hybrid mode was failing with nil pointer panic during provider
initialization. The issue has been resolved by properly handling
the Proxy configuration field.

Root cause:
- InitConfig.Proxy is *Proxy (pointer type)
- gRPC provider dereferences config.Proxy.HTTPProxy at line 203
- Passing nil caused panic

Key fixes:
1. Initialize proxyConfig as &provider.Proxy{} (pointer to empty struct)
2. Pass pointer directly to Proxy field in InitConfig
3. Use container-side paths for Java provider configuration
4. Change Location from host path (a.input) to container path
   (util.SourceMountPath)
5. Use lib.GetProviderClient() for all providers in network mode

Also added timing instrumentation to containerless mode for
performance benchmarking.

Signed-off-by: tsanders <[email protected]>
…provider

Problem:
Hybrid mode was missing 37% of violations (21 rules, 38 incidents) compared
to containerless mode. The builtin provider was failing to detect rules like
discover-java-files, discover-maven-xml, hardcoded-ip-address, and XML
analysis rules.

Root Cause:
When the Java provider runs in a container, it returns InitConfigs with
container paths (/opt/input/source). These configs are passed to the builtin
provider which runs on the host. The builtin provider cannot find files at
/opt/input/source because that path only exists inside the container.

Solution:
Transform Location field in additionalBuiltinConfigs from container paths to
host paths before passing them to the builtin provider. This ensures the
builtin provider can find and analyze all files.

Testing:
- Containerless: 64 violated rules
- Hybrid (before fix): 35 violated rules (-29 rules, -37%)
- Hybrid (after fix): 64 violated rules (0 missing rules)

All previously missing rules now detected:
- discover-java-files ✓
- discover-maven-xml ✓
- hardcoded-ip-address ✓
- javaee-pom-to-quarkus-* ✓
- xml-java-versions-* ✓

Signed-off-by: $(git config user.name) <$(git config user.email)>
Signed-off-by: tsanders <[email protected]>
Add documentation explaining that static HTML report generation fails
when using hybrid mode (--run-local=false) with output directories under
/tmp on macOS. This is a Podman Desktop limitation where /tmp paths
cannot be mounted to containers.

Key points documented:
- Analysis completes successfully, only HTML report generation fails
- Limitation only affects hybrid/container modes, not containerless
- Provides 3 workarounds: use non-/tmp paths, skip static report, or use containerless mode
- This is an existing limitation, not introduced by hybrid mode

The documentation is placed in the macOS setup section where users will
see it during initial configuration.

Signed-off-by: tsanders <[email protected]>
Kantra now has only two analysis modes:
1. Containerless (default, --run-local=true): Everything runs on host
2. Hybrid (--run-local=false): Analyzer on host, providers in containers

Changes:
- Removed --override-provider-settings flag and related code
- Deleted RunAnalysisOverrideProviderSettings() function (~100 lines)
- Deleted analyzeDotnetFramework() function (~287 lines)
- Removed overrideProviderSettings field from analyzeCommand struct
- Removed all validation and conditional logic for old mode
- Removed unused freeport import
- Updated README.md to reflect two modes instead of three
- Both modes tested and verified working correctly

The old containerized mode (where both analyzer and providers ran in
containers) has been completely removed in favor of the faster hybrid
mode which runs the analyzer in-process on the host.

Signed-off-by: tsanders <[email protected]>
Fixes two critical issues preventing binary (WAR/JAR/EAR) file analysis
in hybrid mode:

1. Volume mounting failure: util.SourceMountPath included the filename
   (e.g., /opt/input/source/app.war) which broke volume mounting.
   Solution: Temporarily adjust to parent directory for volume mount,
   then restore original path for provider configuration.

2. 20-second timeout error: Code attempted to wait for Maven target
   directory on host filesystem, but decompilation happens inside the
   container. Solution: Skip WalkJavaPathForTarget() for binary files
   in hybrid mode since decompilation and target directory handling
   occurs in the provider container.

Changes:
- Adjust SourceMountPath for volume mounting then restore it
- Skip target directory walk for binary files in two locations:
  - setupBuiltinProviderHybrid()
  - RunAnalysisHybridInProcess()
- Add explanatory comments about container decompilation

Tested with:
  kantra analyze --input app.war --output ./output --target quarkus --run-local=false

Results: Analysis completes successfully in ~54s with no errors.
Signed-off-by: tsanders <[email protected]>
Consolidated 5 separate hybrid mode documentation files into a single
comprehensive HYBRID_MODE.md:

Removed (merged into HYBRID_MODE.md):
- HYBRID_ARCHITECTURE_PLAN.md
- HYBRID_INPROCESS_IMPLEMENTATION.md
- HYBRID_MODE_QUICKSTART.md
- HYBRID_MODE_STRATEGY.md

Removed (historical/obsolete):
- POC_RESULTS.md
- RESEARCH_FINDINGS.md

New HYBRID_MODE.md includes:
- Overview and architecture diagrams
- Quick start guide with examples
- Three-mode strategy (containerless, hybrid, container)
- Troubleshooting section
- Performance benchmarks
- Known limitations (updated to reflect binary support)
- Implementation file references

Updated to document binary analysis support in hybrid mode.

Signed-off-by: tsanders <[email protected]>
Added demo-screencast.sh to demonstrate and compare different analysis
modes with both source and binary inputs.

Features:
- Tests 4 scenarios (2 modes × 2 input types):
  * Containerless mode (source + binary)
  * Hybrid mode (source + binary)
- Auto-builds test WAR file if not present
- Uses local output directory to avoid macOS /tmp limitations
- Validates kantra binary exists before running
- Color-coded output for clarity
- Performance timing for each test
- Graceful error handling with informative messages

Output structure:
  ./demo-output/
  ├── containerless-source/
  ├── containerless-binary/
  ├── hybrid-source/
  └── hybrid-binary/

Prerequisites documented in script header:
  1. Build kantra: go build -o ./bin/kantra
  2. Have Podman/Docker installed
  3. Maven installed (for building test WAR)

Also updated .gitignore to exclude demo-output/ directory.

Signed-off-by: tsanders <[email protected]>
Removed 22 obsolete test-data files:
- test-data/analysis-output.yaml
- test-data/deps-output.yaml
- test-data/asset_generation/discover/* (5 files)
- test-data/asset_generation/helm/* (15 files)

These were old analysis outputs and test fixtures no longer needed.

Signed-off-by: tsanders <[email protected]>
Restored 19 test data files (5 Cloud Foundry + 14 Helm) that were
removed in commit 87bf24e. These files are required by asset_generation
discovery tests which were failing without them.

Signed-off-by: tsanders <[email protected]>
- Remove BENCHMARK_RESULTS.md (contradicted HYBRID_MODE.md performance claims)
- Remove poc-start-provider.sh (development POC artifact)
- Fix hardcoded path in benchmark-modes.sh (use relative path)

Signed-off-by: tsanders <[email protected]>
- Correct README.md to show containerless as default (not hybrid)
- Remove broken link to non-existent HYBRID_MODE_QUICKSTART.md
- Update docs/containerless.md to use 'hybrid mode' terminology
- Clarify --run-local flag behavior in help text

Signed-off-by: tsanders <[email protected]>
These scripts are useful for local development but don't need to be tracked in git

Signed-off-by: tsanders <[email protected]>
- Move HYBRID_MODE.md to docs/hybrid.md for consistency with docs/containerless.md
- Update README.md references to point to new location
- Add hybrid mode to References section

Signed-off-by: tsanders <[email protected]>
RunAnalysisHybridInProcess was calling os.Exit(1) on several error paths,
which short-circuits the deferred cleanup (analyzeCmd.CleanAnalysisResources)
and leaves provider containers and temp directories running.

Replace os.Exit(1) with return fmt.Errorf(...) to propagate errors back to
RunE so the standard teardown executes properly.

Fixes: Lines 359, 370, 451, 475 in cmd/analyze-hybrid.go
Signed-off-by: tsanders <[email protected]>
Corrected multiple places where documentation incorrectly stated
hybrid mode was the default:

- Line 5: Clarified containerless is default, hybrid needs --run-local=false
- Table: Reordered to show Containerless (default) first
- Table: Fixed commands to show correct flags for each mode
- Examples: Changed 'Hybrid mode is default' to 'Default (containerless mode)'
- Examples: Added --run-local=false to all hybrid mode commands

Fixes coderabbitai feedback on PR konveyor#592

Signed-off-by: tsanders <[email protected]>
Changed 'Recommended: Default for macOS' to 'Recommended: Default choice
for macOS' to improve clarity and avoid confusion with the actual default
mode (containerless).

Addresses coderabbitai feedback on PR konveyor#592

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
cmd/analyze-hybrid.go (2)

249-251: Fix Maven settings path for containerized providers.

The container cannot resolve the host filesystem path in a.mavenSettingsFile. Inside the container, the Maven settings file lives at the mounted path. This will cause provider initialization to fail when --maven-settings is used.

Apply this diff to use the container path:

 		providerSpecificConfig["depOpenSourceLabelsFile"] = "/usr/local/etc/maven.default.index"
 		if a.mavenSettingsFile != "" {
-			providerSpecificConfig["mavenSettingsFile"] = a.mavenSettingsFile
+			providerSpecificConfig["mavenSettingsFile"] = path.Join(util.ConfigMountPath, "settings.xml")
 		}

771-813: Check rule loading errors to prevent silent failures.

The result.err field is logged (line 790) but never checked during result collection (lines 807-812). This allows analysis to proceed even if all rulesets fail to load, potentially running with zero rules. Compare this to the startup task error handling at lines 574-577 where errors are properly validated.

Add error checking in the result collection loop:

+	var loadErrors []error
 	// Collect and merge results
 	for result := range resultChan {
+		if result.err != nil {
+			loadErrors = append(loadErrors, fmt.Errorf("%s: %w", result.rulePath, result.err))
+			continue
+		}
 		ruleSets = append(ruleSets, result.ruleSets...)
 		for k, v := range result.providers {
 			needProviders[k] = v
 		}
 	}
+
+	// Fail if no rulesets loaded successfully
+	if len(ruleSets) == 0 {
+		if len(loadErrors) > 0 {
+			return fmt.Errorf("failed to load any rulesets: %v", loadErrors)
+		}
+		return fmt.Errorf("no rulesets loaded - check rule paths and formats")
+	}
+	if len(loadErrors) > 0 {
+		operationalLog.Info("some rulesets failed to load but continuing with successful ones", "failures", len(loadErrors))
+	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fcf0f2d and 1d3e9e8.

📒 Files selected for processing (2)
  • cmd/analyze-hybrid.go (22 hunks)
  • docs/hybrid.md (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:06:52.056Z
Learnt from: abrugaro
Repo: konveyor/kantra PR: 524
File: cmd/dump-rules.go:0-0
Timestamp: 2025-08-08T15:06:52.056Z
Learning: In konveyor/kantra cmd/dump-rules.go (Go/Cobra), the dump-rules command should not use package-level variables for flags (output, overwrite). Flags must be defined as local variables inside NewDumpRulesCommand to avoid shared mutable state across commands/tests.

Applied to files:

  • cmd/analyze-hybrid.go
🧬 Code graph analysis (1)
cmd/analyze-hybrid.go (3)
pkg/provider/java.go (1)
  • WalkJavaPathForTarget (67-108)
pkg/util/util.go (5)
  • JavaArchive (82-82)
  • WebArchive (83-83)
  • EnterpriseArchive (84-84)
  • ClassFile (85-85)
  • SourceMountPath (31-31)
cmd/analyze-bin.go (1)
  • ConsoleHook (43-46)
🪛 LanguageTool
docs/hybrid.md

[style] ~306-~306: The double modal “required Recommended” is nonstandard (only accepted in certain dialects). Consider “to be Recommended”.
Context: ... No local LSP installation required - Recommended for: macOS and production use --- *...

(NEEDS_FIXED)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & test from commit
🔇 Additional comments (9)
docs/hybrid.md (1)

219-219: LGTM! Documentation accurately reflects implementation.

The documentation updates correctly describe the new TCP port polling with exponential backoff, mark completed TODOs, and fix the grammar issue. These changes align well with the implementation in cmd/analyze-hybrid.go.

Also applies to: 247-248, 306-306

cmd/analyze-hybrid.go (8)

35-88: LGTM! Excellent defensive validation.

Early validation of Maven settings and input paths with helpful error messages prevents confusing failures later. This improves the user experience significantly.


175-222: LGTM! Robust health check implementation.

The TCP polling with exponential backoff (50ms to 500ms) and context cancellation handling is well-designed. The timeout error includes helpful troubleshooting steps.


347-406: LGTM! Correct handling of binary vs source analysis.

The conditional logic properly skips target directory walking for binary inputs since decompilation occurs in the container. The dynamic Java exclude paths are correctly applied to the builtin provider.


620-643: LGTM! Efficient parallel health checks.

Running health checks concurrently for all providers reduces startup time. The goroutine variable capture is correct, and error handling properly fails fast if any provider fails to become ready.


670-685: LGTM! Proper cleanup on provider setup failure.

This correctly addresses the previous review comment by cleaning up all started providers and containers when setup fails. This prevents resource leaks and port conflicts on subsequent runs.


691-716: LGTM! Container path translation properly handles nested paths.

This correctly addresses the previous review comment by using strings.HasPrefix to handle nested container paths (e.g., /opt/input/source/java-project-1234), not just exact matches. The file input case is also properly handled using the parent directory.


837-920: LGTM! Comprehensive progress reporting implementation.

The progress bar implementation is well-designed with:

  • Conditional logger to avoid interference (lines 425-430)
  • Channel-based event consumption (lines 850-905)
  • Cumulative progress tracking across rulesets (lines 854-898)
  • Clean stage transitions and completion messages

The integration with RunRulesWithOptions is clean and the goroutine cleanup with progressCancel() and <-progressDone is correct.


941-986: LGTM! Clean output generation and results summary.

The output writing and static report generation are well-structured. The conditional results summary (lines 976-982) only in progress mode maintains clean console output. The timing instrumentation throughout the function provides good observability.

Replace all os.Exit(1) calls in RunAnalysisContainerless with proper
error returns. This ensures the deferred cursor restoration (\033[?25h)
always executes, preventing the user's terminal from being left with
a hidden cursor on errors.

Changes:
- Label selector errors: return error instead of os.Exit(1)
- Dependency label selector errors: return error instead of os.Exit(1)
- setBinMapContainerless errors: return error instead of os.Exit(1)
- Java provider setup errors: return error instead of os.Exit(1)
- Builtin provider setup errors: return error instead of os.Exit(1)
- output.yaml write errors: return error instead of os.Exit(1)
- setInternalProviders: changed signature to return error
- startProvidersContainerless: return error instead of os.Exit(1)

All error paths now properly propagate errors up the call stack, allowing
defer statements to execute and restore terminal state.

Addresses CodeRabbit review feedback about cursor restoration on fatal
exit paths.

Signed-off-by: tsanders <[email protected]>
Add error validation during rule collection to prevent analysis from
proceeding when all rulesets fail to load. Previously, rule loading
errors were logged but never checked in the collection loop, allowing
analysis to continue with zero rules.

Changes:
- Track rule loading errors in ruleLoadErrors slice
- Skip failed rulesets in the collection loop
- Fail fast if no rulesets loaded successfully
- Log warnings for partial failures (some rulesets loaded)
- Provide detailed error context including which rulesets failed

This ensures analysis fails early with clear error messages rather than
proceeding silently with no rules, which would produce empty results and
confuse users.

Matches error handling pattern used elsewhere in the file (lines 575-576,
640-641) where errors are validated and cause early returns.

Addresses CodeRabbit critical review feedback.

Signed-off-by: tsanders <[email protected]>
Add volume cleanup when startup tasks fail in hybrid mode. If volume
creation succeeds but a later startup task (e.g., config validation or
ruleset extraction) fails, the volume is now properly cleaned up before
returning the error.

Without this fix:
- Volume created successfully
- Later task fails (e.g., invalid Maven settings)
- Function returns immediately
- Volume left orphaned in container system
- Multiple failed runs accumulate volumes

With this fix:
- Check if volName is set (volume was created)
- Run 'podman volume rm' to clean up
- Log cleanup errors but don't mask original failure
- Return original error to user

This prevents resource accumulation over multiple failed analysis runs
and matches the cleanup pattern used for provider containers.

Addresses CodeRabbit review feedback about partial resource cleanup.

Signed-off-by: tsanders <[email protected]>
Fix test compilation errors by adding the missing timeout parameter
to all WaitForTargetDir() calls. The function signature was updated
to include a time.Duration timeout parameter, but the tests were not
updated accordingly.

Changes:
- TestWaitForTargetDir_ExistingTarget: Add 5s timeout
- TestWaitForTargetDir_Timeout: Add 100ms timeout (tests timeout behavior)
- TestWaitForTargetDir_CreateTargetDuringWait: Add 5s timeout
- TestWaitForTargetDir_WithNonexistentPath: Add 100ms timeout
- TestWaitForTargetDir_InvalidPath: Add 100ms timeout

All tests now pass successfully.

Fixes CI test failures in pkg/provider.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/provider/java_test.go (1)

390-396: Consider asserting that an error is returned.

The test only logs the error without asserting its presence. Based on the WaitForTargetDir implementation, calling watcher.Add() on a nonexistent path should return an error. Consider adding an assertion to make the test more robust.

Apply this diff to improve the test:

 func TestWaitForTargetDir_WithNonexistentPath(t *testing.T) {
 	logger := getTestLogger()
 	err := WaitForTargetDir(logger, "/nonexistent/path", 100*time.Millisecond)
-	// The initial Stat will fail, but it depends on the implementation
-	// This test is mostly to ensure we don't panic
-	t.Logf("Error for nonexistent path: %v", err)
+	// Should return an error when trying to watch a nonexistent path
+	require.Error(t, err)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ecabf5 and b292dcc.

📒 Files selected for processing (1)
  • pkg/provider/java_test.go (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/provider/java_test.go (1)
pkg/provider/java.go (1)
  • WaitForTargetDir (129-174)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & test from commit
🔇 Additional comments (1)
pkg/provider/java_test.go (1)

348-360: LGTM! Timeout parameters are appropriate.

The timeout values are well-chosen for each test scenario:

  • 5 seconds for tests expecting successful completion (existing target, delayed creation)
  • 100 milliseconds for tests expecting failures or timeouts

The test logic correctly validates the updated WaitForTargetDir API with explicit timeout handling.

Also applies to: 362-371, 373-388, 475-486

@codecov
Copy link

codecov bot commented Nov 13, 2025

Codecov Report

❌ Patch coverage is 7.15884% with 415 lines in your changes missing coverage. Please review.
✅ Project coverage is 31.07%. Comparing base (e07ec59) to head (318060f).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
cmd/analyze-hybrid.go 5.43% 174 Missing ⚠️
cmd/analyze-bin.go 0.00% 109 Missing ⚠️
cmd/analyze.go 1.72% 57 Missing ⚠️
pkg/provider/java.go 41.17% 28 Missing and 2 partials ⚠️
cmd/progress_mode.go 0.00% 24 Missing ⚠️
cmd/command-context.go 0.00% 19 Missing ⚠️
cmd/static-report.go 0.00% 1 Missing ⚠️
pkg/container/container.go 0.00% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (7.15%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #593      +/-   ##
==========================================
- Coverage   33.21%   31.07%   -2.14%     
==========================================
  Files          33       34       +1     
  Lines        3947     4257     +310     
==========================================
+ Hits         1311     1323      +12     
- Misses       2490     2787     +297     
- Partials      146      147       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tsanders-rh tsanders-rh force-pushed the feature/hybrid-with-progress-bar branch from 8346214 to afdcc08 Compare November 14, 2025 00:29
Maven settings file and input path validation were duplicated in
validateProviderConfig(). These are already validated in the main
Validate() function called via PreRunE, so they don't need to be
re-validated in the hybrid-mode-specific function.

This keeps validateProviderConfig focused on hybrid-mode-specific
concerns: override provider settings and port availability checks.

Signed-off-by: tsanders <[email protected]>
Changed the results summary from "Logs:" to "Analysis logs:" to be
more specific. This is important because providers can run externally
(e.g., in containers in hybrid mode), which would have their own
separate logs. Being explicit about "Analysis logs" helps users
understand these are the analyzer's logs specifically.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cmd/analyze-hybrid.go (1)

227-228: Consider using build args for LSP paths like Maven settings does.

The Maven settings file correctly uses util.ConfigMountPath from build args (Line 232), but the LSP server path and bundle path are hardcoded. These paths may differ in downstream builds.

Consider defining these paths in cmd/settings.go alongside JavaProviderImage:

// In cmd/settings.go Config struct:
JavaLspServerPath string `env:"JAVA_LSP_SERVER_PATH" default:"/jdtls/bin/jdtls"`
JavaBundlePath    string `env:"JAVA_BUNDLE_PATH" default:"/jdtls/java-analyzer-bundle/java-analyzer-bundle.core/target/java-analyzer-bundle.core-1.0.0-SNAPSHOT.jar"`

Then reference them here:

-providerSpecificConfig["lspServerPath"] = "/jdtls/bin/jdtls"
-providerSpecificConfig["bundles"] = "/jdtls/java-analyzer-bundle/java-analyzer-bundle.core/target/java-analyzer-bundle.core-1.0.0-SNAPSHOT.jar"
+providerSpecificConfig["lspServerPath"] = Settings.JavaLspServerPath
+providerSpecificConfig["bundles"] = Settings.JavaBundlePath

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b292dcc and 5eb4d21.

📒 Files selected for processing (1)
  • cmd/analyze-hybrid.go (21 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:06:52.056Z
Learnt from: abrugaro
Repo: konveyor/kantra PR: 524
File: cmd/dump-rules.go:0-0
Timestamp: 2025-08-08T15:06:52.056Z
Learning: In konveyor/kantra cmd/dump-rules.go (Go/Cobra), the dump-rules command should not use package-level variables for flags (output, overwrite). Flags must be defined as local variables inside NewDumpRulesCommand to avoid shared mutable state across commands/tests.

Applied to files:

  • cmd/analyze-hybrid.go
🧬 Code graph analysis (1)
cmd/analyze-hybrid.go (4)
pkg/util/util.go (6)
  • ConfigMountPath (33-33)
  • JavaArchive (82-82)
  • WebArchive (83-83)
  • EnterpriseArchive (84-84)
  • ClassFile (85-85)
  • SourceMountPath (31-31)
pkg/provider/java.go (1)
  • WalkJavaPathForTarget (67-108)
cmd/settings.go (2)
  • Config (26-35)
  • Settings (24-24)
cmd/analyze-bin.go (1)
  • ConsoleHook (43-46)
🔇 Additional comments (7)
cmd/analyze-hybrid.go (7)

156-203: LGTM! Health check implementation is robust.

The exponential backoff strategy (50ms to 500ms) with proper context cancellation handling and detailed error messages is well-designed for local container health checks.


332-344: LGTM! Binary input handling is correct.

The logic correctly skips target directory walking for binary files since decompilation happens in the container-based Java provider. The builtin provider only needs to exclude target directories for source code analysis.


508-572: LGTM! Parallel startup tasks with proper cleanup.

The parallel execution of config validation, volume creation, and ruleset extraction is well-implemented. The cleanup logic (lines 559-564) correctly removes the volume on any startup failure. Since providers haven't started yet at this point, volume cleanup is the only resource that needs cleanup.


606-636: LGTM! Parallel health checks improve startup performance.

The parallel health check implementation correctly captures loop variables and uses a fail-fast approach. The 30-second timeout per provider is appropriate for containerized providers on localhost.


659-674: LGTM! Provider cleanup properly prevents resource leaks.

The cleanup logic correctly stops all previously-started providers (line 667) and removes their containers (line 670) when setupNetworkProvider fails. This addresses the past review comment about preventing leftover containers from blocking subsequent runs.


680-705: LGTM! Container-to-host path translation is correct.

The path translation logic correctly handles both directory and file input cases by using prefix matching instead of exact matching. This ensures nested container paths (e.g., /opt/input/source/java-project-1234) are properly translated to host paths so the builtin provider can find the files. This addresses the critical past review comment.


756-824: LGTM! Rule loading error handling prevents silent failures.

The parallel rule loading implementation correctly validates that at least one ruleset loaded successfully (lines 808-814) and logs warnings for partial failures (lines 816-821). This addresses the critical past review comment about preventing analysis from proceeding with zero rules.

The error collection and validation ensures that if all rulesets fail to load, the analysis fails with a clear error message.

Replace hardcoded JDTLS paths with JDTLSBinLocation and
JavaBundlesLocation constants. These file names can change across
different builds, so using build-time constants ensures the paths
remain correct.

This aligns with how other parts of the codebase (e.g., settings.go)
handle these paths.

Signed-off-by: tsanders <[email protected]>
The progress reporter setup logic was duplicated in both analyze-bin.go
and analyze-hybrid.go (~77 lines each). This extracts it into a shared
setupProgressReporter() function in analyze.go.

Benefits:
- Eliminates 148 lines of duplicated code
- Single source of truth for progress reporter logic
- Easier to maintain and update progress bar behavior
- Both containerless and hybrid modes use identical logic

The function returns the reporter, done channel, and cancel function,
making it easy to use in both locations.

Signed-off-by: tsanders <[email protected]>
The parallel startup task logic in RunAnalysisHybridInProcess was
extracted into its own function for better code organization.

Benefits:
- Clearer separation of concerns
- Easier to test startup logic independently
- More maintainable and readable main function
- Self-documenting function name describes what it does

The function runs three independent tasks concurrently:
1. Validate provider configuration
2. Create container volume
3. Extract default rulesets (if enabled)

Returns volume name and rulesets directory on success, with automatic
cleanup on failure.

Signed-off-by: tsanders <[email protected]>
@tsanders-rh
Copy link
Contributor Author

@eemcmullan - Thanks for review. All updated per comments. Waiting on CI, but ready for you to recheck. Thanks!

Copy link
Contributor

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider adding a some type that will encapsulate all of the noProgress checks?

var ruleWg sync.WaitGroup
resultChan := make(chan ruleLoadResult, len(a.rules))

// Load each ruleset in parallel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is something that we should consider adding the analyzer-lsp library, this could be helpful for all the integrators (addon/hub, kai, kantra). I don't think we should stop it from going here, but just wanted to call that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsanders-rh
Copy link
Contributor Author

tsanders-rh commented Nov 16, 2025

Thanks for the feedback, @shawn-hurley! I've addressed both of your comments:

1. Encapsulating noProgress checks

I've created a ProgressMode type in cmd/progress_mode.go that encapsulates all the noProgress logic. This provides better abstraction and makes the code more maintainable.

Key methods:

  • OperationalLogger(log) - Returns the logger or logr.Discard() based on progress mode
  • ShouldAddConsoleHook() - Determines if console hooks should be added
  • HideCursor() / ShowCursor() - Terminal cursor control
  • Printf() / Println() - Conditional output to stderr
  • IsEnabled() / IsDisabled() - Query progress state

Before:

var operationalLog logr.Logger
if a.noProgress {
    operationalLog = a.log
} else {
    operationalLog = logr.Discard()
}

if !a.noProgress {
    fmt.Fprintf(os.Stderr, "\033[?25l")
    defer fmt.Fprintf(os.Stderr, "\033[?25h")
}

After:

progressMode := NewProgressMode(a.noProgress)
operationalLog := progressMode.OperationalLogger(a.log)

progressMode.HideCursor()
defer progressMode.ShowCursor()

This refactoring affected:

  • cmd/analyze-bin.go: 11 instances replaced
  • cmd/analyze-hybrid.go: 11 instances replaced

2. Logging discarded output with verbosity control

I've added helper methods in cmd/command-context.go to conditionally capture and log container command output based on verbosity level:

  • setupCommandOutput(cmd) - Configures stdout/stderr (discard or capture)
  • logCommandOutput(cmdName, stdout, stderr) - Logs captured output

Implementation:

func (c *AnalyzeCommandContext) setupCommandOutput(cmd *exec.Cmd) (stdout, stderr *bytes.Buffer) {
    if c.log.V(2).Enabled() {
        // Capture output for verbose logging
        stdout = &bytes.Buffer{}
        stderr = &bytes.Buffer{}
        cmd.Stdout = stdout
        cmd.Stderr = stderr
        return stdout, stderr
    }
    // Discard output to keep console clean
    cmd.Stdout = io.Discard
    cmd.Stderr = io.Discard
    return nil, nil
}

Usage:

cmd := exec.Command(Settings.ContainerBinary, args...)
stdout, stderr := c.setupCommandOutput(cmd)
err := cmd.Run()
c.logCommandOutput("network create", stdout, stderr)

Behavior:

  • Normal mode: Output discarded (clean console)
  • Verbose mode (-v 2): Output captured and logged for debugging

Applied to:

  • createContainerNetwork()
  • createContainerVolume()

All tests pass and the changes maintain backward compatibility.

Created a new ProgressMode type to consolidate all noProgress-related
checks and improve code maintainability. This addresses PR feedback to
add better abstraction for progress reporting behavior.

Changes:
- Add ProgressMode type with methods for cursor control, conditional
  logging, and output management
- Replace 22 instances of noProgress checks across analyze-bin.go and
  analyze-hybrid.go with ProgressMode methods
- Add verbosity-based command output logging in command-context.go
  to conditionally capture container command output at V(2) level

The refactoring maintains backward compatibility and all tests pass.

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/analyze-bin.go (1)

893-944: Replace log.Fatalln in static report path to avoid leaving cursor hidden and hard exits

buildStaticReportFile still uses log.Fatalln on error (Lines 930, 935, 940). Because RunAnalysisContainerless now hides the cursor and relies on a deferred ShowCursor(), any log.Fatalln in this call stack will:

  • Call os.Exit(1) and bypass the defer, leaving the user’s terminal cursor hidden.
  • Bypass the normal error return path and be inconsistent with the rest of the function, which now returns errors instead of exiting.

This is a user‑visible regression and should be fixed.

You can preserve behavior by converting these to normal error returns and logging through a.log (or operationalLog at the caller) instead of using the global log package. For example:

 func (a *analyzeCommand) buildStaticReportFile(ctx context.Context, staticReportPath string, depsErr bool) error {
@@
-	apps, err := validateFlags(outputAnalyses, applicationNames, outputDeps, a.log)
-	if err != nil {
-		log.Fatalln("failed to validate flags", err)
-	}
+	apps, err := validateFlags(outputAnalyses, applicationNames, outputDeps, a.log)
+	if err != nil {
+		a.log.Error(err, "failed to validate flags for static report")
+		return fmt.Errorf("failed to validate flags for static report: %w", err)
+	}
@@
-	err = loadApplications(apps)
-	if err != nil {
-		log.Fatalln("failed to load report data from analysis output", err)
-	}
+	if err := loadApplications(apps); err != nil {
+		a.log.Error(err, "failed to load report data from analysis output")
+		return fmt.Errorf("failed to load report data from analysis output: %w", err)
+	}
@@
-	err = generateJSBundle(apps, outputJSPath, a.log)
-	if err != nil {
-		log.Fatalln("failed to generate output.js file from template", err)
-	}
-
-	return nil
+	if err := generateJSBundle(apps, outputJSPath, a.log); err != nil {
+		a.log.Error(err, "failed to generate output.js file from template")
+		return fmt.Errorf("failed to generate output.js file from template: %w", err)
+	}
+
+	return nil
 }

With these changes, GenerateStaticReportContainerless can propagate errors normally, ensuring the deferred ShowCursor() always runs and the CLI remains consistent with the rest of the codebase’s error‑handling approach.

(Optionally, you may also want to either write something meaningful to static-report.log in GenerateStaticReportContainerless or drop the unused log file entirely.)

Also applies to: 958-997

♻️ Duplicate comments (1)
cmd/analyze.go (1)

209-227: Avoid rerunning JSON/static-report generation after RunAnalysisHybridInProcess

In the hybrid branch, RunAnalysisHybridInProcess already:

  • writes output.yaml
  • calls CreateJSONOutput() when --json-output is set
  • calls GenerateStaticReport(ctx, operationalLog)

The subsequent block in RunE:

err = analyzeCmd.CreateJSONOutput()
...
err = analyzeCmd.GenerateStaticReport(ctx, operationalLog)

reruns both JSON and static-report generation, doubling the work with no user benefit and extending total runtime, especially for larger reports. This is the same concern raised earlier and still present.

You can safely rely on RunAnalysisHybridInProcess to handle these steps and delete the redundant block:

-			err = analyzeCmd.CreateJSONOutput()
-			if err != nil {
-				log.Error(err, "failed to create json output file")
-				return err
-			}
-
-			// Create conditional logger for static report generation
-			var operationalLog logr.Logger
-			if analyzeCmd.noProgress {
-				operationalLog = log
-			} else {
-				operationalLog = logr.Discard()
-			}
-			err = analyzeCmd.GenerateStaticReport(ctx, operationalLog)
-			if err != nil {
-				log.Error(err, "failed to generate static report")
-				return err
-			}
-
-			return nil
+			return nil

This preserves behavior for both progress and --no-progress modes while ensuring the generator runs only once per analysis.

Also applies to: 282-293

🧹 Nitpick comments (8)
cmd/progress_mode.go (1)

10-75: ProgressMode encapsulation looks solid; consider minor ergonomics improvements

The enabled/disabled gating, logger selection, and cursor/print behavior all align with the progress UX you’re building. No correctness issues spotted.

If you ever want easier testing or to avoid escape sequences when stderr isn’t a TTY, consider letting ProgressMode accept an io.Writer (or injecting a small isTTY check) instead of hard‑coding os.Stderr. That’s purely a nice‑to‑have.

cmd/analyze-bin.go (2)

63-81: Progress bar rendering logic is correct; message truncation is slightly byte-oriented

The bar computation, clamping, and in‑place update logic (carriage return + clear to EOL) all look good and should behave well in typical terminals.

One minor nit: len(message) truncation is byte‑based, so very long non‑ASCII rule names could get chopped mid‑rune. If you expect non‑ASCII rule IDs, using []rune(message) for truncation would avoid that. Otherwise this is fine.


676-807: Provider setup helpers are consistent with the new operational logger pattern

The refactors to:

  • setBuiltinProvider / setJavaProvider
  • setupJavaProvider
  • setupBuiltinProvider
  • setInternalProviders
  • startProvidersContainerless

all follow a consistent pattern:

  • Operational messages go through operationalLog, while errors are still logged on a.log.
  • ContextLines and CLI‑selected AnalysisMode are applied before client creation.
  • Errors are returned with context (wrapped via fmt.Errorf) instead of causing hard exits.
  • setInternalProviders now surfaces provider setup failures rather than silently swallowing them.

This improves observability and makes failures easier to handle upstream without introducing new correctness issues.

Also applies to: 809-841

cmd/analyze.go (1)

84-87: --no-progress flag wiring is correct and follows prior flag-ownership guidance

The new noProgress field on analyzeCommand and the corresponding --no-progress flag are correctly scoped to the command struct (not package-level globals), which avoids shared mutable state across commands/tests and aligns with earlier guidance for this repo. The flag description is clear and matches its usage in containerless and hybrid paths.

Based on learnings

Also applies to: 327-327

cmd/analyze-hybrid.go (4)

36-68: Hybrid-specific validation and provider overrides look reasonable; minor UX/override caveats

  • validateProviderConfig correctly validates --override-provider-settings and performs a best‑effort port availability check before starting providers.
  • The override loading/apply path (loadOverrideProviderSettings, applyProviderOverrides, mergeProviderSpecificConfig) cleanly overlays JSON overrides on top of generated configs for matching providers.

Two minor points:

  1. The troubleshooting text in the port‑in‑use error is hardcoded to podman commands; since Settings.ContainerBinary is configurable, users running with docker or another tool will see misleading instructions. Consider either making the message generic (“your container tool”) or building the commands from Settings.ContainerBinary.

  2. applyProviderOverrides only merges the first InitConfig entry. That matches today’s single‑config usage, but if you ever support multiple InitConfig entries per provider, you’ll probably want to extend the merge logic to handle them all.

Otherwise, the behavior is clear and scoped to hybrid mode as intended.

Also applies to: 70-151


326-388: Parallel startup tasks and volume cleanup behavior look correct

runParallelStartupTasks concurrently:

  • Validates provider config,
  • Creates the container volume, and
  • Extracts default rulesets (if enabled),

then aggregates results via a buffered channel.

On any failure it:

  • Removes the created volume (if any) via container volume rm, and
  • Returns a wrapped error indicating which task failed.

The concurrency is straightforward (WaitGroup + buffered channel) and avoids races on volName/rulesetsDir. If you later add more startup tasks or a dedicated volume cleanup helper, this function should still compose nicely.


390-451: Hybrid builtin provider setup is correct; excludedDirs always set (even when empty)

setupBuiltinProviderHybrid:

  • Skips WalkJavaPathForTarget for binary inputs (where decompilation happens in the container) and logs accordingly.
  • For source inputs, computes javaTargetPaths and passes them as excludedDirs so builtin doesn’t double-count Java target output.
  • Uses lib.GetProviderClient and initializes the builtin provider with optional additional configs.

One behavioral change vs containerless is that excludedDirs is always present in ProviderSpecificConfig, even when javaTargetPaths is empty. Assuming the builtin provider treats a nil/empty excludedDirs slice as “no exclusions,” this should be fine; if it doesn’t, you may want to only set the key when len(javaTargetPaths) > 0 to exactly mirror the containerless behavior.


466-918: Hybrid in-process analysis pipeline and progress integration look good end-to-end

RunAnalysisHybridInProcess now:

  • Uses ProgressMode to gate logs, hide/show the cursor, and print progress messages and a final results summary.
  • Detects binary inputs to adjust user-facing messages and correctly tweaks util.SourceMountPath only during volume creation, restoring it before client config.
  • Runs providers in containers via RunProvidersHostNetwork, then performs parallel health checks using waitForProvider with proper context handling.
  • Sets up network-based provider clients for all configured providers, transforms container paths in additional builtin configs back to host paths (with special handling for binary inputs), and then initializes the host-side builtin provider.
  • Parallelizes rule loading across rulesets, aggregates results, and fails fast if no rules successfully load while tolerating partial failures with clear logging.
  • Runs dependency analysis for full-mode Java runs, wires in the shared progress reporter from setupProgressReporter, and cleanly cancels/waits for the progress goroutine.
  • Sorts and writes results, conditionally emits JSON, and triggers containerized static-report generation with timing logs and a concise results summary.

The lifecycle of providers and resources (including provider.Stop, spans, and cleanup delegated back to CleanAnalysisResources) is coherent, and progress behavior mirrors the containerless path. I don’t see correctness issues in this flow; the main remaining duplication concern is the extra static-report/JSON call in RunE, which is addressed separately in cmd/analyze.go.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7969ab2 and 318060f.

📒 Files selected for processing (5)
  • cmd/analyze-bin.go (20 hunks)
  • cmd/analyze-hybrid.go (20 hunks)
  • cmd/analyze.go (10 hunks)
  • cmd/command-context.go (4 hunks)
  • cmd/progress_mode.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/command-context.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:06:52.056Z
Learnt from: abrugaro
Repo: konveyor/kantra PR: 524
File: cmd/dump-rules.go:0-0
Timestamp: 2025-08-08T15:06:52.056Z
Learning: In konveyor/kantra cmd/dump-rules.go (Go/Cobra), the dump-rules command should not use package-level variables for flags (output, overwrite). Flags must be defined as local variables inside NewDumpRulesCommand to avoid shared mutable state across commands/tests.

Applied to files:

  • cmd/analyze-hybrid.go
🧬 Code graph analysis (2)
cmd/analyze-bin.go (3)
cmd/progress_mode.go (1)
  • NewProgressMode (18-20)
pkg/util/util.go (5)
  • JavaArchive (82-82)
  • WebArchive (83-83)
  • EnterpriseArchive (84-84)
  • ClassFile (85-85)
  • JavaProvider (72-72)
pkg/provider/java.go (2)
  • JavaProvider (18-20)
  • WalkJavaPathForTarget (67-108)
cmd/analyze-hybrid.go (5)
pkg/util/util.go (6)
  • ConfigMountPath (33-33)
  • JavaArchive (82-82)
  • WebArchive (83-83)
  • EnterpriseArchive (84-84)
  • ClassFile (85-85)
  • SourceMountPath (31-31)
cmd/settings.go (2)
  • Settings (24-24)
  • Config (26-35)
pkg/provider/java.go (1)
  • WalkJavaPathForTarget (67-108)
cmd/progress_mode.go (1)
  • NewProgressMode (18-20)
cmd/analyze-bin.go (1)
  • ConsoleHook (42-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & test from commit
🔇 Additional comments (5)
cmd/analyze-bin.go (1)

83-379: Containerless progress integration and logging flow look consistent

The containerless pipeline now:

  • Wraps progress behavior via ProgressMode (cursor hide/show, conditional prints).
  • Uses OperationalLogger for timing and operational messages so they don’t interfere with the progress bar.
  • Detects binary vs source analysis for user‑facing messages.
  • Avoids the console hook when progress is enabled.
  • Wires the analyzer’s progress reporter via setupProgressReporter and correctly cancels and waits for the goroutine only when progress is enabled.

The nil handling for progressCancel (via ProgressMode.IsEnabled() checks) means you won’t dereference a nil cancel func when --no-progress is set, and the wait on progressDone only occurs when the channel is non‑nil.

Overall, the control flow and error propagation here look good.

cmd/analyze.go (2)

1118-1221: Static report containerization via GenerateStaticReport looks correct

GenerateStaticReport now:

  • Uses operationalLog for status messages so you can suppress them when progress is enabled.
  • Handles missing dependencies.yaml gracefully and logs the “source-only” case.
  • Builds bulk and single-app argument lists correctly, re-mapping host paths into container mount paths for bulk YAML outputs.
  • Runs js-bundle-generator in the runner image and copies the static-report assets into the output directory.
  • Logs the final file:// URL through the operational logger.

The volume mappings and argument construction look consistent with the rest of the containerized flows, and error paths sensibly return errors instead of exiting the process.


1494-1576: Progress reporter context cancellation is properly implemented—approval stands

Verification confirms the implementation is sound:

When NewChannelReporter receives context cancellation, it spawns a goroutine that monitors ctx.Done() and calls Close() to close the events channel. The Close() method safely closes the channel after setting the closed flag, ensuring the for-range loop in setupProgressReporter will terminate properly without deadlock or panic.

The shutdown semantics are correct as-is. The optional nil-check suggestion for progressCancel() call sites remains defensible but unnecessary given the existing alignment between noProgress and actual nil/non-nil values.

cmd/analyze-hybrid.go (2)

153-202: Provider health check with exponential backoff is sound

waitForProvider:

  • Polls localhost:port with a bounded exponential backoff (50ms–500ms).
  • Honors context cancellation on both the main polling loop and the sleep.
  • Returns a detailed troubleshooting error if the provider never becomes ready.

This is a solid improvement over fixed sleeps and should behave well even under transient startup delays.


204-325: Manual verification needed: Proxy field handling in external provider package

The external provider package (github.com/konveyor/analyzer-lsp/provider/lib) behavior cannot be verified from this repository. However, investigation reveals a potential concern:

The local codebase shows inconsistent Proxy assignment patterns:

  • cmd/analyze-bin.go sets Config.Proxy (top-level) at lines 608, 720, 756
  • cmd/analyze-hybrid.go setupNetworkProvider sets only InitConfig[0].Proxy (line 278), not Config.Proxy

The setupNetworkProvider function has a comment stating "Keep as pointer - InitConfig.Proxy is *Proxy!" suggesting awareness of the per-InitConfig field, but this differs from the top-level pattern used elsewhere.

To confirm this code is correct, verify with the analyzer-lsp provider package maintainers that:

  1. Both Config.Proxy and InitConfig.Proxy fields are honored when creating network clients
  2. If both are set, which takes precedence and is that intentional
  3. Whether setupNetworkProvider should also set the top-level Config.Proxy field

Copy link
Contributor

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eemcmullan Will leave for you to merge when you are good to go!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants