add benchmark for large scale cycles #457
Conversation
Walkthrough

Replaces the DFS/coloring cycle detector with a new SCC-based pipeline, adds SCC utilities and adjacency/reachability helpers, threads context through the cycle APIs and tests, and exposes component-based cycle computation and fixing with cancellation support used by the syncer.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Syncer as Syncer / Tests
    participant EG as EntitlementGraph
    participant SCC as scc.CondenseFWBWGroupsFromAdj
    participant Fixer as FixCyclesFromComponents
    Syncer->>EG: ComputeCyclicComponents(ctx)
    EG->>EG: toAdjacency(nodesSubset)
    EG->>SCC: CondenseFWBWGroupsFromAdj(ctx, adj, opts)
    SCC-->>EG: return components ([][]int)
    alt no components
        EG-->>Syncer: no cycles
    else components found
        Syncer->>EG: FixCyclesFromComponents(ctx, components)
        EG->>Fixer: iterate components (check ctx.Done())
        loop per component
            Fixer->>EG: fixCycle(component)
            alt ctx canceled
                Fixer-->>Syncer: return ctx.Err()
            end
        end
        Fixer-->>Syncer: return nil / error
    end
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/sync/expand/cycle.go (1)
140-148: Bug: deleting the wrong adjacency map entry in removeNode

The final delete targets SourcesToDestinations again, leaving DestinationsToSources[nodeID] behind. This can leak state and corrupt future traversals.

```diff
-delete(g.SourcesToDestinations, nodeID)
+delete(g.DestinationsToSources, nodeID)
```
🧹 Nitpick comments (1)
pkg/sync/expand/graph_test.go (1)
131-144: Strengthen TestRemoveNode to catch adjacency-map leaks

Add assertions to ensure both adjacency indexes are cleared after removal.

```diff
 func TestRemoveNode(t *testing.T) {
 	ctx, cancel := context.WithCancel(context.Background())
 	defer cancel()
 	graph := parseExpression(t, ctx, "1")
 	node := graph.GetNode("1")
 	require.NotNil(t, node)
 	graph.removeNode(1)
 	node = graph.GetNode("1")
 	require.Nil(t, node)
+	// Ensure adjacency maps don't retain stale entries.
+	_, ok := graph.SourcesToDestinations[1]
+	require.False(t, ok, "SourcesToDestinations leaked entry for removed node")
+	_, ok = graph.DestinationsToSources[1]
+	require.False(t, ok, "DestinationsToSources leaked entry for removed node")
 }
```
📜 Review details
📒 Files selected for processing (3)
- pkg/sync/expand/cycle.go (2 hunks)
- pkg/sync/expand/graph_test.go (3 hunks)
- pkg/sync/syncer.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/sync/expand/cycle.go (1)
pkg/sync/expand/graph.go (1)
EntitlementGraph(41-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: go-test (1.23.x, windows-latest)
🔇 Additional comments (6)
pkg/sync/expand/graph_test.go (3)
204-208: LGTM: context plumbed into FixCycles. Call site updated correctly to pass ctx and assert no error.
224-228: LGTM: context-aware FixCycles in the complex cycle test. Looks correct and consistent with the new signature.
251-255: LGTM: context-aware FixCycles in clique testMatches the new API and expectations.
pkg/sync/syncer.go (1)
1376-1379: LGTM: propagate context to FixCycles. Correctly forwards ctx; errors on cancellation will bubble up.
pkg/sync/expand/cycle.go (2)
4-5: LGTM: add context import. Required for the new FixCycles signature.
152-171: Iterative, cancellable FixCycles: good change. The loop plus ctx check avoids recursion blowups and enables cancellation.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/sync/expand/cycle.go (1)
94-103: Bug: wrong map deleted in removeNode.

The second delete should target DestinationsToSources, not SourcesToDestinations (already deleted above). This can leave stale reverse entries and corrupt invariants.

```diff
 // Delete all incoming edges.
 if sources, ok := g.DestinationsToSources[nodeID]; ok {
 	for sourceID, edgeID := range sources {
 		delete(g.SourcesToDestinations[sourceID], nodeID)
 		delete(g.Edges, edgeID)
 	}
 }
-delete(g.SourcesToDestinations, nodeID)
+delete(g.DestinationsToSources, nodeID)
```
🧹 Nitpick comments (11)
pkg/sync/expand/cycle_benchmark_test.go (2)
154-156: Context setup OK; minor benchmark hygiene.

Consider moving ctx creation inside each sub-benchmark to avoid any risk of cross-sub-benchmark interaction, or just use context.Background() since you never cancel. Also add b.ReportAllocs() for allocation visibility.

```diff
 func BenchmarkGetFirstCycle(b *testing.B) {
-	ctx, cancel := context.WithCancel(context.Background())
-	defer cancel()
+	b.ReportAllocs()
+	ctx := context.Background()
```
163-164: Avoid dead-code elimination in benchmarks.

Assign the result to a package-level sink.

```diff
-	_ = g.GetFirstCycle(ctx)
+	benchCycleSink = g.GetFirstCycle(ctx)
```

Add at file scope:

```go
var benchCycleSink []int
```

Also applies to: 173-174
pkg/sync/expand/graph_test.go (1)
146-149: Add a cancellation test. Consider a test that cancels ctx mid-fix to ensure early exit surfaces ctx.Err().
pkg/sync/expand/scc/scc_test.go (2)
10-21: Helpers: mark as test helpers.

Call t.Helper() in makeAdj/normalizeGroups/defaultOpts to improve failure reporting. Note this requires threading *testing.T into the helpers, e.g.:

```diff
-func makeAdj(nodes []int, edges [][2]int) map[int]map[int]int {
+func makeAdj(t *testing.T, nodes []int, edges [][2]int) map[int]map[int]int {
+	t.Helper()
```

Apply similarly to normalizeGroups and defaultOpts.
220-239: Determinism test: nice. Covers ordering; consider also adding a cancellation test (cancelled ctx before the call) to assert timely return.
pkg/sync/expand/cycle.go (3)
32-47: Make cycleDetectionHelper context-aware.

It hardcodes context.Background(); pass a ctx param so benches and callers can cancel.

```diff
-func (g *EntitlementGraph) cycleDetectionHelper(
-	nodeID int,
-) ([]int, bool) {
+func (g *EntitlementGraph) cycleDetectionHelper(ctx context.Context, nodeID int) ([]int, bool) {
 ...
-	groups := scc.CondenseFWBWGroupsFromAdj(context.Background(), adj, scc.DefaultOptions())
+	groups := scc.CondenseFWBWGroupsFromAdj(ctx, adj, scc.DefaultOptions())
```

And update call sites in benchmarks accordingly.
53-68: ComputeCyclicComponents(ctx): OK; consider a deterministic mode for tests.

Optional: set Deterministic=true in the options for stable grouping when tests depend on group ordering.

```diff
-	groups := scc.CondenseFWBWGroupsFromAdj(ctx, adj, scc.DefaultOptions())
+	opts := scc.DefaultOptions()
+	opts.Deterministic = true
+	groups := scc.CondenseFWBWGroupsFromAdj(ctx, adj, opts)
```
130-238: Avoid creating edges that will be deleted immediately.

Filter intra-component edges when wiring to reduce churn.

```diff
-	for destinationID, resourceTypeIDs := range outgoingEdgesToResourceTypeIDs {
+	inCycle := make(map[int]struct{}, len(nodeIDs))
+	for _, id := range nodeIDs {
+		inCycle[id] = struct{}{}
+	}
+	for destinationID, resourceTypeIDs := range outgoingEdgesToResourceTypeIDs {
+		if _, ok := inCycle[destinationID]; ok {
+			continue
+		}
 		g.NextEdgeID++
 		...
 	}
-	for sourceID, resourceTypeIDs := range incomingEdgesToResourceTypeIDs {
+	for sourceID, resourceTypeIDs := range incomingEdgesToResourceTypeIDs {
+		if _, ok := inCycle[sourceID]; ok {
+			continue
+		}
 		g.NextEdgeID++
 		...
 	}
```

pkg/sync/expand/scc/scc.go (3)
503-524: Sequential BFS step has non-atomic bitset writes.

In the sequential BFS step (lines 503-524), the visited bitset is modified without atomic operations. This is safe because the step runs single-threaded, as the comment at line 513 acknowledges, but it creates a maintenance risk if someone later parallelizes the code without understanding the constraint.

Consider using atomic operations consistently, or at least document the safety assumption more prominently:

```diff
 for i := 0; i < len(frontier); i++ {
 	u := frontier[i]
 	rs, re := getRow(u)
 	for p := rs; p < re; p++ {
 		v := getCol(p)
 		if !active.test(v) {
 			continue
 		}
-		// Non-atomic is safe because we are single-threaded on this step
+		// SAFETY: Non-atomic operations are safe here because we're in a
+		// single-threaded context (len(frontier) <= 64 or maxWorkers == 1).
+		// This is a performance optimization to avoid atomic overhead.
 		w := v >> 6
 		mask := uint64(1) << (uint(v) & 63)
 		if (visited.w[w] & mask) == 0 {
 			visited.w[w] |= mask
 			next = append(next, v)
 		}
 	}
 }
```
730-781: Tarjan implementation could overflow the stack on deep recursion.

The recursive Tarjan implementation could cause a stack overflow on graphs with very deep paths. It is only used for small graphs (N <= SmallCutoff), but even small graphs can have deep paths.

Consider converting to an iterative implementation using an explicit stack. Given the SmallCutoff of 8192, the recursive form may be acceptable for most use cases.
47-68: Consider making EnableTrim default to true.

The trimming optimization can significantly improve performance on sparse graphs with many source/sink nodes. Since it has minimal overhead, consider enabling it by default.

```diff
 func DefaultOptions() Options {
 	return Options{
 		MaxWorkers:    runtime.GOMAXPROCS(0),
-		EnableTrim:    false,
+		EnableTrim:    true,
 		SmallCutoff:   8192,
 		Deterministic: false,
 	}
 }
```
📜 Review details
📒 Files selected for processing (7)
- pkg/sync/expand/cycle.go (4 hunks)
- pkg/sync/expand/cycle_benchmark_test.go (2 hunks)
- pkg/sync/expand/graph.go (1 hunks)
- pkg/sync/expand/graph_test.go (7 hunks)
- pkg/sync/expand/scc/scc.go (1 hunks)
- pkg/sync/expand/scc/scc_test.go (1 hunks)
- pkg/sync/syncer.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/sync/syncer.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/sync/expand/scc/scc_test.go (1)
pkg/sync/expand/scc/scc.go (3)
- Options (48-58)
- DefaultOptions (61-68)
- CondenseFWBWGroupsFromAdj (317-366)
pkg/sync/expand/cycle.go (3)
pkg/sync/expand/graph.go (1)
- EntitlementGraph (41-53)

pkg/sync/expand/scc/scc.go (1)
- CondenseFWBWGroupsFromAdj (317-366)

vendor/github.com/deckarep/golang-set/v2/set.go (1)
- NewThreadUnsafeSet (218-224)
🪛 GitHub Check: go-lint
pkg/sync/expand/scc/scc.go
[failure] 515-515:
G115: integer overflow conversion int -> uint (gosec)
[failure] 309-309:
G115: integer overflow conversion int -> uint (gosec)
[failure] 249-249:
G115: integer overflow conversion int -> uint (gosec)
[failure] 232-232:
G115: integer overflow conversion int -> uint (gosec)
[failure] 225-225:
G115: integer overflow conversion int -> uint (gosec)
[failure] 218-218:
G115: integer overflow conversion int -> uint (gosec)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Cursor Bugbot
- GitHub Check: go-test (1.23.x, windows-latest)
🔇 Additional comments (11)
pkg/sync/expand/graph.go (1)
348-372: BFS helper looks good. Nil-on-missing start is fine given current callers; no changes requested.
pkg/sync/expand/graph_test.go (1)
163-164: Context threaded through cycle APIs: LGTM. Good migration to GetFirstCycle(ctx)/FixCycles(ctx).
Also applies to: 192-193, 205-205, 209-209, 224-224, 233-234, 251-251, 260-261
pkg/sync/expand/cycle.go (3)
10-22: GetFirstCycle(ctx): LGTM. Early return on HasNoCycles is a good fast path.
24-30: HasCycles(ctx): LGTM. Straightforward and cancellation-safe via the underlying ComputeCyclicComponents.
104-125: FixCyclesFromComponents(ctx): LGTM. Honors cancellation and sets HasNoCycles at the end.
pkg/sync/expand/scc/scc.go (6)
1-5: Package documentation looks good. The package comment clearly describes the purpose and functionality of the parallel FW-BW SCC condensation implementation.
86-183: CSR construction handles deterministic ordering well. It correctly implements deterministic node ordering when requested, which matters for reproducible results in testing and debugging.
185-312: Bitset implementation has excellent concurrency documentation. The thread-safety guarantees and usage patterns are documented comprehensively, and the atomic operations are correctly implemented with CAS loops.
629-726: trimSingletons is correct and efficient. The singleton-trimming optimization correctly maintains in/out degrees within the active set, updates them as vertices are peeled, and uses slice pools to reduce allocations.
17-45: sync.Pool usage for slice reuse is well implemented. Pooling int and bucket slices reduces GC pressure in the hot path, with correct capacity checks on reuse.
368-466: Parallel recursion in sccFWBW is well structured. It correctly partitions the graph into F\C, B\C, and U regions, parallelizes the independent subproblems with goroutines, and integrates context cancellation properly.
Code under review:

```go
// toAdjacency builds an adjacency map for SCC. If nodesSubset is non-nil, only
// include those nodes (and edges between them). Always include all nodes in the
// subset as keys, even if they have zero outgoing edges.
func (g *EntitlementGraph) toAdjacency(nodesSubset map[int]struct{}) map[int]map[int]int {
	adj := make(map[int]map[int]int, len(g.Nodes))
	include := func(id int) bool {
		if nodesSubset == nil {
			return true
		}
		_, ok := nodesSubset[id]
		return ok
	}

	// Ensure keys for all included nodes.
	for id := range g.Nodes {
		if include(id) {
			adj[id] = make(map[int]int)
		}
	}

	// Add edges where both endpoints are included.
	for src, dsts := range g.SourcesToDestinations {
		if !include(src) {
			continue
		}
		row := adj[src]
		for dst := range dsts {
			if include(dst) {
				row[dst] = 1
			}
		}
	}
	return adj
}
```
🛠️ Refactor suggestion
Build adjacency in O(E_subset) when a subset is provided.
Current loop scans all edges even when nodesSubset is small. Iterate only sources in the subset.
```diff
-	// Add edges where both endpoints are included.
-	for src, dsts := range g.SourcesToDestinations {
-		if !include(src) {
-			continue
-		}
-		row := adj[src]
-		for dst := range dsts {
-			if include(dst) {
-				row[dst] = 1
-			}
-		}
-	}
+	// Add edges where both endpoints are included.
+	if nodesSubset == nil {
+		for src, dsts := range g.SourcesToDestinations {
+			row := adj[src]
+			for dst := range dsts {
+				row[dst] = 1
+			}
+		}
+	} else {
+		for src := range nodesSubset {
+			dsts, ok := g.SourcesToDestinations[src]
+			if !ok {
+				continue
+			}
+			row := adj[src]
+			for dst := range dsts {
+				if _, ok := nodesSubset[dst]; ok {
+					row[dst] = 1
+				}
+			}
+		}
+	}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// toAdjacency builds an adjacency map for SCC. If nodesSubset is non-nil, only
// include those nodes (and edges between them). Always include all nodes in the
// subset as keys, even if they have zero outgoing edges.
func (g *EntitlementGraph) toAdjacency(nodesSubset map[int]struct{}) map[int]map[int]int {
	adj := make(map[int]map[int]int, len(g.Nodes))
	include := func(id int) bool {
		if nodesSubset == nil {
			return true
		}
		_, ok := nodesSubset[id]
		return ok
	}
	// Ensure keys for all included nodes.
	for id := range g.Nodes {
		if include(id) {
			adj[id] = make(map[int]int)
		}
	}
	// Add edges where both endpoints are included.
	if nodesSubset == nil {
		for src, dsts := range g.SourcesToDestinations {
			row := adj[src]
			for dst := range dsts {
				row[dst] = 1
			}
		}
	} else {
		for src := range nodesSubset {
			dsts, ok := g.SourcesToDestinations[src]
			if !ok {
				continue
			}
			row := adj[src]
			for dst := range dsts {
				if _, ok := nodesSubset[dst]; ok {
					row[dst] = 1
				}
			}
		}
	}
	return adj
}
```
Excerpt under review:

```go
// writer to the same bitset.
func (b *bitset) test(i int) bool {
	w := i >> 6
	return (b.w[w] & (1 << (uint(i) & 63))) != 0
```
Address potential integer overflow in bitwise operations.
The static analysis correctly identifies potential integer overflows when converting int to uint in bitset operations. While these are unlikely to cause issues in practice (since bit positions should be within valid bounds), it's worth adding bounds checks or using consistent types.
Apply this diff to add bounds checking and handle potential overflows safely:
```diff
 // test reads a bit without synchronization. Do not call concurrently with any
 // writer to the same bitset.
 func (b *bitset) test(i int) bool {
+	if i < 0 || i >= len(b.w)*64 {
+		return false
+	}
 	w := i >> 6
 	return (b.w[w] & (1 << (uint(i) & 63))) != 0
 }

 // set writes a bit without synchronization. Not safe to race with other
 // accesses (reads or writes) to the same bitset.
 func (b *bitset) set(i int) {
+	if i < 0 || i >= len(b.w)*64 {
+		return
+	}
 	w := i >> 6
 	b.w[w] |= 1 << (uint(i) & 63)
 }

 // testAndSetAtomic atomically sets bit i and returns true if it was already
 // set. Safe for concurrent use by multiple goroutines.
 func (b *bitset) testAndSetAtomic(i int) bool {
+	if i < 0 || i >= len(b.w)*64 {
+		return false
+	}
 	w := i >> 6
 	mask := uint64(1) << (uint(i) & 63)
 	addr := &b.w[w]

 // clearAtomic atomically clears bit i. Safe for concurrent use by multiple
 // goroutines.
 func (b *bitset) clearAtomic(i int) {
+	if i < 0 || i >= len(b.w)*64 {
+		return
+	}
 	w := i >> 6
 	mask := ^(uint64(1) << (uint(i) & 63))
 	addr := &b.w[w]
```

Also applies to: 225-225, 232-232, 249-249, 309-309, 515-515
🧰 Tools
🪛 GitHub Check: go-lint
[failure] 218-218:
G115: integer overflow conversion int -> uint (gosec)
🤖 Prompt for AI Agents
In pkg/sync/expand/scc/scc.go around lines 218, 225, 232, 249, 309 and 515, the
bitset operations convert int to uint and shift without bounds checks which can
overflow; add explicit bounds checking to ensure i is within 0..(len(b.w)*64-1)
(or non-negative and less than the expected bit capacity) before performing the
shift and index; return false or handle error when out of bounds, and use
consistent unsigned index types after validation to perform (b.w[w] & (1 <<
(uint(i) & 63))) safely (apply same pattern to all listed lines).
```go
		// Defensive: assign singleton if anything slipped through
		cid = nextID
		nextID++
		comp[idx] = cid
		if cid >= len(groups) {
			tmp := make([][]int, cid+1)
			copy(tmp, groups)
			groups = tmp
		}
	}
```
🛠️ Refactor suggestion
Defensive singleton handling may mask underlying bugs.
The defensive code at lines 353-362 that assigns singletons for any vertices with comp[idx] < 0 could hide bugs in the SCC algorithm. If the algorithm is working correctly, all vertices should have been assigned to components.
Consider either:
- Removing the defensive code and fixing any underlying issues that cause unassigned vertices
- Logging a warning when this case occurs to help debug the root cause
```diff
 for idx := 0; idx < csr.N; idx++ {
 	cid := comp[idx]
 	if cid < 0 {
-		// Defensive: assign singleton if anything slipped through
-		cid = nextID
-		nextID++
-		comp[idx] = cid
-		if cid >= len(groups) {
-			tmp := make([][]int, cid+1)
-			copy(tmp, groups)
-			groups = tmp
-		}
+		// This should not happen if the algorithm is correct
+		panic(fmt.Sprintf("vertex %d (node %d) was not assigned to any component", idx, csr.IdxToNodeID[idx]))
 	}
 	groups[cid] = append(groups[cid], csr.IdxToNodeID[idx])
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
for idx := 0; idx < csr.N; idx++ {
	cid := comp[idx]
	if cid < 0 {
		// This should not happen if the algorithm is correct
		panic(fmt.Sprintf("vertex %d (node %d) was not assigned to any component", idx, csr.IdxToNodeID[idx]))
	}
	groups[cid] = append(groups[cid], csr.IdxToNodeID[idx])
}
```
Force-pushed 8e14722 to 3725a3e (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (1)
pkg/sync/expand/graph.go (1)
312-346: Build adjacency in O(E_subset); init keys from subset; avoid nil row.
- When nodesSubset != nil, scanning all edges is O(E). Iterate sources in the subset to get O(E_subset).
- Initialize adj capacity and keys from nodesSubset ∩ g.Nodes to reduce work.
- Guard against a possible nil row if an unexpected src lacks a preinitialized entry.
- Optional: pre-size row by out-degree for fewer map grows.
```diff
 func (g *EntitlementGraph) toAdjacency(nodesSubset map[int]struct{}) map[int]map[int]int {
-	adj := make(map[int]map[int]int, len(g.Nodes))
+	capHint := len(g.Nodes)
+	if nodesSubset != nil {
+		capHint = len(nodesSubset)
+	}
+	adj := make(map[int]map[int]int, capHint)
 	include := func(id int) bool {
 		if nodesSubset == nil {
 			return true
 		}
 		_, ok := nodesSubset[id]
 		return ok
 	}
-	// Ensure keys for all included nodes.
-	for id := range g.Nodes {
-		if include(id) {
-			adj[id] = make(map[int]int)
-		}
-	}
+	// Ensure keys for all included nodes.
+	if nodesSubset == nil {
+		for id := range g.Nodes {
+			adj[id] = make(map[int]int)
+		}
+	} else {
+		for id := range nodesSubset {
+			if _, ok := g.Nodes[id]; ok {
+				adj[id] = make(map[int]int)
+			}
+		}
+	}
 	// Add edges where both endpoints are included.
-	for src, dsts := range g.SourcesToDestinations {
-		if !include(src) {
-			continue
-		}
-		row := adj[src]
-		for dst := range dsts {
-			if include(dst) {
-				row[dst] = 1
-			}
-		}
-	}
+	if nodesSubset == nil {
+		for src, dsts := range g.SourcesToDestinations {
+			row := adj[src]
+			if row == nil {
+				row = make(map[int]int, len(dsts))
+				adj[src] = row
+			}
+			for dst := range dsts {
+				row[dst] = 1
+			}
+		}
+	} else {
+		for src := range nodesSubset {
+			dsts, ok := g.SourcesToDestinations[src]
+			if !ok {
+				continue
+			}
+			row := adj[src]
+			if row == nil {
+				row = make(map[int]int, len(dsts))
+				adj[src] = row
+			}
+			for dst := range dsts {
+				if _, ok := nodesSubset[dst]; ok {
+					row[dst] = 1
+				}
+			}
+		}
+	}
 	return adj
 }
```
🧹 Nitpick comments (1)
pkg/sync/expand/graph.go (1)
348-372: BFS micro-opts: no reslicing; consistent empty-set return; clarify comment.
- Avoid queue reslicing to reduce allocator/GC pressure on large graphs.
- Return an empty set (not nil) when start is absent; update comment for callers’ simplicity.
```diff
-// reachableFrom computes the set of node IDs reachable from start over
-// SourcesToDestinations using an iterative BFS.
+// reachableFrom computes the set of node IDs reachable from start over
+// SourcesToDestinations using an iterative BFS. Returns an empty set if start
+// is not present.
 func (g *EntitlementGraph) reachableFrom(start int) map[int]struct{} {
-	if _, ok := g.Nodes[start]; !ok {
-		return nil
-	}
-	visited := make(map[int]struct{}, 16)
-	queue := make([]int, 0, 16)
-	queue = append(queue, start)
-	visited[start] = struct{}{}
-	for len(queue) > 0 {
-		u := queue[0]
-		queue = queue[1:]
+	if _, ok := g.Nodes[start]; !ok {
+		return map[int]struct{}{}
+	}
+	visited := make(map[int]struct{}, 32)
+	queue := make([]int, 0, 32)
+	queue = append(queue, start)
+	visited[start] = struct{}{}
+	for head := 0; head < len(queue); head++ {
+		u := queue[head]
 		if nbrs, ok := g.SourcesToDestinations[u]; ok {
 			for v := range nbrs {
 				if _, seen := visited[v]; seen {
 					continue
 				}
 				visited[v] = struct{}{}
 				queue = append(queue, v)
 			}
 		}
 	}
 	return visited
 }
```
📜 Review details
📒 Files selected for processing (7)
- pkg/sync/expand/cycle.go (4 hunks)
- pkg/sync/expand/cycle_benchmark_test.go (2 hunks)
- pkg/sync/expand/graph.go (1 hunks)
- pkg/sync/expand/graph_test.go (7 hunks)
- pkg/sync/expand/scc/scc.go (1 hunks)
- pkg/sync/expand/scc/scc_test.go (1 hunks)
- pkg/sync/syncer.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- pkg/sync/syncer.go
- pkg/sync/expand/scc/scc_test.go
- pkg/sync/expand/graph_test.go
- pkg/sync/expand/cycle.go
- pkg/sync/expand/cycle_benchmark_test.go
- pkg/sync/expand/scc/scc.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Cursor Bugbot
- GitHub Check: go-test (1.23.x, windows-latest)
Actionable comments posted: 0
🧹 Nitpick comments (7)
pkg/sync/expand/large_scc_benchmark_test.go (7)
13-13: Silence gosec G404 for the deterministic benchmark RNG.

math/rand is appropriate here; the randomness is non-security and seeded for reproducibility. Add a nolint directive to keep CI green.

```diff
-	rng := rand.New(rand.NewSource(seed))
+	rng := rand.New(rand.NewSource(seed)) //nolint:gosec // deterministic, non-crypto RNG for benchmarks
```
23-40: Make embedded cycles consistently length cycleLen and avoid degeneration near N.

The current stride/idx approach can yield truncated cycles and many duplicates when idx approaches N. A modulo-based wrap keeps cycles full and evenly spread.

```diff
-	if cycleLen > 1 && numCycles > 0 {
-		stride := N / (numCycles + 1)
-		idx := 1
-		for c := 0; c < numCycles; c++ {
-			start := idx
-			for k := 0; k < cycleLen-1 && idx < N; k++ {
-				add(idx, idx+1)
-				idx++
-			}
-			if idx <= N {
-				add(idx, start)
-			}
-			idx += stride
-			if idx > N {
-				idx = N
-			}
-		}
-	}
+	if cycleLen > 1 && numCycles > 0 {
+		stride := N / numCycles
+		if stride == 0 {
+			stride = 1
+		}
+		for c := 0; c < numCycles; c++ {
+			start := 1 + (c*stride)%N
+			for k := 0; k < cycleLen; k++ {
+				u := ((start + k - 1) % N) + 1
+				v := ((start + k) % N) + 1
+				add(u, v)
+			}
+		}
+	}
```

Would you like me to add a quick post-generation check that at least one cycle is present?
41-46: Guard against pathological M to prevent an endless loop.

If M exceeds the maximum number of unique non-self edges (N*(N-1)), the fill loop never terminates. Clamp the target or fail fast.

```diff
-	for len(edgeSet) < M {
+	maxEdges := N * (N - 1)
+	target := M
+	if M > maxEdges {
+		target = maxEdges // alternatively: panic or return early
+	}
+	for len(edgeSet) < target {
 		u := 1 + rng.Intn(N)
 		v := 1 + rng.Intn(N)
 		add(u, v)
 	}
```
57-57: Set NextNodeID to the next free ID.

Minor invariant: NextNodeID typically points past the highest allocated node.

```diff
-	NextNodeID: N,
+	NextNodeID: N + 1,
```
101-106: Exclude graph build from timing to measure FixCycles precisely.

Building the graph each iteration skews ns/op and allocs toward setup. Stop the timer during the build if you want to benchmark FixCycles itself.

```diff
-	for i := 0; i < b.N; i++ {
-		g := buildGraphFromEdges(N, edges)
-		if err := g.FixCycles(ctx); err != nil {
-			b.Fatalf("FixCycles error: %v", err)
-		}
-	}
+	for i := 0; i < b.N; i++ {
+		b.StopTimer()
+		g := buildGraphFromEdges(N, edges)
+		b.StartTimer()
+		if err := g.FixCycles(ctx); err != nil {
+			b.Fatalf("FixCycles error: %v", err)
+		}
+	}
```

Note: ReportAllocs includes allocations outside the timer; isolating FixCycles allocations would require cloning a prebuilt graph or an alternate harness.
109-109: Make long runs opt-in via an environment variable instead of a const.

Toggling with BENCH_LONG keeps defaults fast while enabling the big cases in CI or locally when desired. Requires importing os.

```diff
 import (
 	"context"
 	"fmt"
+	"os"
 	"math/rand"
 	"testing"
 )

-const enableLongTests = false
+var enableLongTests = os.Getenv("BENCH_LONG") != ""

 	if !testing.Short() && enableLongTests {
 		cases = append(cases, fullCases...)
 	}
```

Also applies to: 121-123, 3-8
12-12: Address gocritic captLocal: prefer lower-case parameter names.

The linter flags capitalized locals (N/M). Either rename to n/m or suppress at the call sites; renaming keeps the signal-to-noise ratio high.

Example (apply similarly to the other functions):

```diff
-func benchmarkFixCycles(b *testing.B, N, M, cycleLen, numCycles int) {
+func benchmarkFixCycles(b *testing.B, n, m, cycleLen, numCycles int) {

-	edges := generateEdges(N, M, cycleLen, numCycles, 1)
+	edges := generateEdges(n, m, cycleLen, numCycles, 1)

-	g := buildGraphFromEdges(N, edges)
+	g := buildGraphFromEdges(n, edges)
```

Also applies to: 55-55, 95-95
📜 Review details
📒 Files selected for processing (1)
- pkg/sync/expand/large_scc_benchmark_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/sync/expand/large_scc_benchmark_test.go (1)
pkg/sync/expand/graph.go (2)
- EntitlementGraph (41-53)
- Edge (21-28)
🪛 GitHub Check: go-lint
pkg/sync/expand/large_scc_benchmark_test.go
[failure] 13-13:
G404: Use of weak random number generator (math/rand or math/rand/v2 instead of crypto/rand) (gosec)
[failure] 95-95:
captLocal: `N' should not be capitalized (gocritic)
[failure] 55-55:
captLocal: `N' should not be capitalized (gocritic)
[failure] 12-12:
captLocal: `N' should not be capitalized (gocritic)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Cursor Bugbot
- GitHub Check: go-test (1.23.x, windows-latest)
🔇 Additional comments (1)
pkg/sync/expand/large_scc_benchmark_test.go (1)
111-119: Sub-benchmark matrix looks good. Clear sizing, readable names, and a sensible short/full split.
Summary by CodeRabbit
- New Features
- Refactor
- Tests