Make cross-review default for multi-agent; add --no-cross-review flag #37

Merged
chefsale merged 10 commits into master from feat/cross-review-default
Feb 11, 2026
@chefsale chefsale commented Feb 11, 2026

Note

Medium Risk
Changes review output semantics by filtering/reordering findings via a new cross-review pass and alters when REQUEST_CHANGES is posted, which can affect CI/merge gating and reviewer signal.

Overview
Multi-agent reviews now default to 3 agents (CLI, webhook env defaults, and the ai-review GitHub Actions workflow), and a new cross-review second round is enabled by default for --agents>=2.

This adds --no-cross-review and --min-agreement controls (plus webhook env vars like ENABLE_CROSS_REVIEW/MIN_VALIDATION_AGREEMENT) and implements logic to have agents validate and rank consolidated findings, dropping low-agreement items and reordering by consensus.
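The filter-and-reorder step described above might look roughly like the sketch below. It is an illustration only, not the code merged in this PR: the function name `filter_and_rank`, the `(finding_id, valid, rank)` vote shape, the 2/3 default threshold, and the choice to keep unassessed findings at the end are all assumptions.

```python
from collections import defaultdict


def filter_and_rank(finding_ids, votes, min_agreement=2 / 3):
    """Drop findings with low agreement and order the rest by consensus.

    finding_ids: ids of the consolidated round-1 findings, in original order.
    votes: iterable of (finding_id, valid, rank) tuples, one per agent assessment.
    min_agreement: fraction of votes that must mark a finding valid (assumed default).
    """
    by_id = defaultdict(list)
    for fid, valid, rank in votes:
        by_id[fid].append((bool(valid), rank))

    kept = []
    for position, fid in enumerate(finding_ids):
        cast = by_id.get(fid, [])
        if not cast:
            # Assumption: a finding nobody assessed is kept and sorted last.
            kept.append((float("inf"), position, fid))
            continue
        agreement = sum(1 for valid, _ in cast if valid) / len(cast)
        if agreement < min_agreement:
            continue  # low agreement: drop the finding
        avg_rank = sum(rank for _, rank in cast) / len(cast)
        kept.append((avg_rank, position, fid))

    # Lower average rank means higher priority; original position breaks ties.
    return [fid for _, _, fid in sorted(kept)]
```

With three agents and the assumed 2/3 threshold, a finding survives when at least two agents mark it valid.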

GitHub review posting behavior is softened to only use REQUEST_CHANGES for critical findings; warnings/suggestions/nitpicks now result in COMMENT so authors aren’t blocked. Tests are expanded to cover the new cross-review parsing/filtering and the updated review-action rules.
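The softened posting rule can be pictured with a small sketch. The function name `choose_review_action` and the severity strings are hypothetical; only the `REQUEST_CHANGES` and `COMMENT` outcomes come from the description above. The behavior for an empty list of findings is not stated in this PR summary, so the sketch simply falls back to `COMMENT` there.

```python
def choose_review_action(severities):
    """Map the severities of the surviving findings to a GitHub review event.

    Only a critical finding blocks the author; everything else is a comment.
    """
    normalized = {str(s).strip().lower() for s in severities}
    if "critical" in normalized:
        return "REQUEST_CHANGES"
    # Warnings, suggestions, and nitpicks (and, by assumption, no findings at all).
    return "COMMENT"
```

Under this rule a review with one warning and several nitpicks no longer blocks a merge, while a single critical finding still does.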

Written by Cursor Bugbot for commit 8d5a5c6.

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 2 agents | Quality score: 33% | Review time: 148.4s

🟡 1 warnings, 💡 2 suggestions, 📝 1 nitpicks. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-e3bdba4a

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 33% | Review time: 186.6s

✅ 3 fixed | 🆕 4 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-58d0e720

@xilosada
Reviewed by 2 agents | Quality score: 33% | Review time: 148.4s

I think the Quality score isn't working at all

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 33% | Review time: 1037.6s

✅ 7 fixed | 🆕 6 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-0c42e713

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 33% | Review time: 226.3s

✅ 13 fixed | 🆕 6 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-9aed8352

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

…s; test finding_id alias and no false re-ranked

Co-authored-by: Cursor <[email protected]>

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 33% | Review time: 461.1s

✅ 19 fixed | 🆕 1 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-6bbd4a7a

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

        return assessments, summary
    except json.JSONDecodeError as e:
        logger.warning(f"Failed to parse cross-review JSON: {e}")
        return [], ""

Duplicated JSON parsing logic across two functions

Low Severity

parse_cross_review_response nearly duplicates parse_review_response — both share identical markdown-fence extraction, regex-based JSON matching, and try/except parsing structure. The only differences are the regex key ("assessments" vs "findings"), the extracted data key, and default return values. A shared helper like _extract_json_from_llm_response(content, key) would eliminate ~14 lines of duplicated logic and ensure any future parsing fix applies to both paths. The same pattern also exists in CursorClient._parse_json_response.

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 33% | Review time: 241.3s

✅ 20 fixed | 🆕 2 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-3cf5e9fd

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

        rank = max(1, int(rank))
    else:
        rank = 99
    id_to_votes[fid].append((valid, rank))

String "false" for valid treated as truthy

Medium Severity

The rank field gets defensive type checking (isinstance guard), but the valid field at line 310 does not. If an LLM returns "valid": "false" as a JSON string instead of a boolean, json.loads produces the Python string "false", which is truthy. The vote-counting expression sum(1 for v, _ in votes if v) then counts it as a valid vote, completely inverting the agent's intended assessment. This silently prevents cross-review from ever dropping that finding.
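A defensive coercion along these lines would avoid the inverted vote. This is a sketch only: `coerce_valid` is a hypothetical name, and treating unknown or missing values as "not valid" is an assumption rather than the behavior of the merged fix.

```python
def coerce_valid(value):
    """Interpret an LLM-provided `valid` field as a strict boolean."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        # "false", "False", " no " and similar must not count as a valid vote.
        return value.strip().lower() in {"true", "yes", "1"}
    if isinstance(value, (int, float)):
        return value != 0
    return False  # assumption: None or unexpected types count as "not valid"


# The string "false" is truthy in Python, which is the bug described above.
votes = [(coerce_valid("false"), 1), (coerce_valid(True), 2)]
valid_votes = sum(1 for v, _ in votes if v)
```

Here `valid_votes` is 1, whereas counting the raw string `"false"` would have produced 2.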

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 33% | Review time: 331.2s

✅ 19 fixed | 🆕 1 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-0b77fa96

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

- Recompute review_quality_score in apply_cross_review from valid_ratio
  (avg_valid_ratio * agent_factor) so displayed score reflects post-filter state
- Coerce LLM 'valid' field to bool (string 'false' -> False) to avoid
  incorrect votes when JSON returns string instead of boolean
- Add tests: quality_score recalculation, valid string coercion

Co-authored-by: Cursor <[email protected]>

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 89% | Review time: 331.9s

✅ 19 fixed | 🆕 3 new. See inline comments.


🤖 Generated by MeroReviewer | Review ID: review-a2b6acd5

@xilosada xilosada left a comment

🤖 MeroReviewer

Reviewed by 3 agents | Quality score: 95% | Review time: 288.1s

✅ Ready to Merge

All previously identified issues have been addressed!


✅ Fixed Issues

The following issues from previous reviews have been addressed:
  1. Cross-review agents run sequentially, not in parallel (src/ai_reviewer/review.py:635)
  2. min_validation_agreement is hardcoded; consider exposing as CLI flag (src/ai_reviewer/cli.py:172)
  3. Nit: Summary always appends cross-review note even if nothing changed (src/ai_reviewer/review.py:348)
  4. New cross-review functionality lacks test coverage (src/ai_reviewer/review.py:194)
  5. Diff truncation may cut mid-line or mid-file (src/ai_reviewer/review.py:232)
  6. Nit: min-agreement behavior with 2 agents may surprise users (src/ai_reviewer/cli.py:67)
  7. False positive for 'order_changed' when findings are dropped (src/ai_reviewer/review.py:340)
  8. Unvalidated environment variable parsing can cause service denial (src/ai_reviewer/github/webhook.py:157)
  9. Test does not catch the order_changed bug (tests/test_review.py:593)
  10. Cross-review runs all agents regardless of round-1 failures (src/ai_reviewer/review.py:765)
  11. Findings with no votes get rank 99, silently pushed to end (src/ai_reviewer/review.py:316)
  12. Missing test for finding_id key alias in assessments (tests/test_review.py:470)
  13. Silent cross-review agent failures reduce validation quality (src/ai_reviewer/review.py:308)
  14. Greedy regex may over-match JSON content (src/ai_reviewer/review.py:270)
  15. Diff truncation could leave incomplete diff hunk header (src/ai_reviewer/review.py:232)
  16. DRY: JSON extraction logic duplicated between parse functions (src/ai_reviewer/review.py:262)
  17. Nit: Undocumented behavior for findings with zero votes (src/ai_reviewer/review.py:299)
  18. Greedy regex may capture extra content in edge cases (src/ai_reviewer/review.py:275)
  19. Consider logging when findings exceed cross-review limit (src/ai_reviewer/review.py:197)
  20. Consider documenting the cross-review cost tradeoff (src/ai_reviewer/review.py:267)
  21. Nit: Module-level constant mutation in test (tests/test_review.py:601)
  22. Nit: Greedy regex could over-match with multiple JSON objects (src/ai_reviewer/review.py:276)
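
As an illustration of item 8 in the list above, environment variables can be parsed so that a malformed value falls back to a default instead of crashing the webhook at startup. The helper names `env_float` and `env_bool`, the 0 to 1 range for `MIN_VALIDATION_AGREEMENT`, and the default values are assumptions for this sketch, not the code in `webhook.py`.

```python
import logging
import os

logger = logging.getLogger(__name__)


def env_float(name, default, low=0.0, high=1.0):
    """Read a float from the environment, falling back to `default` on bad input."""
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        logger.warning("Ignoring non-numeric %s=%r; using %s", name, raw, default)
        return default
    if not (low <= value <= high):  # this comparison also rejects NaN
        logger.warning("Ignoring out-of-range %s=%r; using %s", name, raw, default)
        return default
    return value


def env_bool(name, default):
    """Read a boolean flag from the environment; unknown spellings use `default`."""
    raw = os.environ.get(name, "").strip().lower()
    if raw in {"1", "true", "yes", "on"}:
        return True
    if raw in {"0", "false", "no", "off"}:
        return False
    return default


enable_cross_review = env_bool("ENABLE_CROSS_REVIEW", True)
min_agreement = env_float("MIN_VALIDATION_AGREEMENT", 0.67)
```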

🤖 Generated by MeroReviewer | Review ID: review-525ad5bf

@chefsale chefsale merged commit dea5afe into master Feb 11, 2026
2 checks passed