@benjibc (Contributor) commented Nov 14, 2025

Note

Adds eval_protocol benchmarks, models, rewards, and pytest rollout processors with tests, plus a Vite build source map.

  • Eval Protocol:
    • Add eval_protocol/benchmarks/* (e.g., test_aime25.py, test_gpqa.py, test_livebench_data_analysis.py).
    • Add eval_protocol/models.py.
    • Add pytest rollout processors in eval_protocol/pytest/*.
    • Add reward implementations in eval_protocol/rewards/* (e.g., accuracy.py, json_schema.py, language_consistency.py, repetition.py, tag_count.py).
  • Tests:
    • Add tests/pytest/test_single_turn_rollout_processor.py.
  • Frontend/Build:
    • Add vite-app/dist/assets/index-CuQbfdPD.js.map (bundled asset).

Written by Cursor Bugbot for commit 8a5130a. This will update automatically on new commits.
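
For readers unfamiliar with what a reward module such as tag_count.py might compute, here is a minimal sketch of a tag-counting reward. It is an illustration only: the function name `tag_count_score`, its signature, and the default tag names are hypothetical and are not taken from this PR or from the eval_protocol API. The assumed idea is to score a response by the fraction of expected tags that appear exactly once in both opening and closing form.

```python
from typing import Sequence


def tag_count_score(text: str, tags: Sequence[str] = ("think", "answer")) -> float:
    """Hypothetical reward: fraction of tags that are opened and closed exactly once.

    This is an assumed scoring rule for illustration, not the PR's implementation.
    """
    if not tags:
        # No tags to check, so there is nothing to reward.
        return 0.0

    well_formed = 0
    for tag in tags:
        opens = text.count(f"<{tag}>")    # occurrences of the opening tag
        closes = text.count(f"</{tag}>")  # occurrences of the closing tag
        if opens == 1 and closes == 1:
            well_formed += 1

    return well_formed / len(tags)


if __name__ == "__main__":
    sample = "<think>2 + 2 = 4</think><answer>4</answer>"
    print(tag_count_score(sample))
```

A score in [0, 1] like this composes easily with other rewards (for example by weighted averaging), which is one plausible reason to keep each reward in its own module.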

@benjibc force-pushed the vision_food_reasoning branch from 74b5326 to 7ee0f25 on December 2, 2025 07:26
@benjibc force-pushed the vision_food_reasoning branch from 7ee0f25 to f9223ac on December 2, 2025 07:32
@benjibc force-pushed the vision_food_reasoning branch from f9223ac to 4c37b8c on December 3, 2025 01:26
@benjibc force-pushed the vision_food_reasoning branch from 4c37b8c to 391d8b7 on December 3, 2025 01:40
@benjibc merged commit 01bc8e9 into main on Dec 3, 2025
9 checks passed
@benjibc deleted the vision_food_reasoning branch on December 3, 2025 05:46
2 participants