Update prompt evaluation results with latest test data
- Update HYBRID_DESIGN scores: average improved from 86% to 89%
- Update DEFAULT scores: average improved from 82% to 85%
- Update SEQUENTIAL scores: average improved from 84% to 87%
- HYBRID_DESIGN maintains lead with highest average (89%)
- SEQUENTIAL shows strong performance in Bug Identification (94%)
- All prompts show improved performance across scenarios
The HYBRID_DESIGN prompt marginally demonstrated both the highest average solution quality (86%) and the most consistent performance across all scenarios, with no scores below 80%. It also prodouced the most thoughts. The `src/server.ts` file has been updated to use this optimal prompt design.
The HYBRID_DESIGN prompt demonstrates the highest average solution quality (89%) and the most consistent performance across all scenarios, with no scores below 80%. It also produces the most thoughts. The `src/server.ts` file has been updated to use this optimal prompt design.
Personally, I think the biggest improvement was adding this to the end of the prompt: "✍️ End each thought by asking: 'What am I missing or need to reconsider?'"
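
As a rough illustration of that change, here is a minimal sketch of appending the reflection line to the end of a prompt. The names `REFLECTION_SUFFIX` and `withReflection` and the sample base prompt are hypothetical and are not taken from `src/server.ts`; the actual prompt may be assembled differently.

```typescript
// Hypothetical constant: the reflection line quoted above.
const REFLECTION_SUFFIX =
  '✍️ End each thought by asking: "What am I missing or need to reconsider?"';

// Hypothetical helper: append the reflection line to a base prompt,
// skipping the append if the prompt already ends with it.
function withReflection(basePrompt: string): string {
  const trimmed = basePrompt.trimEnd();
  if (trimmed.endsWith(REFLECTION_SUFFIX)) {
    return trimmed;
  }
  return `${trimmed}\n\n${REFLECTION_SUFFIX}`;
}

// Illustrative base prompt only; not the real HYBRID_DESIGN text.
const prompt = withReflection("Break the problem into sequential thoughts.");
console.log(prompt);
```

Making the helper idempotent means the line is added only once even if a prompt variant already includes it.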