TRACE Model Fine-Tuning Evaluation Issue: Reported vs. Observed Metrics Mismatch #8

@Rorwey

Hello, I evaluated the fine-tuned model provided by the authors (MODEL_DIR="model/trace-ft-youcook2"), and my results are much closer to the TRACE-UNI baseline than to the values reported in the paper. Specifically, my metrics are SODA_c_2: 2.3, F1_Score: 18.5, and CIDER: 7.5, whereas the paper reports SODA_c_2: 6.7, F1_Score: 31.8, and CIDER: 35.5.
I ran the evaluation script (trace/eval/eval.sh) as instructed. Are there any specific parameters or settings required for evaluating the fine-tuned model that I might have overlooked?
Any guidance would be greatly appreciated.
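
For reference, here is a minimal sketch that quantifies the gap between the two sets of numbers above. The metric values are copied from this issue; the helper name `metric_gaps` is only illustrative and is not part of the TRACE codebase.

```python
# Metrics copied from this issue: observed (my run) vs. reported (paper).
observed = {"SODA_c_2": 2.3, "F1_Score": 18.5, "CIDER": 7.5}
reported = {"SODA_c_2": 6.7, "F1_Score": 31.8, "CIDER": 35.5}


def metric_gaps(observed, reported):
    """Return {metric: (reported - observed, observed / reported)}, rounded.

    Hypothetical helper for illustration only, not a TRACE function.
    """
    return {
        name: (
            round(reported[name] - observed[name], 1),
            round(observed[name] / reported[name], 2),
        )
        for name in reported
    }


gaps = metric_gaps(observed, reported)
for name, (gap, ratio) in gaps.items():
    print(
        f"{name}: observed {observed[name]}, reported {reported[name]}, "
        f"gap {gap}, observed/reported {ratio}"
    )
```

The shortfall is not uniform across metrics (CIDER drops the most, F1_Score the least), which may help narrow down where the evaluation setup differs.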