[Cross-entropy-loss] return mean token accuracy metric with CE loss #910

kashif · 2025-10-16T19:47:03Z

Summary

Returns the mean token accuracy metric alongside the cross-entropy loss, computed without materializing the full logits.

https://x.com/jeremyphoward/status/1703246293802586155
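To make the idea in the summary concrete, here is a minimal pure-Python sketch of computing mean cross-entropy and mean token accuracy while only ever holding one chunk of logits at a time. It is not the Liger Triton kernel: the function name `chunked_ce_and_accuracy`, the `chunk_size` parameter, and the use of `-100` as the ignored label are assumptions made for illustration.

```python
import math

IGNORE_INDEX = -100  # assumption: labels with this value are skipped


def chunked_ce_and_accuracy(hidden, weight, targets, chunk_size=2):
    """Return (mean CE loss, mean token accuracy) over non-ignored tokens.

    hidden:  list of token vectors, shape (tokens, hidden_dim)
    weight:  output projection, shape (vocab, hidden_dim)
    targets: list of label ids, one per token
    """
    total_loss = 0.0
    correct = 0
    counted = 0
    for start in range(0, len(hidden), chunk_size):
        h_chunk = hidden[start:start + chunk_size]
        t_chunk = targets[start:start + chunk_size]
        # Logits exist only for this chunk: shape (chunk, vocab).
        logits = [
            [sum(h_i * w_i for h_i, w_i in zip(h, w_row)) for w_row in weight]
            for h in h_chunk
        ]
        for row, target in zip(logits, t_chunk):
            if target == IGNORE_INDEX:
                continue
            row_max = max(row)
            # Numerically stable log-sum-exp.
            lse = row_max + math.log(sum(math.exp(x - row_max) for x in row))
            total_loss += lse - row[target]
            # Argmax prediction (first index wins on ties).
            correct += int(row.index(row_max) == target)
            counted += 1
    if counted == 0:
        return 0.0, 0.0
    return total_loss / counted, correct / counted
```

Because the accuracy is accumulated in the same pass that produces the loss, the result does not depend on the chunk size, and the full (tokens, vocab) logits matrix is never stored.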

Testing Done

Hardware Type:
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

src/liger_kernel/transformers/functional.py

src/liger_kernel/transformers/model/falcon_h1.py

kashif · 2025-10-20T09:57:59Z

@vaibhavjindal would you be able to kindly review?

kashif · 2025-10-28T11:15:13Z

@shimizust I believe this will be a breaking change, BTW

vaibhavjindal · 2025-11-03T21:13:23Z

@kashif could you please elaborate on how it will be a breaking change? Will it break the integration with transformers or trl?

kashif · 2025-11-03T21:17:16Z

Yes, if someone is using the raw functions in their own library: those functions now return one more value. On the HF side, though, this PR takes care of it.
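For callers outside the HF integration, one defensive pattern is to normalize whatever the loss function returns before using it. The sketch below is an assumption-laden illustration, not Liger's API: `CrossEntropyOutputSketch` is a hypothetical stand-in for a structured result, its field names are guesses, and `unpack_loss_result` is a helper invented here.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class CrossEntropyOutputSketch:
    """Hypothetical structured result; field names are assumptions."""
    loss: float
    token_accuracy: Optional[float] = None


def unpack_loss_result(result: Any) -> Tuple[Any, Optional[Any]]:
    """Return (loss, token_accuracy) from several possible return shapes."""
    # Structured output: read fields by name, tolerate a missing accuracy.
    if hasattr(result, "loss"):
        return result.loss, getattr(result, "token_accuracy", None)
    # Plain tuple: only the first element is treated as the loss; the
    # meaning of the remaining elements is version-specific, so ignore them.
    if isinstance(result, tuple):
        return result[0], None
    # Bare value: the loss itself.
    return result, None
```

Reading fields by name rather than by position means that a later addition of another return value would not shift the meaning of existing ones.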

kashif · 2025-11-03T21:18:36Z

@vaibhavjindal see here https://github.com/linkedin/Liger-Kernel/pull/910/files#diff-7654a885e261ec4986c229d953681635ab4c8761c89342d2dde604f169783b35L350

vaibhavjindal · 2025-11-03T21:35:06Z

@kashif got it. So if I understand correctly, it will make sure that Liger remains compatible with newer versions from HF. However, I just want to confirm: will it break Liger support with older transformers/trl versions?

kashif · 2025-11-03T21:42:28Z

No, I believe my changes here will work with older versions of HF. I just meant non-HF frameworks.

kashif · 2025-11-03T21:43:34Z

TRL relies on the HF integration for the CE loss, so in TRL I will just pin to the Liger version that has these changes.

kashif · 2025-11-05T16:02:04Z

@vaibhavjindal let me fix up the new qwen3-vl model to update its API

kashif · 2025-11-05T21:02:48Z

@vaibhavjindal all good from my side

vaibhavjindal · 2025-11-05T21:04:41Z

> @vaibhavjindal all good from my side

Thanks a lot! I will do some final checks on correctness and benchmarks and will try to get it merged soon.

kashif · 2025-11-05T21:07:23Z

Thank you so much. Also see here: huggingface/trl#4302 (comment)

kashif · 2025-11-06T08:01:27Z

thanks @vaibhavjindal for the typo fix and making it more robust!

kashif added 10 commits October 16, 2025 19:46

add return_token_accuracy flag to fused_linear_cross_entropy

a254769

rename to token_accuracy

b670b7d

return token_accuracy in transformer models

e872bf4

formatting

d11b24d

add missing output class

002c0ec

typos

d67c511

more typos

3a4a883

added test_correctness_with_token_accuracy

1e6da16

formatting

e9d0954

consistency

038035d

albertvillanova reviewed Oct 17, 2025

src/liger_kernel/transformers/functional.py (outdated, resolved)

albertvillanova reviewed Oct 17, 2025

src/liger_kernel/transformers/functional.py (outdated, resolved)

albertvillanova reviewed Oct 17, 2025

src/liger_kernel/transformers/model/falcon_h1.py (outdated, resolved)

kashif mentioned this pull request Oct 18, 2025

[SFT] Log mean token accuracy from Liger kernel huggingface/trl#4302

Open

5 tasks

kashif added 3 commits October 20, 2025 11:46

use CrossEntropyOutput

2212623

Merge branch 'main' into mean_token_accuracy

33a999b

update qwen3 next

a50e03e

kashif and others added 5 commits October 20, 2025 12:04

formatting

338e70a

add missing return_dict

d1d9f52

Merge branch 'main' into mean_token_accuracy

c5857fd

Merge branch 'main' into mean_token_accuracy

ddfdb0b

Merge branch 'main' into mean_token_accuracy

f268c27

shimizust assigned vaibhavjindal Oct 28, 2025

kashif changed the title ~~[Cross-entropy-loss] add return_token_accuracy flag to fused_linear_cross_entropy~~ [Cross-entropy-loss] return mean token accuracy metric with CE loss Nov 1, 2025

Merge branch 'main' into mean_token_accuracy

704c3b4

kashif and others added 5 commits November 5, 2025 17:02

Merge branch 'main' into mean_token_accuracy

c6c2d27

checktyle fixes

181b11f

Merge branch 'main' into mean_token_accuracy

a06c5db

fix qwen3_vl

0069dcf

checkstyle

3ef06ee

vaibhavjindal added 3 commits November 5, 2025 14:49

Merge branch 'main' into mean_token_accuracy

f855c29

fix circular import

dd0790c

fix output classes for different transformers versions

d20e8b6

vaibhavjindal approved these changes Nov 5, 2025

vaibhavjindal merged commit 7dd8ecc into linkedin:main Nov 5, 2025
3 of 7 checks passed


4 participants
