Commit b59f98f
use fancy-regex instead of onig as tokenizers regex library (#172)
The version of Oniguruma used in `onig_sys` doesn't build on GCC 15 and
the oniguruma project itself got archived last week, so this PR switches
tokenizers to the fancy-regex backend.
`fancy-regex` also requires flipping on the `unstable_wasm` feature
until huggingface/tokenizers#1772 lands. That flag doesn't have any ill
effects, though, since everything WASM-related downstream is behind
`target_arch` checks.
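As a rough illustration of why a Cargo feature with "wasm" in its name is harmless on native builds, here is a minimal sketch of `target_arch` gating. The function names are hypothetical and are not taken from tokenizers or from this repository; the point is only that the compiler selects the code path from the target, not from the enabled features.

```rust
/// Hypothetical helper: reports which code path was compiled in.
/// The choice depends only on the compilation target, so enabling a
/// Cargo feature such as `unstable_wasm` cannot switch it.
#[cfg(target_arch = "wasm32")]
pub fn regex_glue() -> &'static str {
    // Compiled only when targeting wasm32.
    "wasm"
}

#[cfg(not(target_arch = "wasm32"))]
pub fn regex_glue() -> &'static str {
    // Compiled on every other target (x86_64, aarch64, ...).
    "native"
}

/// `cfg!` expands to a compile-time boolean for the current target.
pub fn is_wasm_target() -> bool {
    cfg!(target_arch = "wasm32")
}
```

On a Linux x86_64 or aarch64 build, only the second `regex_glue` exists in the binary, regardless of which features are enabled on the dependency.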
**tl;dr**: This fixes builds on Linux distros with newer GCC versions
like Arch Linux and Fedora.
1 parent c89e386 commit b59f98f
2 files changed, +14 -37 lines changed