Releases: datalab-to/surya
New Layout Model
Releasing a new layout model
- Moving to a new architecture for layout, trained from scratch, with significant improvements across many domains
What's Changed
- feat: new unified tokenizer by @VikParuchuri in #442
- Layout Model Release by @tarun-menta in #461
- Dev by @tarun-menta in #463
Full Changelog: v0.16.7...v0.17.0
Move flash imports
What's Changed
- Move flash attention funcs by @VikParuchuri in #457
Full Changelog: v0.16.6...v0.16.7
Enable setting attention method
What's Changed
- Get rid of attention method checks by @VikParuchuri in #455
- Enable setting attention method by @VikParuchuri in #456
Full Changelog: v0.16.5...v0.16.6
Minor init update
Speed improvements; SDPA fix
- Fix attention mask for SDPA
- Improve speed by 20-30%, with larger gains when used with marker
- Add some more model loader options
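
These notes do not say what was wrong with the SDPA mask. As general background only, the sketch below shows what an attention mask does in scaled dot-product attention, using the boolean convention of PyTorch's `scaled_dot_product_attention` (`True` means the position may be attended to). It is plain Python for a single query vector; the function name `masked_attention` and its arguments are hypothetical and are not part of surya.

```python
import math


def masked_attention(q, k, v, mask):
    """Scaled dot-product attention for one query (illustrative, not surya code).

    q:    query vector, list of d floats
    k:    list of key vectors, each of length d
    v:    list of value vectors, one per key
    mask: list of bools, True = this position may be attended to
    """
    d = len(q)
    # Masked positions get a score of -inf so softmax assigns them zero weight.
    scores = [
        sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
        if keep
        else float("-inf")
        for key, keep in zip(k, mask)
    ]
    top = max(scores)
    if top == float("-inf"):
        raise ValueError("every position is masked")
    # Softmax with max-subtraction for numerical stability.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Output is the attention-weighted sum of the value vectors.
    out = [sum(w * val[j] for w, val in zip(weights, v)) for j in range(len(v[0]))]
    return out, weights


if __name__ == "__main__":
    q = [1.0, 0.0]
    k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    v = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    # The third position is masked out, so its value never reaches the output.
    out, weights = masked_attention(q, k, v, [True, True, False])
    print(weights)
    print(out)
```

A mask with the opposite polarity, or one applied to the wrong axis, lets padded or future positions contribute to the output, which is the general kind of error a mask fix addresses.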
What's Changed
- Backport by @VikParuchuri in #449
- Foundation Model Performance Improvements by @tarun-menta in #451
- Dev by @VikParuchuri in #452
- Bump version by @VikParuchuri in #453
Full Changelog: v0.16.3...v0.16.4
v0.16.3
Revert checkpoint
Full Changelog: v0.16.2...v0.16.3
Multi-token inference + Better Math/Tables
What's Changed
- Update README by @u-ashish in #447
- feat: multi-token decoding by @zanussbaum in #446
- Dev by @zanussbaum in #448
Full Changelog: v0.16.1...v0.16.2
Transformers fix
Hotfix to be compatible with transformers 4.56.0.
What's Changed
- Remove unused import by @VikParuchuri in #444
Full Changelog: v0.16.0...v0.16.1
New OCR Model
General OCR Improvements
- Update to a better OCR model, with a smaller vocabulary and a more performant vision encoder.
- Improved math performance, with fewer spurious math tags where none are needed
Misc
- Fix table model by pinning to CPU on MPS devices
What's Changed
- update commercial license description in README and update LICENSE to… by @sandy0kwon in #434
- Dev by @VikParuchuri in #436
- Model Update: New Tokenizer and Encoder by @tarun-menta in #440
- Dev by @tarun-menta in #441
New Contributors
- @sandy0kwon made their first contribution in #434
Full Changelog: v0.15.4...v0.16.0
Improved Math OCR Model
Math OCR improvements
- Bump the OCR model to a version that improves general performance and significantly improves math performance
What's Changed
- Improve model performance on math by @tarun-menta in #429
Full Changelog: v0.15.3...v0.15.4