Adding `@huggingface/transformers` as a direct dependency pulls in `onnxruntime-node` (a native binary, ~100MB+), `onnxruntime-web`, and `sharp` as non-optional dependencies. This significantly increases the Docker image size and install time, even though the ONNX runtime is never used (only the tokenizer is needed).
Consider using `@huggingface/jinja` plus a lighter tokenizer-only package, or using `tokenizers` (the Rust-backed package) instead. Alternatively, since only `BAAI/bge-m3` tokenization is needed, you could download and bundle just the `tokenizer.json` file and use a minimal SentencePiece/BPE tokenizer implementation, avoiding the entire transformers dependency tree.
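A minimal sketch of the bundle-the-tokenizer approach, shown here with the Python `tokenizers` package (the same Rust core backs the Node bindings). In the real project the bundled `tokenizer.json` would come from the `BAAI/bge-m3` repo on the Hub; a toy BPE model is trained here only to keep the example self-contained:

```python
# Sketch: tokenizer.json would normally be the file downloaded from the
# BAAI/bge-m3 Hub repo; a toy BPE model is trained here so the example
# runs on its own, without any network access or model weights.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Build and train a tiny BPE tokenizer (stand-in for the real one).
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = BpeTrainer(special_tokens=["[UNK]"], vocab_size=100)
tokenizer.train_from_iterator(["hello world", "hello tokenizer"], trainer)

# This single JSON file is what would be bundled with the application.
tokenizer.save("tokenizer.json")

# At runtime, loading it needs only the lightweight tokenizers package --
# no ONNX runtime, no sharp, no transformers dependency tree.
tok = Tokenizer.from_file("tokenizer.json")
encoding = tok.encode("hello world")
```

The same two calls (`fromFile`/`from_file` and `encode`) exist in the Node bindings, so the runtime footprint shrinks to the tokenizer file plus one small package.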
Originally posted by @Copilot in #672 (comment)