
feat: add nomic-embed-text-v2-moe via Candle backend#228

Merged
Anush008 merged 3 commits into Anush008:main from samvallad33:feat/nomic-v2-moe
Feb 19, 2026

Conversation

@samvallad33
Contributor

Summary

Adds support for nomic-embed-text-v2-moe, the first general-purpose Mixture-of-Experts embedding model (475M total / 305M active parameters, 8 experts with top-2 routing). It outperforms nomic-embed-text-v1.5 on both BEIR and MIRACL while supporting ~100 languages.

No ONNX export exists for this model because MoE dynamic routing cannot be cleanly traced/exported. This implements the full NomicBert+MoE architecture in candle-nn, following the same standalone pattern established by Qwen3.

Architecture

  • Post-norm NomicBert encoder (12 layers)
  • Alternating dense/MoE FFN layers — megablocks stacked-weight convention
  • 8 experts, top-2 routing per MoE layer
  • Combined Wqkv projection with RoPE (non-interleaved, fraction=1.0)
  • Bidirectional attention (encoder, no causal mask)
  • Mean pooling + L2 normalization
  • XLMRoberta tokenizer (250k vocab, 100+ languages)
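To make the routing bullet concrete: each token's router logits select the 2 highest-scoring of the 8 experts, and the gate weights are renormalized over just those two. A minimal, dependency-free sketch of that idea (the real candle implementation operates on batched tensors; this scalar version is illustrative only):

```rust
/// Pick the top-k expert indices for one token and return
/// softmax-renormalized gate weights over just those experts.
fn top_k_route(router_logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut idx: Vec<usize> = (0..router_logits.len()).collect();
    // total_cmp is NaN-safe, unlike partial_cmp().unwrap()
    idx.sort_by(|&a, &b| router_logits[b].total_cmp(&router_logits[a]));
    idx.truncate(k);
    // Softmax over the selected logits only (max-subtracted for stability)
    let max = router_logits[idx[0]];
    let exps: Vec<f32> = idx.iter().map(|&i| (router_logits[i] - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    idx.into_iter().zip(exps).map(|(i, e)| (i, e / sum)).collect()
}

fn main() {
    // 8 experts, top-2 routing as in nomic-embed-text-v2-moe
    let logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.2, 0.0, 0.3];
    let routes = top_k_route(&logits, 2);
    assert_eq!(routes[0].0, 1); // highest logit wins
    assert_eq!(routes[1].0, 4); // second-highest is the other active expert
}
```

The token's FFN output is then the gate-weighted sum of the two selected experts' outputs; the six unselected experts do no work, which is where the 305M-active vs. 475M-total parameter gap comes from.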

Integration

  • Feature-gated: nomic-v2-moe = ["dep:candle-core", "dep:candle-nn", "hf-hub"]
  • Also added to mkl, accelerate, cuda, cudnn, metal feature sets
  • Top-level struct: NomicV2MoeTextEmbedding with from_hf() + embed()
  • Re-exported from lib.rs behind #[cfg(feature = "nomic-v2-moe")]
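Assuming the crate is consumed under the hosting repository's name (`fastembed` here is an illustrative guess, with the version taken from the release this PR shipped in), enabling the model might look like:

```toml
[dependencies]
# Crate name and version are illustrative; the feature name is from this PR
fastembed = { version = "5.11", features = ["nomic-v2-moe"] }
```

The re-exported `NomicV2MoeTextEmbedding` is then constructed via `from_hf()` and queried with `embed()`, per the integration notes above.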

Files changed

  • Cargo.toml — feature flag additions
  • src/models/mod.rs — module registration
  • src/models/nomic_v2_moe.rs — full model implementation (~830 lines)
  • src/lib.rs — re-export
  • tests/nomic_v2_moe.rs — embedding quality test

Test results

q0-d0: 0.6596, q0-d1: 0.0593    (China/Beijing query matches Beijing doc)
q1-d0: 0.1181, q1-d1: 0.6362    (Gravity query matches gravity doc)
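The scores above are cosine similarities; since the embeddings are L2-normalized, cosine reduces to a plain dot product. A standalone sketch of the scoring logic (the 2-d vectors are toy assumptions, not model output):

```rust
/// Cosine similarity; for unit-norm vectors this equals the dot product.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    let query = [0.6f32, 0.8];
    let doc_match = [0.8f32, 0.6];    // nearby direction -> high score
    let doc_other = [-0.8f32, 0.6];   // orthogonal direction -> low score
    // Mirrors the test assertion: matched > mismatched
    assert!(cosine(&query, &doc_match) > cosine(&query, &doc_other));
}
```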

Closes #227

Test plan

  • cargo check --features nomic-v2-moe — clean build, zero warnings
  • cargo test --features nomic-v2-moe — embedding quality assertions pass
  • L2 normalization verified (norm ≈ 1.0 within 1e-4)
  • Cosine similarity ordering correct (matched > mismatched)
  • CI (triggered by PR)
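The pooling steps the test plan verifies (mean pooling, then L2 normalization, then a unit-norm check) can be sketched without candle as:

```rust
/// Mean-pool per-token vectors into one sentence vector, then L2-normalize.
fn mean_pool_l2(tokens: &[Vec<f32>]) -> Vec<f32> {
    let dim = tokens[0].len();
    let n = tokens.len() as f32;
    let mut pooled = vec![0.0f32; dim];
    for t in tokens {
        for (p, v) in pooled.iter_mut().zip(t) {
            *p += v / n;
        }
    }
    let norm = pooled.iter().map(|x| x * x).sum::<f32>().sqrt();
    pooled.iter().map(|x| x / norm).collect()
}

fn main() {
    let tokens = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let emb = mean_pool_l2(&tokens);
    let norm: f32 = emb.iter().map(|x| x * x).sum::<f32>().sqrt();
    // Mirrors the test plan's norm ≈ 1.0 within 1e-4 assertion
    assert!((norm - 1.0).abs() < 1e-4);
}
```

This is a simplified view: the real implementation masks padding tokens before averaging, which this sketch omits.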

Add support for nomic-ai/nomic-embed-text-v2-moe, the first
general-purpose Mixture-of-Experts embedding model (475M total /
305M active params, 8 experts top-2 routing).

No ONNX export exists for this model due to MoE dynamic routing,
so this implements the full NomicBert+MoE architecture using
candle-nn, following the same standalone pattern as Qwen3.

Architecture highlights:
- Post-norm NomicBert encoder (12 layers)
- Alternating dense/MoE FFN layers (megablocks convention)
- Combined Wqkv projection with RoPE (non-interleaved)
- Bidirectional attention (no causal mask)
- Mean pooling + L2 normalization
- XLMRoberta tokenizer (250k vocab, 100+ languages)

Feature-gated behind `nomic-v2-moe`, same candle deps as `qwen3`.

Closes Anush008#227
@samvallad33 samvallad33 marked this pull request as draft February 19, 2026 08:04
- Replace manual NomicLayerNorm with candle_nn::LayerNorm (-25 lines,
  battle-tested impl with proper F16/BF16 upcast handling)
- Fix NaN panic in MoE router: partial_cmp().unwrap() → total_cmp()
- Fix pad_token "[PAD]" → "<pad>" (XLMRoberta convention)
- Pre-compute attention scale tensor (was allocating per forward call)
- Fix rotary_dim: store full dim instead of halved-then-doubled
- Fix misleading dropout comment (was citing wrong config value)
- Add note about CPU-side MoE dispatch for GPU awareness
- Fix clippy: manual modulo → is_multiple_of()

Net -26 lines. Zero warnings, identical test scores.
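The NaN-panic fix above is worth illustrating: `partial_cmp().unwrap()` panics when a NaN reaches the router's sort, while `f32::total_cmp` defines a total order under which NaN sorts deterministically. A minimal reproduction (not the actual router code):

```rust
/// Sort logits ascending using a NaN-safe total order.
fn sort_logits(logits: &mut [f32]) {
    // partial_cmp().unwrap() would panic here on NaN; total_cmp never does.
    logits.sort_by(f32::total_cmp);
}

fn main() {
    let mut logits = [0.3f32, f32::NAN, 0.7];
    sort_logits(&mut logits);
    // Under total_cmp, positive NaN compares greater than every non-NaN,
    // so it lands at the end instead of aborting the sort.
    assert!(logits[2].is_nan());
}
```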
- Run cargo fmt to pass upstream CI's format check
- Add nomic-embed-text-v2-moe to supported models list in README
- Add usage section with Cargo.toml feature flag and code example
Owner

@Anush008 left a comment


Thanks for taking the time to contribute @samvallad33
LGTM!

@Anush008 Anush008 merged commit ccffb98 into Anush008:main Feb 19, 2026
1 check passed
github-actions bot pushed a commit that referenced this pull request Feb 19, 2026
## [5.11.0](v5.10.0...v5.11.0) (2026-02-19)

### 🍕 Features

* Add nomic-embed-text-v2-moe ([#228](#228)) ([ccffb98](ccffb98)), closes [#227](#227)
@github-actions

🎉 This PR is included in version 5.11.0 🎉

Your semantic-release bot 📦🚀
