
feat: add nomic-embed-text-v2-moe via Candle backend#228

Merged
Anush008 merged 3 commits into Anush008:main from samvallad33:feat/nomic-v2-moe
Feb 19, 2026

Conversation

@samvallad33
Contributor

Summary

Adds support for nomic-embed-text-v2-moe, the first general-purpose Mixture-of-Experts embedding model (475M total / 305M active parameters, 8 experts with top-2 routing). It outperforms nomic-embed-text-v1.5 on both BEIR and MIRACL while supporting ~100 languages.

No ONNX export exists for this model because MoE dynamic routing cannot be cleanly traced/exported. This implements the full NomicBert+MoE architecture in candle-nn, following the same standalone pattern established by Qwen3.

Architecture

  • Post-norm NomicBert encoder (12 layers)
  • Alternating dense/MoE FFN layers — megablocks stacked-weight convention
  • 8 experts, top-2 routing per MoE layer
  • Combined Wqkv projection with RoPE (non-interleaved, fraction=1.0)
  • Bidirectional attention (encoder, no causal mask)
  • Mean pooling + L2 normalization
  • XLMRoberta tokenizer (250k vocab, 100+ languages)
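To make the routing bullet concrete: each token's router logits select the 2 highest-scoring of the 8 experts, and the gate weights are renormalized over just those two. A minimal, dependency-free sketch of that idea (the real candle implementation operates on batched tensors; this scalar version is illustrative only):

```rust
/// Pick the top-k expert indices for one token and return
/// softmax-renormalized gate weights over just those experts.
fn top_k_route(router_logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut idx: Vec<usize> = (0..router_logits.len()).collect();
    // total_cmp is NaN-safe, unlike partial_cmp().unwrap()
    idx.sort_by(|&a, &b| router_logits[b].total_cmp(&router_logits[a]));
    idx.truncate(k);
    // Softmax over the selected logits only (max-subtracted for stability)
    let max = router_logits[idx[0]];
    let exps: Vec<f32> = idx.iter().map(|&i| (router_logits[i] - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    idx.into_iter().zip(exps).map(|(i, e)| (i, e / sum)).collect()
}

fn main() {
    // 8 experts, top-2 routing as in nomic-embed-text-v2-moe
    let logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.2, 0.0, 0.3];
    let routes = top_k_route(&logits, 2);
    assert_eq!(routes[0].0, 1); // highest logit wins
    assert_eq!(routes[1].0, 4); // second-highest is the other active expert
}
```

The token's FFN output is then the gate-weighted sum of the two selected experts' outputs; the six unselected experts do no work, which is where the 305M-active vs. 475M-total parameter gap comes from.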

Integration

  • Feature-gated: nomic-v2-moe = ["dep:candle-core", "dep:candle-nn", "hf-hub"]
  • Also added to mkl, accelerate, cuda, cudnn, metal feature sets
  • Top-level struct: NomicV2MoeTextEmbedding with from_hf() + embed()
  • Re-exported from lib.rs behind #[cfg(feature = "nomic-v2-moe")]
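Assuming the crate is consumed under the hosting repository's name (`fastembed` here is an illustrative guess, with the version taken from the release this PR shipped in), enabling the model might look like:

```toml
[dependencies]
# Crate name and version are illustrative; the feature name is from this PR
fastembed = { version = "5.11", features = ["nomic-v2-moe"] }
```

The re-exported `NomicV2MoeTextEmbedding` is then constructed via `from_hf()` and queried with `embed()`, per the integration notes above.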

Files changed

  • Cargo.toml — feature flag additions
  • src/models/mod.rs — module registration
  • src/models/nomic_v2_moe.rs — full model implementation (~830 lines)
  • src/lib.rs — re-export
  • tests/nomic_v2_moe.rs — embedding quality test

Test results

q0-d0: 0.6596, q0-d1: 0.0593    (China/Beijing query matches Beijing doc)
q1-d0: 0.1181, q1-d1: 0.6362    (Gravity query matches gravity doc)
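The scores above are cosine similarities; since the embeddings are L2-normalized, cosine reduces to a plain dot product. A standalone sketch of the scoring logic (the 2-d vectors are toy assumptions, not model output):

```rust
/// Cosine similarity; for unit-norm vectors this equals the dot product.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    let query = [0.6f32, 0.8];
    let doc_match = [0.8f32, 0.6];    // nearby direction -> high score
    let doc_other = [-0.8f32, 0.6];   // orthogonal direction -> low score
    // Mirrors the test assertion: matched > mismatched
    assert!(cosine(&query, &doc_match) > cosine(&query, &doc_other));
}
```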

Closes #227

Test plan

  • cargo check --features nomic-v2-moe — clean build, zero warnings
  • cargo test --features nomic-v2-moe — embedding quality assertions pass
  • L2 normalization verified (norm ≈ 1.0 within 1e-4)
  • Cosine similarity ordering correct (matched > mismatched)
  • CI (triggered by PR)
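The pooling steps the test plan verifies (mean pooling, then L2 normalization, then a unit-norm check) can be sketched without candle as:

```rust
/// Mean-pool per-token vectors into one sentence vector, then L2-normalize.
fn mean_pool_l2(tokens: &[Vec<f32>]) -> Vec<f32> {
    let dim = tokens[0].len();
    let n = tokens.len() as f32;
    let mut pooled = vec![0.0f32; dim];
    for t in tokens {
        for (p, v) in pooled.iter_mut().zip(t) {
            *p += v / n;
        }
    }
    let norm = pooled.iter().map(|x| x * x).sum::<f32>().sqrt();
    pooled.iter().map(|x| x / norm).collect()
}

fn main() {
    let tokens = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let emb = mean_pool_l2(&tokens);
    let norm: f32 = emb.iter().map(|x| x * x).sum::<f32>().sqrt();
    // Mirrors the test plan's norm ≈ 1.0 within 1e-4 assertion
    assert!((norm - 1.0).abs() < 1e-4);
}
```

This is a simplified view: the real implementation masks padding tokens before averaging, which this sketch omits.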

Add support for nomic-ai/nomic-embed-text-v2-moe, the first
general-purpose Mixture-of-Experts embedding model (475M total /
305M active params, 8 experts top-2 routing).

No ONNX export exists for this model due to MoE dynamic routing,
so this implements the full NomicBert+MoE architecture using
candle-nn, following the same standalone pattern as Qwen3.

Architecture highlights:
- Post-norm NomicBert encoder (12 layers)
- Alternating dense/MoE FFN layers (megablocks convention)
- Combined Wqkv projection with RoPE (non-interleaved)
- Bidirectional attention (no causal mask)
- Mean pooling + L2 normalization
- XLMRoberta tokenizer (250k vocab, 100+ languages)

Feature-gated behind `nomic-v2-moe`, same candle deps as `qwen3`.

Closes Anush008#227
@samvallad33 samvallad33 marked this pull request as draft February 19, 2026 08:04
- Replace manual NomicLayerNorm with candle_nn::LayerNorm (-25 lines,
  battle-tested impl with proper F16/BF16 upcast handling)
- Fix NaN panic in MoE router: partial_cmp().unwrap() → total_cmp()
- Fix pad_token "[PAD]" → "<pad>" (XLMRoberta convention)
- Pre-compute attention scale tensor (was allocating per forward call)
- Fix rotary_dim: store full dim instead of halved-then-doubled
- Fix misleading dropout comment (was citing wrong config value)
- Add note about CPU-side MoE dispatch for GPU awareness
- Fix clippy: manual modulo → is_multiple_of()

Net -26 lines. Zero warnings, identical test scores.
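The NaN-panic fix above is worth illustrating: `partial_cmp().unwrap()` panics when a NaN reaches the router's sort, while `f32::total_cmp` defines a total order under which NaN sorts deterministically. A minimal reproduction (not the actual router code):

```rust
/// Sort logits ascending using a NaN-safe total order.
fn sort_logits(logits: &mut [f32]) {
    // partial_cmp().unwrap() would panic here on NaN; total_cmp never does.
    logits.sort_by(f32::total_cmp);
}

fn main() {
    let mut logits = [0.3f32, f32::NAN, 0.7];
    sort_logits(&mut logits);
    // Under total_cmp, positive NaN compares greater than every non-NaN,
    // so it lands at the end instead of aborting the sort.
    assert!(logits[2].is_nan());
}
```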
- Run cargo fmt to pass upstream CI's format check
- Add nomic-embed-text-v2-moe to supported models list in README
- Add usage section with Cargo.toml feature flag and code example
Owner

@Anush008 left a comment


Thanks for taking the time to contribute @samvallad33
LGTM!

@Anush008 Anush008 merged commit ccffb98 into Anush008:main Feb 19, 2026
1 check passed
github-actions bot pushed a commit that referenced this pull request Feb 19, 2026
## [5.11.0](v5.10.0...v5.11.0) (2026-02-19)

### 🍕 Features

* Add nomic-embed-text-v2-moe ([#228](#228)) ([ccffb98](ccffb98)), closes [#227](#227)
@github-actions

🎉 This PR is included in version 5.11.0 🎉

Your semantic-release bot 📦🚀
