Add glm4_moe_lite (GLM-4.7-Flash) model support for OpenVINO export #1

Draft
as-suvorov wants to merge 1 commit into as/model_enablement_transformers_v5 from as/add-glm4-moe-lite-support

@as-suvorov
Owner

  • Register Glm4MoeLiteOpenVINOConfig with an MLA-style past-key-values (PKV) generator
  • Add Glm4MoeLitePatcher, which switches the experts to batched_mm for trace-safe export
  • Add tests for decoder, export, CLI, and quantization
  • Update documentation with GLM-4.7-Flash entry
  • Requires transformers >= 5.0.0
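
For reviewers unfamiliar with why a batched expert path helps tracing, here is a rough, framework-free sketch. It is not the code in Glm4MoeLitePatcher, and the real batched_mm path in transformers may differ in detail. The function names (`matvec`, `moe_loop`, `moe_batched`) and the toy data are hypothetical. The idea: a per-expert loop has to discover which tokens hit each expert, so the work per iteration depends on routing data. A batched formulation gathers one expert matrix per (token, slot) pair, so it always performs `num_tokens * top_k` products.

```python
# Conceptual sketch only: pure-Python stand-in for tensor ops.

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def moe_loop(tokens, experts, top_idx, top_w):
    """Per-expert loop: the set of tokens handled by each expert depends on routing."""
    d_out = len(experts[0])
    out = [[0.0] * d_out for _ in tokens]
    for e, w in enumerate(experts):
        # Data-dependent selection: the length of `hits` changes with the input.
        hits = [(t, k) for t, ids in enumerate(top_idx)
                for k, i in enumerate(ids) if i == e]
        for t, k in hits:
            y = matvec(w, tokens[t])
            out[t] = [o + top_w[t][k] * v for o, v in zip(out[t], y)]
    return out

def moe_batched(tokens, experts, top_idx, top_w):
    """Gather one expert per (token, slot): always len(tokens) * top_k products."""
    d_out = len(experts[0])
    out = []
    for x, ids, gates in zip(tokens, top_idx, top_w):
        acc = [0.0] * d_out
        for i, g in zip(ids, gates):
            y = matvec(experts[i], x)  # expert matrix gathered by index
            acc = [a + g * v for a, v in zip(acc, y)]
        out.append(acc)
    return out

# Toy data: 3 tokens, 3 experts, top_k = 2.
tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]
experts = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity
    [[2.0, 0.0], [0.0, 2.0]],  # scale by 2
    [[0.0, 1.0], [1.0, 0.0]],  # swap components
]
top_idx = [[0, 2], [2, 1], [1, 0]]
top_w = [[0.75, 0.25], [0.5, 0.5], [0.25, 0.75]]

result_loop = moe_loop(tokens, experts, top_idx, top_w)
result_batched = moe_batched(tokens, experts, top_idx, top_w)
print(result_batched)
```

Both functions produce the same output on this toy input. Only the second has a structure that is independent of the routing decisions, which is the property a tracer needs to capture a graph valid for all inputs.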

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
