
[OpenVINO] Add agentic skill for adding new model support #1616

Draft
rkazants wants to merge 13 commits into huggingface:main from rkazants:agentic_model_adding

Conversation

@rkazants
Collaborator

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds an agentic skill documentation file that provides comprehensive guidance for adding support for new model architectures from HuggingFace transformers and diffusers libraries to the optimum-intel project. The skill enables model export to OpenVINO IR format and inference through the optimum-intel API.

Changes:

  • Added a detailed skill documentation file (skills/SKILL.md) containing workflows, code examples, and reference materials for implementing new model support in optimum-intel
  • Included practical examples for model architecture analysis and patching patterns (particularly Mixture of Experts)
  • Provided references to test files, documentation locations, and external resources


rkazants and others added 6 commits February 19, 2026 16:25
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@SearchSavior

@rkazants @popovaan @IlyasMoutawwakil @echarlaix @ljaljushkin
Hey guys!

This skill is a fantastic addition to optimum. OpenVINO repositories have good user documentation, but for development and contribution it remains difficult for new contributors to find somewhere to start. Though I haven't contributed yet, I am an advanced OpenVINO user and still struggle to see how exactly optimum fits together. Even with source-code study, tools like DeepWiki, and reading PRs from your team's OpenVINO repositories, just getting started demands a high level of skill from developers.

That said, this skill takes an approach to documentation that could be a healthy direction. The maintainer's perspective on the procedure of adding a model is valuable, and largely missing from this repository. See this discussion case:

> The original code contains a conditional branch inside a Python for-loop. For certain example inputs, this branch may be skipped during tracing, resulting in an incorrect or incomplete final graph. Additionally, the non-vectorized implementation produces a very large OpenVINO graph with excessive nodes, which is expensive for graph transformations and significantly increases model conversion time. So here is the patch that provides a vectorized form of MoE....
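The loop-vs-vectorized distinction described above can be sketched in NumPy. This is a hypothetical illustration of the pattern, not the actual optimum-intel patch; the function names and shapes are made up:

```python
import numpy as np

def moe_loop(hidden, top_expert, expert_weights):
    """Per-expert loop with a data-dependent branch, as in some MoE
    reference implementations.
    hidden: (tokens, dim); top_expert: (tokens,); expert_weights: (experts, dim, dim)
    """
    out = np.zeros_like(hidden)
    for e in range(expert_weights.shape[0]):
        mask = top_expert == e
        if not mask.any():   # this branch may be skipped during tracing,
            continue         # baking an incomplete graph for other inputs
        out[mask] = hidden[mask] @ expert_weights[e]
    return out

def moe_vectorized(hidden, top_expert, expert_weights):
    """Vectorized form: every expert contributes through a mask, so the
    traced graph is input-independent and much smaller."""
    n_experts = expert_weights.shape[0]
    onehot = (top_expert[:, None] == np.arange(n_experts)[None, :]).astype(hidden.dtype)
    # (tokens, dim) x (experts, dim, dim) -> per-expert outputs (tokens, experts, dim)
    all_out = np.einsum("td,edh->teh", hidden, expert_weights)
    # select each token's routed expert via the one-hot routing matrix
    return np.einsum("te,teh->th", onehot, all_out)
```

During tracing, the masked form records the same operations regardless of which experts the example input happens to route to, which is what keeps the exported graph complete and compact.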

Look, I run OpenArc, and we have a Discord server with a ton of users where we often discuss the Intel ecosystem. One tension we keep identifying is the excellent performance of OpenVINO versus its lagging support for SOTA open-source models, compounded by the difficulty of adding that support.

I think more documentation like this skill could encourage a healthy contribution environment. Plus, I learned a ton about optimum from reading this skill, applied some of its lessons to my adventures implementing qwen-asr/qwen-tts from scratch, and gained some insight into why my attempt to patch glm-4.7 flash failed.

Overall, OpenVINO documentation lacks hands-on discussion, and I encourage your team to devote more development time to this sort of addition. Learning OpenVINO needs to be easier!

Thanks for your work, I am always learning so much from everyone who contributes!!

@as-suvorov

@rkazants Used this skill to support https://huggingface.co/zai-org/GLM-4.7-Flash
PR: as-suvorov#1

Perf with GenAI:

| Metric | Value |
| --- | --- |
| 1st token latency | 2339.79 ms |
| 2nd token latency | 425.86 ms/token |
| Throughput | 2.35 tokens/s |

The model is too big to run the transformers ground truth with WWB; it gets OOM-killed (128 GB RAM).
WWB similarity for the int4 model, optimum-intel vs GenAI: 0.9742136

Proposal for a skill improvement: instruct the model to clone the appropriate transformers version into the workspace. The agent currently tries to read the transformers source with custom bash commands or Python scripts; I believe it would be more efficient to clone the sources and use tool calls.
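The proposed step might look something like this (the version lookup, tag naming scheme, and workspace path are illustrative assumptions, not part of the skill today):

```shell
# Pin the checkout to the transformers version installed in the environment,
# so the agent reads exactly the sources the export path will execute.
TRANSFORMERS_VERSION=$(python -c "import transformers; print(transformers.__version__)")
git clone --depth 1 --branch "v${TRANSFORMERS_VERSION}" \
    https://github.com/huggingface/transformers.git workspace/transformers
```

A shallow, tag-pinned clone keeps the download small while still letting the agent navigate the modeling files with ordinary file-reading tool calls.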
