[OpenVINO] support ai-sage/GigaChat3-10B-A1.8B-bf16 #1626
Mohamed-Ashraf273 wants to merge 41 commits into huggingface:main
Conversation
Hi @popovaan,

Thanks for the PR! Please add tests for this model. For now, use a locally generated tiny model. I'm currently investigating whether we're allowed to invite GSoC contributors to the

Got it, thanks!

Hi @popovaan, @rkazants,
rkazants left a comment
Please also add export tests: the same test set that you added for the previous model.
Update documentation.
Pull request overview
This PR aims to add OpenVINO export/inference support coverage for the ai-sage/GigaChat3-10B-A1.8B-bf16 family by extending OpenVINO test fixtures and adjusting DeepSeek patching logic used during export.
Changes:
- Add a `gigachat3` tiny-random model fixture and include it in OpenVINO decoder integration coverage.
- Update decoder tests for `gigachat3` (expected SDPA count, relaxed logits tolerance, and skip conditions for incompatible Transformers versions).
- Refactor DeepSeek attention patching to use a versioned factory function and extend MoE patching to handle MLP blocks exposing `experts` but not `moe_infer`.
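For illustration, a "versioned factory" of the kind described in the last bullet could look like the sketch below. Everything here is hypothetical (the function name `make_attention_forward`, the version cutoff, and the toy attention math); it only shows the dispatch pattern, not the actual optimum-intel patcher code:

```python
# Hypothetical sketch of a versioned attention-forward factory: pick a
# forward implementation based on the installed transformers version.
def make_attention_forward(transformers_version: str):
    """Return an attention forward suited to the given transformers version."""
    major, minor = (int(x) for x in transformers_version.split(".")[:2])

    def forward_legacy(query, key, value):
        # Older-transformers path (toy math, stands in for the real forward).
        return [q * k + v for q, k, v in zip(query, key, value)]

    def forward_new(query, key, value):
        # Newer-transformers path; same toy math, distinct callable per version.
        return [q * k + v for q, k, v in zip(query, key, value)]

    # The 4.48 cutoff is an illustrative placeholder, not the real boundary.
    return forward_new if (major, minor) >= (4, 48) else forward_legacy
```

The factory returns a fresh closure per version, so the export patcher can bind the right forward once instead of branching on the version inside every call.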
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/openvino/utils_tests.py | Adds the gigachat3 test model mapping; adjusts which models are treated as remote-code in tests. |
| tests/openvino/test_decoder.py | Adds gigachat3 to tested architectures and config expectations; tweaks tolerance/skip logic; adds debug output. |
| optimum/exporters/openvino/model_patcher.py | Updates DeepSeek patcher to use a unified attention forward factory and broadens MoE patching behavior. |
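The broadened MoE patching condition can be sketched as below. The class names and the `needs_moe_patch` helper are illustrative stand-ins, not the real model_patcher code; the point is only the attribute check:

```python
# Illustrative sketch (not the actual optimum-intel code): the broadened
# rule patches any MLP block that exposes `experts`, whether or not it
# also defines `moe_infer`.
class MoEWithInfer:
    experts = ["expert_0", "expert_1"]

    def moe_infer(self, x):
        return x


class MoEWithoutInfer:
    experts = ["expert_0", "expert_1"]  # no moe_infer method


class DenseMLP:
    pass  # ordinary MLP block, nothing to patch


def needs_moe_patch(module) -> bool:
    # Previously the patcher effectively required moe_infer as well;
    # now exposing `experts` alone is enough to trigger patching.
    return hasattr(module, "experts")
```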
@rkazants
Hi @popovaan, I've finished adding the tests and temporarily published Would it be possible to invite me to the group so I can publish it there, or would you prefer to handle the publishing? Please let me know if any changes are needed.

Hi. Can I help test the model?
Hi!

https://huggingface.co/ai-sage/GigaChat3.1-10B-A1.8B-bf16 I hope they didn't change the architecture. 🥺
Hi @rkazants, I've completed all the requested revisions. Could you please take a look?
@Mohamed-Ashraf273, please confirm that you have tested a real model (not the tiny one) and that the generation results match the reference. I am asking because I am not sure if past_key_values are updated.
@Mohamed-Ashraf273, btw, just yesterday a new GigaChat 3.1 was released: ai-sage/GigaChat3.1-10B-A1.8B-bf16
@rkazants
Got it, gonna check it after solving the issues. Thanks!
@rkazants, @popovaan
and for fast-test (4 samples):
Test script:

```python
import argparse
from importlib.resources import files
from pathlib import Path

import torch
import whowhatbench
import yaml
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoModelForCausalLM, AutoTokenizer

DEFAULT_MODEL_ID = "ai-sage/GigaChat3-10B-A1.8B-bf16"
DEFAULT_MODEL_DIR = "./output_dir"
DEFAULT_MAX_NEW_TOKENS = 128

FAST_PROMPTS = [
    "Who is the most famous programmer?",
    "Who is Leo Tolstoy?",
    "Explain what artificial intelligence is.",
    "What is deep learning?",
]


def load_full_prompts():
    prompt_path = files("whowhatbench.prompts").joinpath("text_prompts.yaml")
    prompt_data = yaml.safe_load(prompt_path.read_text(encoding="utf-8"))
    return prompt_data["en"]["prompts"]


def build_generation_kwargs(tokenizer, max_new_tokens: int):
    eos_token_id = tokenizer.eos_token_id
    pad_token_id = tokenizer.pad_token_id if tokenizer.pad_token_id is not None else eos_token_id
    bos_token_id = tokenizer.bos_token_id
    generation_kwargs = {
        "do_sample": False,
        "num_beams": 1,
        "max_new_tokens": max_new_tokens,
        "use_cache": True,
        "use_model_defaults": False,
    }
    if eos_token_id is not None:
        generation_kwargs["eos_token_id"] = eos_token_id
    if pad_token_id is not None:
        generation_kwargs["pad_token_id"] = pad_token_id
    if bos_token_id is not None:
        generation_kwargs["bos_token_id"] = bos_token_id
    return generation_kwargs


def prepare_inputs(model, tokenizer, prompt: str):
    device = getattr(model, "device", "cpu")
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    inputs.pop("token_type_ids", None)
    return inputs


def generate_answer(
    model,
    tokenizer,
    prompt,
    max_new_tokens,
    crop_question,
    use_chat_template=False,
    empty_adapters=False,
    num_assistant_tokens=0,
    assistant_confidence_threshold=0.0,
):
    del crop_question, use_chat_template, empty_adapters, num_assistant_tokens, assistant_confidence_threshold
    inputs = prepare_inputs(model, tokenizer, prompt)
    tokens = model.generate(**inputs, **build_generation_kwargs(tokenizer, max_new_tokens))
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(tokens[0, prompt_len:], skip_special_tokens=True)


def load_models(model_id: str, model_dir: str):
    model_dir_path = Path(model_dir).expanduser().resolve()
    if not model_dir_path.exists():
        raise FileNotFoundError(
            f"OpenVINO model directory was not found: {model_dir_path}. "
            "Run `python demo.py export` first or pass the correct exported model path."
        )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    base_model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    optimized_model = OVModelForCausalLM.from_pretrained(
        str(model_dir_path),
        use_cache=True,
        load_in_8bit=False,
        quantization_config=None,
    )
    return tokenizer, base_model, optimized_model


def export_model(model_id: str, model_dir: str):
    model_dir_path = Path(model_dir).expanduser().resolve()
    model_dir_path.mkdir(parents=True, exist_ok=True)
    optimized_model = OVModelForCausalLM.from_pretrained(
        model_id,
        export=True,
        use_cache=True,
        load_in_8bit=False,
        quantization_config=None,
    )
    optimized_model.save_pretrained(str(model_dir_path))
    print(f"Saved OpenVINO model to {model_dir_path}")


def run_test(model_id: str, model_dir: str, prompts, max_new_tokens: int, top_k: int):
    tokenizer, base_model, optimized_model = load_models(model_id, model_dir)
    evaluator = whowhatbench.TextEvaluator(
        base_model=base_model,
        tokenizer=tokenizer,
        test_data=prompts,
        max_new_tokens=max_new_tokens,
        use_chat_template=False,
        gen_answer_fn=generate_answer,
    )
    _, metrics = evaluator.score(optimized_model, gen_answer_fn=generate_answer)
    print("similarity:", metrics["similarity"][0])
    print()
    print("Worst examples:")
    for example in evaluator.worst_examples(top_k=top_k, metric="similarity"):
        print("=========================")
        print("Prompt:", example["prompt"])
        print("Baseline:", example["source_model"])
        print("Optimized:", example["optimized_model"])
        print()


def build_parser():
    parser = argparse.ArgumentParser(description="Simple GigaChat3 OpenVINO demo.")
    subparsers = parser.add_subparsers(dest="command", required=True)

    export_parser = subparsers.add_parser("export")
    export_parser.add_argument("--model-id", default=DEFAULT_MODEL_ID)
    export_parser.add_argument("--model-dir", default=DEFAULT_MODEL_DIR)

    test_parser = subparsers.add_parser("test")
    test_parser.add_argument("--model-id", default=DEFAULT_MODEL_ID)
    test_parser.add_argument("--model-dir", default=DEFAULT_MODEL_DIR)
    test_parser.add_argument("--max-new-tokens", type=int, default=DEFAULT_MAX_NEW_TOKENS)
    test_parser.add_argument("--top-k", type=int, default=5)

    fast_test_parser = subparsers.add_parser("fast-test")
    fast_test_parser.add_argument("--model-id", default=DEFAULT_MODEL_ID)
    fast_test_parser.add_argument("--model-dir", default=DEFAULT_MODEL_DIR)
    fast_test_parser.add_argument("--max-new-tokens", type=int, default=DEFAULT_MAX_NEW_TOKENS)
    fast_test_parser.add_argument("--top-k", type=int, default=5)
    return parser


def main():
    args = build_parser().parse_args()
    if args.command == "export":
        export_model(args.model_id, args.model_dir)
        return
    if args.command == "test":
        run_test(
            model_id=args.model_id,
            model_dir=args.model_dir,
            prompts=load_full_prompts(),
            max_new_tokens=args.max_new_tokens,
            top_k=args.top_k,
        )
        return
    run_test(
        model_id=args.model_id,
        model_dir=args.model_dir,
        prompts=FAST_PROMPTS,
        max_new_tokens=args.max_new_tokens,
        top_k=args.top_k,
    )


if __name__ == "__main__":
    main()
```

How to reproduce the tests:

```
python demo.py export
python demo.py test
python demo.py fast-test
```
@rkazants,

@rkazants, @popovaan Format test:

Hello. Could you please tell me at what point I can start converting the model and no longer need to update the converted model? At what point is everything considered stable? Sorry for the inconvenience.

@savvadesogle



What does this PR do?
Conversion command line for ai-sage/GigaChat3-10B-A1.8B-bf16:

```
optimum-cli export openvino -m ai-sage/GigaChat3-10B-A1.8B-bf16 ./output_dir --task text-generation-with-past
```

Inference of ai-sage/GigaChat3-10B-A1.8B-bf16 using the OpenVINO backend:

Solving issue: #1608
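A minimal inference sketch for the exported model, assuming the export command above has already been run and `./output_dir` holds the OpenVINO IR (the prompt and generation settings are illustrative):

```python
# Minimal sketch: greedy generation with the exported OpenVINO model.
# Assumes ./output_dir contains the model produced by the optimum-cli
# export command above; running the real 10B model requires the download.
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai-sage/GigaChat3-10B-A1.8B-bf16")
model = OVModelForCausalLM.from_pretrained("./output_dir", use_cache=True)

inputs = tokenizer("Who is Leo Tolstoy?", return_tensors="pt")
inputs.pop("token_type_ids", None)  # the model forward does not accept this input
tokens = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```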
Before submitting