
[OpenVINO] support ai-sage/GigaChat3-10B-A1.8B-bf16 #1626

Open
Mohamed-Ashraf273 wants to merge 41 commits into huggingface:main from Mohamed-Ashraf273:support_gigachat3

Conversation

@Mohamed-Ashraf273
Contributor

@Mohamed-Ashraf273 Mohamed-Ashraf273 commented Feb 28, 2026

What does this PR do?

Conversion cmd-line for ai-sage/GigaChat3-10B-A1.8B-bf16:

optimum-cli export openvino -m ai-sage/GigaChat3-10B-A1.8B-bf16 ./output_dir --task text-generation-with-past

Inference of ai-sage/GigaChat3-10B-A1.8B-bf16 using OpenVINO backend:

from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model_dir = "output_dir"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = OVModelForCausalLM.from_pretrained(model_dir)

# Prepare input
prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")

# Run inference
output_ids = model.generate(**inputs, max_new_tokens=10)
output_text = tokenizer.decode(output_ids[0])

print(output_text)

Solving Issue: #1608

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@Mohamed-Ashraf273 Mohamed-Ashraf273 changed the title modify patcher [OpenVINO] support gigachat3 Feb 28, 2026
@Mohamed-Ashraf273 Mohamed-Ashraf273 marked this pull request as ready for review March 3, 2026 13:19
@Mohamed-Ashraf273
Contributor Author

Hi @popovaan ,
Can you take a look?
Thanks!

@Mohamed-Ashraf273 Mohamed-Ashraf273 changed the title [OpenVINO] support gigachat3 [OpenVINO] support ai-sage/GigaChat3-10B-A1.8B-bf16 Mar 3, 2026
@popovaan
Collaborator

popovaan commented Mar 3, 2026

Thanks for the PR! Please add tests for this model. For now, use a locally generated tiny model. I'm currently investigating whether we're allowed to invite GSoC contributors to the optimum-intel-internal-testing group so that you can publish the model there. If not, I’ll publish it myself and share the link.

@Mohamed-Ashraf273
Contributor Author

Thanks for the PR! Please add tests for this model. For now, use a locally generated tiny model. I'm currently investigating whether we're allowed to invite GSoC contributors to the optimum-intel-internal-testing group so that you can publish the model there. If not, I’ll publish it myself and share the link.

Got it, thanks!
I’ll add the tests with a locally generated tiny model
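As background, such tiny-random checkpoints are typically produced by shrinking every size-related field in the model config and saving a randomly initialized model. A minimal sketch of the pattern, using `LlamaConfig` as a stand-in (the actual GigaChat3 config class and the exact dimensions used in this PR are not shown here):

```python
import tempfile

from transformers import LlamaConfig, LlamaForCausalLM

# Shrink every size-related field so export and tests run in seconds.
# All dimensions here are illustrative, not the ones used in the PR.
config = LlamaConfig(
    vocab_size=1000,
    hidden_size=32,
    intermediate_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
)
model = LlamaForCausalLM(config)  # randomly initialized weights

with tempfile.TemporaryDirectory() as tmp:
    # This saved directory is what gets uploaded, e.g. as tiny-random-<arch>.
    model.save_pretrained(tmp)
```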

@Mohamed-Ashraf273
Contributor Author

Thanks for the PR! Please add tests for this model. For now, use a locally generated tiny model. I'm currently investigating whether we're allowed to invite GSoC contributors to the optimum-intel-internal-testing group so that you can publish the model there. If not, I’ll publish it myself and share the link.

Hi @popovaan, @rkazants,
I've added a tiny model along with the tests. Could you please take a look?
Thanks!

Collaborator

@rkazants rkazants left a comment


Please also add export tests — the same test set that you added for the previous model.
Update the documentation.

Contributor

Copilot AI left a comment


Pull request overview

This PR aims to add OpenVINO export/inference support coverage for the ai-sage/GigaChat3-10B-A1.8B-bf16 family by extending OpenVINO test fixtures and adjusting DeepSeek patching logic used during export.

Changes:

  • Add a gigachat3 tiny-random model fixture and include it in OpenVINO decoder integration coverage.
  • Update decoder tests for gigachat3 (expected SDPA count, relaxed logits tolerance, and skip conditions for incompatible Transformers versions).
  • Refactor DeepSeek attention patching to use a versioned factory function and extend MoE patching to handle MLP blocks exposing experts but not moe_infer.
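The broadened MoE condition in the last bullet can be sketched as follows (all class and function names are hypothetical; the real check lives in `optimum/exporters/openvino/model_patcher.py`):

```python
# Illustrative sketch: previously only DeepSeek-style MoE blocks exposing a
# `moe_infer` method were patched; GigaChat3-style blocks expose an `experts`
# list but no `moe_infer`, so the condition must be broadened to cover both.

class DeepSeekStyleMoE:
    experts = ["expert0", "expert1"]

    def moe_infer(self, hidden_states):
        return hidden_states  # placeholder body

class GigaChat3StyleMoE:
    experts = ["expert0", "expert1"]  # no moe_infer method

class DenseMLP:
    pass  # ordinary MLP, no experts

def select_patch_path(mlp) -> str:
    """Decide how (or whether) to patch an MLP block for export."""
    if not hasattr(mlp, "experts"):
        return "dense"  # nothing to patch
    if hasattr(mlp, "moe_infer"):
        return "patch-moe-infer"
    return "patch-experts-loop"  # the new case handled by this PR
```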

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
tests/openvino/utils_tests.py Adds the gigachat3 test model mapping; adjusts which models are treated as remote-code in tests.
tests/openvino/test_decoder.py Adds gigachat3 to tested architectures and config expectations; tweaks tolerance/skip logic; adds debug output.
optimum/exporters/openvino/model_patcher.py Updates DeepSeek patcher to use a unified attention forward factory and broadens MoE patching behavior.


@Mohamed-Ashraf273
Contributor Author

Mohamed-Ashraf273 commented Mar 4, 2026

Please also add export tests — the same test set that you added for the previous model. Update the documentation.

@rkazants
Thanks for your feedback!
I've added export tests and updated documentation for the newly added model!

@Mohamed-Ashraf273
Contributor Author

Mohamed-Ashraf273 commented Mar 4, 2026

Thanks for the PR! Please add tests for this model. For now, use a locally generated tiny model. I'm currently investigating whether we're allowed to invite GSoC contributors to the optimum-intel-internal-testing group so that you can publish the model there. If not, I’ll publish it myself and share the link.

Hi @popovaan,

I’ve finished adding the tests and temporarily published tiny-random-gigachat3 on my Hugging Face profile (mohamedashraf273/tiny-random-gigachat3) until it can be moved to optimum-intel-internal-testing.

Would it be possible to invite me to the group so I can publish it there, or would you prefer to handle the publishing?

Please let me know if any changes are needed.
Thanks!

@savvadesogle

Hi. Can I help test the model?
You are so great, thank you so much!♥️🔥😊

@Mohamed-Ashraf273
Contributor Author

Hi. Can I help test the model?
You are so great, thank you so much!♥️🔥😊

Hi!
That would be great, thank you so much!
Please feel free to test it and let me know if you encounter any issues or unexpected behavior.

@savvadesogle

savvadesogle commented Mar 24, 2026

https://huggingface.co/ai-sage/GigaChat3.1-10B-A1.8B-bf16

I hope they didn't change the architecture.🥺

update
💪
image

@Mohamed-Ashraf273
Contributor Author

Hi @rkazants

I’ve completed all the requested revisions. Could you please take a look?
Let me know if there’s anything else I should update or improve.
Thanks!

@rkazants
Collaborator

@Mohamed-Ashraf273, please confirm that you have tested a real model (not a tiny one) and that the generation results match the reference. I am asking because I am not sure whether past_key_values are updated.
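To make this concern concrete, here is a toy, purely illustrative decoder (not optimum-intel code): the "next token" depends on how many tokens are cached, so if the cache is never updated, greedy generation degenerates after the first step.

```python
# Toy decoder: next token = (current token + number of cached tokens) % 10.
# If past tokens are never appended to the cache, every step repeats itself,
# mimicking what stale past_key_values would do to greedy generation.

def step(token: int, cache: list) -> tuple:
    next_token = (token + len(cache)) % 10
    return next_token, cache + [token]

def generate(update_cache: bool, steps: int = 5) -> list:
    cache, token, out = [], 3, []
    for _ in range(steps):
        next_token, new_cache = step(token, cache)
        if update_cache:
            cache = new_cache  # the KV cache grows each step
        out.append(next_token)
        token = next_token
    return out

print(generate(update_cache=True))   # -> [3, 4, 6, 9, 3]
print(generate(update_cache=False))  # -> [3, 3, 3, 3, 3]
```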


@rkazants
Collaborator

@Mohamed-Ashraf273, by the way, the new GigaChat 3.1 was released just yesterday: ai-sage/GigaChat3.1-10B-A1.8B-bf16
Can you please check it on your PR? If something more is needed, I will create a separate good-first-issue to cover it in a separate PR.

@Mohamed-Ashraf273
Contributor Author

@Mohamed-Ashraf273, please confirm that you have tested a real model (not a tiny one) and that the generation results match the reference. I am asking because I am not sure whether past_key_values are updated.

@rkazants
Yeah, you're right! I'm currently working on a few issues. Thanks for the feedback!

@Mohamed-Ashraf273
Contributor Author

@Mohamed-Ashraf273, by the way, the new GigaChat 3.1 was released just yesterday: ai-sage/GigaChat3.1-10B-A1.8B-bf16. Can you please check it on your PR? If something more is needed, I will create a separate good-first-issue to cover it in a separate PR.

Got it, gonna check it after solving the issues. Thanks!

@Mohamed-Ashraf273
Contributor Author

Mohamed-Ashraf273 commented Mar 26, 2026

@rkazants , @popovaan
For model ai-sage/GigaChat3-10B-A1.8B-bf16:
I've fixed some issues in the model that were causing low similarity (<= 0.1) after reverting the refactoring.
Similarity for the full test (27 samples) is now 0.9218938.
However, using the chat template gives a higher similarity of 0.97.

image

and for fast-test (4 samples): 0.9480462.

image

Test script:

import argparse
from importlib.resources import files
from pathlib import Path

import torch
import whowhatbench
import yaml
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoModelForCausalLM, AutoTokenizer


DEFAULT_MODEL_ID = "ai-sage/GigaChat3-10B-A1.8B-bf16"
DEFAULT_MODEL_DIR = "./output_dir"
DEFAULT_MAX_NEW_TOKENS = 128
FAST_PROMPTS = [
    "Who is the most famous programmer?",
    "Who is Leo Tolstoy?",
    "Explain what artificial intelligence is.",
    "What is deep learning?",
]


def load_full_prompts():
    prompt_path = files("whowhatbench.prompts").joinpath("text_prompts.yaml")
    prompt_data = yaml.safe_load(prompt_path.read_text(encoding="utf-8"))
    return prompt_data["en"]["prompts"]


def build_generation_kwargs(tokenizer, max_new_tokens: int):
    eos_token_id = tokenizer.eos_token_id
    pad_token_id = tokenizer.pad_token_id if tokenizer.pad_token_id is not None else eos_token_id
    bos_token_id = tokenizer.bos_token_id

    generation_kwargs = {
        "do_sample": False,
        "num_beams": 1,
        "max_new_tokens": max_new_tokens,
        "use_cache": True,
        "use_model_defaults": False,
    }

    if eos_token_id is not None:
        generation_kwargs["eos_token_id"] = eos_token_id
    if pad_token_id is not None:
        generation_kwargs["pad_token_id"] = pad_token_id
    if bos_token_id is not None:
        generation_kwargs["bos_token_id"] = bos_token_id

    return generation_kwargs


def prepare_inputs(model, tokenizer, prompt: str):
    device = getattr(model, "device", "cpu")
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    inputs.pop("token_type_ids", None)
    return inputs


def generate_answer(
    model,
    tokenizer,
    prompt,
    max_new_tokens,
    crop_question,
    use_chat_template=False,
    empty_adapters=False,
    num_assistant_tokens=0,
    assistant_confidence_threshold=0.0,
):
    del crop_question, use_chat_template, empty_adapters, num_assistant_tokens, assistant_confidence_threshold
    inputs = prepare_inputs(model, tokenizer, prompt)
    tokens = model.generate(**inputs, **build_generation_kwargs(tokenizer, max_new_tokens))
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(tokens[0, prompt_len:], skip_special_tokens=True)


def load_models(model_id: str, model_dir: str):
    model_dir_path = Path(model_dir).expanduser().resolve()
    if not model_dir_path.exists():
        raise FileNotFoundError(
            f"OpenVINO model directory was not found: {model_dir_path}. "
            "Run `python demo.py export` first or pass the correct exported model path."
        )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    base_model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    optimized_model = OVModelForCausalLM.from_pretrained(
        str(model_dir_path),
        use_cache=True,
        load_in_8bit=False,
        quantization_config=None,
    )
    return tokenizer, base_model, optimized_model


def export_model(model_id: str, model_dir: str):
    model_dir_path = Path(model_dir).expanduser().resolve()
    model_dir_path.mkdir(parents=True, exist_ok=True)

    optimized_model = OVModelForCausalLM.from_pretrained(
        model_id,
        export=True,
        use_cache=True,
        load_in_8bit=False,
        quantization_config=None,
    )

    optimized_model.save_pretrained(str(model_dir_path))
    print(f"Saved OpenVINO model to {model_dir_path}")


def run_test(model_id: str, model_dir: str, prompts, max_new_tokens: int, top_k: int):
    tokenizer, base_model, optimized_model = load_models(model_id, model_dir)

    evaluator = whowhatbench.TextEvaluator(
        base_model=base_model,
        tokenizer=tokenizer,
        test_data=prompts,
        max_new_tokens=max_new_tokens,
        use_chat_template=False,
        gen_answer_fn=generate_answer,
    )
    _, metrics = evaluator.score(optimized_model, gen_answer_fn=generate_answer)

    print("similarity:", metrics["similarity"][0])
    print()
    print("Worst examples:")
    for example in evaluator.worst_examples(top_k=top_k, metric="similarity"):
        print("=========================")
        print("Prompt:", example["prompt"])
        print("Baseline:", example["source_model"])
        print("Optimized:", example["optimized_model"])
        print()


def build_parser():
    parser = argparse.ArgumentParser(description="Simple GigaChat3 OpenVINO demo.")
    subparsers = parser.add_subparsers(dest="command", required=True)

    export_parser = subparsers.add_parser("export")
    export_parser.add_argument("--model-id", default=DEFAULT_MODEL_ID)
    export_parser.add_argument("--model-dir", default=DEFAULT_MODEL_DIR)

    test_parser = subparsers.add_parser("test")
    test_parser.add_argument("--model-id", default=DEFAULT_MODEL_ID)
    test_parser.add_argument("--model-dir", default=DEFAULT_MODEL_DIR)
    test_parser.add_argument("--max-new-tokens", type=int, default=DEFAULT_MAX_NEW_TOKENS)
    test_parser.add_argument("--top-k", type=int, default=5)

    fast_test_parser = subparsers.add_parser("fast-test")
    fast_test_parser.add_argument("--model-id", default=DEFAULT_MODEL_ID)
    fast_test_parser.add_argument("--model-dir", default=DEFAULT_MODEL_DIR)
    fast_test_parser.add_argument("--max-new-tokens", type=int, default=DEFAULT_MAX_NEW_TOKENS)
    fast_test_parser.add_argument("--top-k", type=int, default=5)

    return parser


def main():
    args = build_parser().parse_args()

    if args.command == "export":
        export_model(args.model_id, args.model_dir)
        return

    if args.command == "test":
        run_test(
            model_id=args.model_id,
            model_dir=args.model_dir,
            prompts=load_full_prompts(),
            max_new_tokens=args.max_new_tokens,
            top_k=args.top_k,
        )
        return

    run_test(
        model_id=args.model_id,
        model_dir=args.model_dir,
        prompts=FAST_PROMPTS,
        max_new_tokens=args.max_new_tokens,
        top_k=args.top_k,
    )


if __name__ == "__main__":
    main()

How to reproduce the tests:

python demo.py export
python demo.py test
python demo.py fast-test

@Mohamed-Ashraf273
Contributor Author

Mohamed-Ashraf273 commented Mar 26, 2026

@Mohamed-Ashraf273, by the way, the new GigaChat 3.1 was released just yesterday: ai-sage/GigaChat3.1-10B-A1.8B-bf16. Can you please check it on your PR? If something more is needed, I will create a separate good-first-issue to cover it in a separate PR.

@rkazants,
I checked ai-sage/GigaChat3.1-10B-A1.8B-bf16, and it exported successfully.
However, I ran out of resources, so I’m unable to run a full similarity test at the moment.

@Mohamed-Ashraf273
Contributor Author

@rkazants , @popovaan
Also, here are the tests passing with transformers==4.57.6:

(env) mohamed-ashraf@mohamed-ashraf-LOQ-15IRX9:~/Desktop/projects/GSoC26/optimum-intel$ python -m pytest tests/openvino/*.py -xvs -k "deepseek or gigachat" 2>&1
============================================================================= test session starts =============================================================================
platform linux -- Python 3.12.3, pytest-7.4.4, pluggy-1.6.0 -- /home/mohamed-ashraf/Desktop/projects/env/bin/python
cachedir: .pytest_cache
rootdir: /home/mohamed-ashraf/Desktop/projects/GSoC26/optimum-intel
configfile: pyproject.toml
plugins: anyio-4.12.1, mock-3.15.1, langsmith-0.7.10
collecting ... Multiple distributions found for package optimum. Picked distribution: optimum-onnx
collecting 693 items                                                                                                                                                          The provided dataset won't have any effect on the resulting compressed model because no data-aware quantization algorithm is selected and compression ratio is 1.0.
The provided dataset won't have any effect on the resulting compressed model because no data-aware quantization algorithm is selected and compression ratio is 1.0.
collected 1279 items / 1268 deselected / 11 selected                                                                                                                          

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_beam_search_60_deepseek SKIPPED (test is slow)
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_beam_search_61_gigachat3 SKIPPED (test is slow)
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek `torch_dtype` is deprecated! Use `dtype` instead!
'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 26952585-b835-4367-94b0-a369117016bd)')' thrown while requesting HEAD https://huggingface.co/optimum-intel-internal-testing/tiny-random-deepseek-v3/resolve/main/video_preprocessor_config.json
Retrying in 1s [Retry 1/5].
'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 8b3c14ad-57bb-4dfb-bfe5-980ed6358d15)')' thrown while requesting HEAD https://huggingface.co/optimum-intel-internal-testing/tiny-random-deepseek-v3/resolve/main/video_preprocessor_config.json
Retrying in 2s [Retry 2/5].
This model works best with OpenVINO 2026.1 or later. Earlier versions require float() conversion for MoE weights, which may affect performance. OpenVINO 2026.1 includes a fix for torch.bmm dtype handling.
`loss_type=None` was set in the config but it is unrecognized. Using the default loss: `ForCausalLMLoss`.
`generation_config` default values have been modified to match model-specific defaults: {'temperature': 0.3, 'top_p': 0.95, 'bos_token_id': 100000}. If this is not desired, please set these values explicitly.
The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
PASSED
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3 '(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 6ffef073-4ec6-45d6-aefb-0bf51b51756f)')' thrown while requesting HEAD https://huggingface.co/optimum-intel-internal-testing/tiny-random-gigachat3/resolve/main/config.json
Retrying in 1s [Retry 1/5].
This model works best with OpenVINO 2026.1 or later. Earlier versions require float() conversion for MoE weights, which may affect performance. OpenVINO 2026.1 includes a fix for torch.bmm dtype handling.
`generation_config` default values have been modified to match model-specific defaults: {'bos_token_id': 1}. If this is not desired, please set these values explicitly.
PASSED
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_pipeline_60_deepseek SKIPPED (test is slow)
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_pipeline_61_gigachat3 SKIPPED (test is slow)
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek This model works best with OpenVINO 2026.1 or later. Earlier versions require float() conversion for MoE weights, which may affect performance. OpenVINO 2026.1 includes a fix for torch.bmm dtype handling.
This model works best with OpenVINO 2026.1 or later. Earlier versions require float() conversion for MoE weights, which may affect performance. OpenVINO 2026.1 includes a fix for torch.bmm dtype handling.
PASSED
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3 This model works best with OpenVINO 2026.1 or later. Earlier versions require float() conversion for MoE weights, which may affect performance. OpenVINO 2026.1 includes a fix for torch.bmm dtype handling.
This model works best with OpenVINO 2026.1 or later. Earlier versions require float() conversion for MoE weights, which may affect performance. OpenVINO 2026.1 includes a fix for torch.bmm dtype handling.
PASSED
tests/openvino/test_quantization.py::OVQuantizationConfigTest::test_named_default_configurations_49_deepseek_ai_DeepSeek_R1_Distill_Qwen_1_5B PASSED
tests/openvino/test_quantization.py::OVQuantizationConfigTest::test_named_default_configurations_50_deepseek_ai_DeepSeek_R1_Distill_Qwen_7B PASSED
tests/openvino/test_quantization.py::OVQuantizationConfigTest::test_named_default_configurations_51_deepseek_ai_DeepSeek_R1_Distill_Llama_8B PASSED

================================================================================= warnings summary ==================================================================================
../../../../../../usr/lib/python3.12/multiprocessing/popen_fork.py:66: 1 warning
tests/openvino/test_decoder.py: 118 warnings
tests/openvino/test_export.py: 234 warnings
  /usr/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=471547) is multi-threaded, use of fork() may lead to deadlocks in the child.
    self.pid = os.fork()

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

../../env/lib/python3.12/site-packages/torch/jit/_script.py:1480
  /home/mohamed-ashraf/Desktop/projects/env/lib/python3.12/site-packages/torch/jit/_script.py:1480: DeprecationWarning: `torch.jit.script` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
  /home/mohamed-ashraf/Desktop/projects/env/lib/python3.12/site-packages/torch/jit/_trace.py:1000: DeprecationWarning: `torch.jit.trace` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
  /home/mohamed-ashraf/Desktop/projects/env/lib/python3.12/site-packages/torch/jit/_trace.py:1139: DeprecationWarning: `torch.jit.trace_method` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
  /home/mohamed-ashraf/Desktop/projects/env/lib/python3.12/site-packages/transformers/cache_utils.py:132: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if not self.is_initialized or self.keys.numel() == 0:

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
  /home/mohamed-ashraf/Desktop/projects/env/lib/python3.12/site-packages/transformers/masking_utils.py:207: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if (padding_length := kv_length + kv_offset - attention_mask.shape[-1]) > 0:

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
  /home/mohamed-ashraf/Desktop/projects/GSoC26/optimum-intel/optimum/exporters/openvino/model_patcher.py:240: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
    torch.tensor(0.0, device=mask.device, dtype=dtype),

tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_60_deepseek
tests/openvino/test_decoder.py::OVModelForCausalLMIntegrationTest::test_compare_to_transformers_61_gigachat3
tests/openvino/test_export.py::ExportModelTest::test_export_26_deepseek
tests/openvino/test_export.py::ExportModelTest::test_export_27_gigachat3
  /home/mohamed-ashraf/Desktop/projects/GSoC26/optimum-intel/optimum/exporters/openvino/model_patcher.py:241: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
    torch.tensor(torch.finfo(torch.float16).min, device=mask.device, dtype=dtype),

tests/openvino/test_decoder.py: 58 warnings
  /home/mohamed-ashraf/Desktop/projects/GSoC26/optimum-intel/optimum/intel/openvino/modeling_decoder.py:820: DeprecationWarning: __array__ implementation doesn't accept a copy keyword, so passing copy=False failed. __array__ must implement 'dtype' and 'copy' keyword arguments. To learn more, see the migration guide https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword
    np.array(beam_idx) if not self._second_iter_beam_search else self.next_beam_idx

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================== 7 passed, 4 skipped, 1268 deselected, 442 warnings in 139.54s (0:02:19) ======================================================
(env) mohamed-ashraf@mohamed-ashraf-LOQ-15IRX9:~/Desktop/projects/GSoC26/optimum-intel$ 

Format test:

(env) mohamed-ashraf@mohamed-ashraf-LOQ-15IRX9:~/Desktop/projects/GSoC26/optimum-intel$ ruff check optimum/ --config pyproject.toml
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'isort' -> 'lint.isort'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
All checks passed!
(env) mohamed-ashraf@mohamed-ashraf-LOQ-15IRX9:~/Desktop/projects/GSoC26/optimum-intel$ 

@savvadesogle

Hello. Could you please tell me at what point I can start converting a model without needing to re-convert it later? At what point is everything considered stable?
I mean, after what point is everything definitely working?
Or do I need to wait until it's merged into the main branch?

Sorry for the inconvenience.

@Mohamed-Ashraf273
Contributor Author

@savvadesogle
It should be stable to use now.
I’ve validated it with tests and shared the results above, and everything is working as expected.
I don’t anticipate further changes unless something is flagged during review.

@Mohamed-Ashraf273
Contributor Author

Hi @rkazants , @popovaan

I’ve completed all the requested revisions. Could you please take a look?
Let me know if there’s anything else I should update or improve.
Thanks!

