refactor mixtral moe block. #1635

Merged

regisss merged 2 commits into huggingface:v1.15-release from lkk12014402:update_mixtral_moe_block on Dec 20, 2024

Conversation

@lkk12014402 (Contributor)
What does this PR do?

Fixes the MoE block forward regression for training introduced by #1511.

@lkk12014402 lkk12014402 requested a review from regisss as a code owner December 19, 2024 07:01
@lkk12014402 (Contributor Author)

This PR fixes the segmentation fault caused by the DynamicMoE path introduced in #1511 when training the Mixtral model.
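For readers unfamiliar with what the block computes: the refactor only changes how tokens are dispatched to experts, not the numerical contract. Below is a framework-agnostic sketch (NumPy, illustrative names only — not the actual optimum-habana implementation) of Mixtral-style top-2 routing written as a static loop over experts, the shape-stable pattern that avoids dynamic expert dispatch:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_block_forward(hidden, gate_w, experts, top_k=2):
    """Dense loop-over-experts MoE forward (no dynamic dispatch).

    hidden:  (tokens, dim) input activations
    gate_w:  (dim, num_experts) router weights
    experts: list of callables, each (tokens, dim) -> (tokens, dim)
    """
    logits = hidden @ gate_w                       # (tokens, num_experts)
    probs = softmax(logits)
    # top-k expert indices per token
    topk_idx = np.argsort(probs, axis=-1)[:, -top_k:]
    # renormalize the selected routing weights so they sum to 1 per token
    topk_w = np.take_along_axis(probs, topk_idx, axis=-1)
    topk_w = topk_w / topk_w.sum(axis=-1, keepdims=True)

    out = np.zeros_like(hidden)
    # static loop: every expert runs with a fixed-shape weighting,
    # so tensor shapes never depend on the routing decision
    for e_id, expert in enumerate(experts):
        # per-token weight for this expert (0 if not in the token's top-k)
        w = (topk_w * (topk_idx == e_id)).sum(axis=-1, keepdims=True)
        out += w * expert(hidden)
    return out

# sanity check: with identical (identity) experts, the renormalized
# routing weights sum to 1 per token, so the block reduces to identity
rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))
router = rng.standard_normal((8, 3))
out = moe_block_forward(tokens, router, [lambda x: x] * 3)
```

The trade-off shown here is the one relevant to the PR: a static loop does redundant compute but keeps all shapes fixed, which is friendlier to graph-compiled backends than data-dependent dispatch.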

@lkk12014402 (Contributor Author)

Training:

DEEPSPEED_HPU_ZERO3_SYNC_MARK_STEP_REQUIRED=1 python ../gaudi_spawn.py --world_size 4 --use_deepspeed sft.py \
    --model_name_or_path mistralai/Mixtral-8x7B-Instruct-v0.1 \
    --dataset_name "philschmid/dolly-15k-oai-style" \
    --subset 'data/' \
    --streaming False \
    --deepspeed ../language-modeling/llama2_ds_zero3_config.json \
    --output_dir="./model_mixtral" \
    --do_train \
    --max_steps=500 \
    --logging_steps=10 \
    --save_steps=100 \
    --per_device_train_batch_size=2 \
    --per_device_eval_batch_size=1 \
    --gradient_accumulation_steps=2 \
    --learning_rate=1e-4 \
    --lr_scheduler_type="cosine" \
    --warmup_steps=100 \
    --weight_decay=0.05 \
    --optim="paged_adamw_32bit" \
    --lora_target_modules "q_proj" "v_proj" \
    --bf16 \
    --remove_unused_columns=False \
    --max_seq_length 512 \
    --run_name="sft_mixtral" \
    --report_to=none \
    --use_habana \
    --use_lazy_mode

[screenshot: training output]

@lkk12014402 (Contributor Author)

Inference:

QUANT_CONFIG=./quantization_config/maxabs_measure.json python run_generation.py --model_name_or_path "mistralai/Mixtral-8x7B-Instruct-v0.1" --use_hpu_graphs --use_kv_cache --limit_hpu_graphs --bucket_size 128 --max_new_tokens 128 --batch_size 1 --bf16

QUANT_CONFIG=./quantization_config/maxabs_quant_mixtral.json python run_generation.py --model_name_or_path "mistralai/Mixtral-8x7B-Instruct-v0.1" --use_hpu_graphs --use_kv_cache --limit_hpu_graphs --bucket_size 128 --max_new_tokens 128 --batch_size 1 --bf16

[screenshot: quantized inference output]

python run_generation.py --model_name_or_path "mistralai/Mixtral-8x7B-Instruct-v0.1" --use_hpu_graphs --use_kv_cache --limit_hpu_graphs --bucket_size 128 --max_new_tokens 512 --batch_size 4 --bf16

[screenshot: bf16 inference output]

@libinta added the run-test label (Run CI for PRs from external contributors) on Dec 19, 2024
@regisss merged commit c8abbca into huggingface:v1.15-release on Dec 20, 2024
12010486 added a commit to 12010486/optimum-habana that referenced this pull request Dec 20, 2024
regisss pushed a commit that referenced this pull request Dec 23, 2024
zzhang37 pushed a commit to zzhang37/optimum-habana that referenced this pull request Jan 7, 2025
huijuanzh pushed a commit to huijuanzh/optimum-habana that referenced this pull request Jan 7, 2025
Liangyx2 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jan 20, 2025
xinyu-intel pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Mar 4, 2025
* Add flag to run inference with partial dataset (huggingface#1420)

* Add peft generation example (huggingface#1427)

* Upgrade to SynapseAI 1.18.0 (huggingface#1418)

* Simplify HQT config files (huggingface#1219)

* unify_measurements.py script support to unify PCQ 70B 8x (huggingface#1322)

* Add misc. training args (huggingface#1346)

* Add quantization config for low bs case (huggingface#1377)

* Remove HQT from OHF (huggingface#1257)

Co-authored-by: Adam Stachowicz <[email protected]>
Co-authored-by: Adam Stachowicz <[email protected]>
Co-authored-by: Yeonsil Yoon <[email protected]>

* Load INC GPTQ checkpoint & rename params (huggingface#1364)

Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: Harish Subramony <[email protected]>
Co-authored-by: Yeonsil Yoon <[email protected]>

* Enable FusedSDPA fp8 in Llama FT (huggingface#1388)

Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: Harish Subramony <[email protected]>

* Valid sequence length for sdpa (huggingface#1183)

Co-authored-by: Harish <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Co-authored-by: regisss <[email protected]>

* Multiple fixes (dynamo graph break, qwen-moe, multicard) (huggingface#1410)

* datasets downgrade version to 2.21.0 (huggingface#1413)

* Update ci sentence_transformer.sh (huggingface#1424)

* Fix load INC load weights compile error due to Transformer 4.45 upgrade.  (huggingface#1421)

* Update language-modeling README.md, add trust_remote_code for flan-t5-xl (huggingface#1422)

* Update unify_measurements.py support info (huggingface#1425)

* GPT2 torch.compile fix (huggingface#1434)

* Added missing allocate_kv_cache() call in CausalLM class (huggingface#1431)

* Fix merge error and update text-to-speech readme (huggingface#1436)

* Fix OOM error for code llama (huggingface#1437)

* Fix error on 4bit checkpoint load with run_lm_eval on TF4.45.2 (huggingface#1439)

* Fix scoped linear all-reduce for starcoder model (huggingface#1432)

* Fixed recursion error in SentenceTransformer (huggingface#1428)

* Fix Llama 3.1 generation (huggingface#1444)

* Update text-gen README.md to add auto-gptq fork install steps (huggingface#1442)

* Added gemma specific fp8 quantization file (huggingface#1445)

* Remove cache folder from image data folder (huggingface#1446)

Co-authored-by: regisss <[email protected]>

* Bump dev version

* Enable DeepSpeed for image-to-text example (huggingface#1455)

* Fix bug when loading 4bit checkpoint quantized in INC (huggingface#1447)

* Fixes 'Tokenizer does not have padding token' introduced by  huggingface#1444 for Llama3.1 (huggingface#1457)

* Fix facebook/hf-seamless-m4t-medium crash (huggingface#1433)

Signed-off-by: Wang, Yi A <[email protected]>

* Fix bias update in scoped all reduce (huggingface#1456)

* Added skip for unsuported tests for mistral/mixtral (huggingface#1462)

* Update sentence transformer to v3.2.1 (huggingface#1470)

* Optimized inference of Cohere model on HPU (huggingface#1329)

Signed-off-by: Ye, Xinyu <[email protected]>

* Idefics2 (huggingface#1270)

Signed-off-by: Wang, Yi A <[email protected]>

* Remove deprecated Mixed precision flags (huggingface#1471)

Change-Id: I1c2e2460dc2072ba7b311f239441b304694918c8

* Optimized inference of XGLM model on HPU (huggingface#1323)

Signed-off-by: Ye, Xinyu <[email protected]>

* Add mllama support (huggingface#1419)

Signed-off-by: Wang, Yi A <[email protected]>

* Enable flash attention for gemma (huggingface#1454)

* Readme: replace tabs with spaces (huggingface#1485)

* Move fast tests to Gaudi2 (huggingface#1498)

* Support loading 4 bit Qwen2 (huggingface#1476)

Signed-off-by: Mengni Wang <[email protected]>

* Add textual inversion XL for Gaudi (huggingface#868)

Signed-off-by: Daniel Socek <[email protected]>
Co-authored-by: Iman Gohari <[email protected]>

* Remove torch req from LM example (huggingface#1491)

* Remove keep_input_mutations (huggingface#1492)

* Fix trust_remote_code (huggingface#1493)

* Upgrade ViT README with torch.compile (huggingface#1494)

* Tests for text gen output text (huggingface#1411)

* Corrected Throughput measure for GaudiDDPMPipeline (huggingface#1460)

* Fix text generation test

* Add G3 in T5-L README (huggingface#1523)

* Fix tuple object error (huggingface#1354)

* Add warmup time and compile time log for the eval/prediction.  (huggingface#1489)

* Fix style

* Enable `paligemma` model for image-to-text example (huggingface#1407)

Signed-off-by: Liu, Kaixuan <[email protected]>
Co-authored-by: regisss <[email protected]>

* Add support for MLPERF optimized pipeline from example (huggingface#1465)

Co-authored-by: sushil dubey <[email protected]>

* Enable Gemma2 Inference on Gaudi (huggingface#1504)

Signed-off-by: Wang, Yi A <[email protected]>
Signed-off-by: Ye, Xinyu <[email protected]>
Signed-off-by: Mengni Wang <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
Co-authored-by: billishyahao <[email protected]>
Co-authored-by: Harish Subramony <[email protected]>
Co-authored-by: Yeonsil Yoon <[email protected]>
Co-authored-by: Seunghyuk Park (shepark) <[email protected]>
Co-authored-by: regisss <[email protected]>
Co-authored-by: Sun Choi <[email protected]>
Co-authored-by: xinhe <[email protected]>
Co-authored-by: Mohit Deopujari <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: Soila Kavulya <[email protected]>
Co-authored-by: Iman Gohari <[email protected]>
Co-authored-by: ZhengHongming888 <[email protected]>
Co-authored-by: XinyuYe-Intel <[email protected]>
Co-authored-by: Vivek Goel <[email protected]>
Co-authored-by: Akihiro Takahashi <[email protected]>
Co-authored-by: Miroslav Goncharenko <[email protected]>
Co-authored-by: Wang, Mengni <[email protected]>
Co-authored-by: Daniel Socek <[email protected]>
Co-authored-by: Adam Stachowicz <[email protected]>
Co-authored-by: Vidya Galli <[email protected]>
Co-authored-by: deepak-gowda-narayana <[email protected]>

* Add check_neural_compressor_min_version for 4 bit behavior (huggingface#1500)

Signed-off-by: Xin <[email protected]>
Signed-off-by: xinhe3 <[email protected]>
Co-authored-by: xinhe3 <[email protected]>

* Fixed Gemma FP8 flash_attention lower throughput issue (huggingface#1510)

* Pass "lazy_mode" arg to GaudiLlamaModel GaudiTrainer (huggingface#1515)

Co-authored-by: Marcin Łapiński <[email protected]>

* Removed workaround for NaN bug causing graph break. (huggingface#1516)

Co-authored-by: Marcin Łapiński <[email protected]>

* Disable default sdpa in Albert (#22) (huggingface#1517)

Co-authored-by: Urszula Golowicz <[email protected]>

* Implement fused sdpa for wav2vec2 (#18) (huggingface#1520)

* Memory optimization for gpt_bitcode (#4) (huggingface#1513)

Co-authored-by: Urszula Golowicz <[email protected]>

* text_generation: improve parameters check (huggingface#1527)

* transformers: fixed some typos (huggingface#1528)

* Update DeepSpeed CI baselines

* Update FSDP CI baseline

* Optimum-Habana docs re-org (huggingface#1488)

Signed-off-by: Daniel Socek <[email protected]>
Co-authored-by: Greg Serochi <[email protected]>
Co-authored-by: Kiangpeng Lau <[email protected]>
Co-authored-by: Seethong Vang <[email protected]>
Co-authored-by: regisss <[email protected]>
Co-authored-by: Anastasia Uvarova <[email protected]>
Co-authored-by: Mohit Deopujari <[email protected]>
Co-authored-by: Chen Levkovich <[email protected]>
Co-authored-by: Libin Tang <[email protected]>

* Makes the with_stack of the profiler changeable (huggingface#1497)

* FLUX with diffusers 0.31.0 (huggingface#1450)

Signed-off-by: Daniel Socek <[email protected]>
Co-authored-by: Baochen Yang <[email protected]>
Co-authored-by: Huijuan Zhou <[email protected]>
Co-authored-by: Sergey Plotnikov <[email protected]>
Co-authored-by: Deepak Narayana <[email protected]>
Co-authored-by: regisss <[email protected]>

* Fix some CI baselines

* Add split runners to CI (2 devices per runner for fast tests)

* Fix fast CI to work with split runners (huggingface#1534)

* Fix dtype issue with valid sequence length in torch.compile bs=1 (huggingface#1532)

* Support beam search with reuse_cache and bucket_internal (huggingface#1472)

* Add mixtral trl sft (huggingface#1349)

* Enable tiiuae/falcon-11B-vlm in image_to_text example (huggingface#1490)

Signed-off-by: Wang, Yi A <[email protected]>

* Add Llama 3.1 ft to CI (huggingface#1529)

* Migrate OH CLIP (roberta-clip) training to torch.compile (huggingface#1507)

* test_text_generation: fix non-Gaudi2 case (huggingface#1530)

* text-generation: improve output printing (huggingface#1486)

* Text-generation, model set-up: torch.compile for attributes instead of models' types (huggingface#1452)

* FLUX Fine-Tuning for Gaudi (huggingface#1482)

Signed-off-by: Daniel Socek <[email protected]>

* Enable fusedsdpa kernel for vision part of mllama (huggingface#1531)

Signed-off-by: Wang, Yi A <[email protected]>

* Minicpm enabling (huggingface#1342)

Signed-off-by: Daniel Huang <[email protected]>

* Fix bridgetower example (#312) (huggingface#1481)

* Migrate OH Wave2Vec-AC training to torch.compile - README update (huggingface#1537)

Co-authored-by: Chaojun Zhang <[email protected]>

* Flux Image-To-Image pipeline (huggingface#1524)

Signed-off-by: Daniel Socek <[email protected]>
Co-authored-by: Iman Gohari <[email protected]>

* Enable Falcon-mamba (huggingface#1480)

Signed-off-by: yuanwu <[email protected]>
Co-authored-by: regisss <[email protected]>

* Enable dynamic compile for mpi(training) (huggingface#1509)

* Migrate OH T5-large training to torch.compile (huggingface#1506)

* Add support for Baichuan2 (huggingface#1479)

Signed-off-by: Haihao Xiang <[email protected]>
Co-authored-by: Jianqian Zhou <[email protected]>
Co-authored-by: Wei Lin <[email protected]>

* trainer: fixed spelling (huggingface#1538)

* Create CI Eager/Lazy for Language Modeling (huggingface#1448)

* Fixes for llava-next test failures in 1.19 (huggingface#1535)

Co-authored-by: regisss <[email protected]>

* Enable DeepSeek-V2 (huggingface#1475)

Signed-off-by: Matrix YAO <[email protected]>

* Refactor Qwen2 Family (huggingface#1541)

* Add support for optimized SDXL pipeline (huggingface#1519)

* Make style

* Add the checkout parameters of falcon-mamba pytest (huggingface#1540)

Signed-off-by: yuanwu <[email protected]>
Co-authored-by: regisss <[email protected]>

* Avoid negative values in eval metrics (huggingface#1533)

* Fixes in unify_measurements (huggingface#1496)

Co-authored-by: yan tomsinsky <[email protected]>
Co-authored-by: Eran Geva <[email protected]>

* Fix lm_eval script for starcoder and gemma (huggingface#1463)

* Add option to use bf16 in PT sdp (#5) (huggingface#1514)

Co-authored-by: Urszula Golowicz <[email protected]>

* Fix tests.test_peft_inference failure (huggingface#1543)

Signed-off-by: Wang, Yi A <[email protected]>

* [wav2vec2] Remove tensor.item and dynamic slicing operations in the loop that cause graph break (huggingface#1508)

* Update lm_eval version (huggingface#1473)

Co-authored-by: regisss <[email protected]>

* Fix bad import in Baichuan code (huggingface#1547)

* Restore performance in generate (huggingface#1546)

Signed-off-by: Urszula Golowicz <[email protected]>
Co-authored-by: Marcin Łapiński <[email protected]>
Co-authored-by: Adam Stachowicz <[email protected]>

* Enable pyTorch-IMage-Models (TIMM) with HPUs (huggingface#1459)

Co-authored-by: regisss <[email protected]>

* Add HF login for 8x Gaudi2 CI

* Adding support for Context Parallelism using Deepseed's DistributedAttention (huggingface#1501)

Co-authored-by: regisss <[email protected]>

* Fix Llama CI

* Fix Llama CI

* Add DynamicMoE support for Mixtral (huggingface#1511)

Co-authored-by: Adam Stachowicz <[email protected]>

* Fix for llava models not generating text with test failures in 1.19 (huggingface#1548)

* Refactor KV cache, Rope  , reduce common code  (huggingface#1148)

Co-authored-by: regisss <[email protected]>

* Adjust Qwen2-7B test case (huggingface#1551)

* [run_lm_eval.py] Fixed too many print dump json info (huggingface#1553)

Signed-off-by: Focus Luo <[email protected]>

* Fix for single_card llama7b and falcon40b CI errors (huggingface#1549)

* Implemented fusedSDPA for stable diffusion (#36) (huggingface#1545)

Co-authored-by: Yixiu Chen <[email protected]>
Co-authored-by: Libin Tang <[email protected]>

* Apply --sdp_on_bf16 to image-to-text examples (huggingface#1557)

* Fix accuracy regression in Gemma (huggingface#1556)

* Fix FusedSDPA wrapper from TransformerEngine (huggingface#1562)

* Run albert-xxlarge-v1 CI as torch.compile mode (huggingface#1563)

* Update README commands for the models to use --sdp_on_bf16 (huggingface#1566)

* Minicpm patch (huggingface#1567)

Signed-off-by: Daniel Huang <[email protected]>

* Updated gemma_2b_it CI (huggingface#1561)

Co-authored-by: regisss <[email protected]>

* Fixed Adalora Test for OH 1.15 (huggingface#1564)

* Fixed LORACP Test for OH 1.15 (huggingface#1568)

* Add requirements.txt

* Update the baseline for 1.18 to reflect performance in 1.19 (huggingface#1571)

* Fix prefix llama ci failure (huggingface#1570)

Signed-off-by: Wang, Yi A <[email protected]>

* fusedsdpa for stable diffusion xl (huggingface#1565)

Co-authored-by: regisss <[email protected]>

* Add sdp_on_bf16 to tests,text-gen (huggingface#1559)

* Fix mllama test (huggingface#1569)

Signed-off-by: Wang, Yi A <[email protected]>

* Fix lazy_mode assignment (huggingface#1558)

Co-authored-by: Yaser Afshar <[email protected]>

* Fix diffusers import (huggingface#1574)

* Update README commands for more models to use --sdp_on_bf16 (huggingface#1575)

Co-authored-by: Libin Tang <[email protected]>

* Generation utils update (minor) (huggingface#1468)

* style: removed tabs (huggingface#1577)

* Add chatglm (huggingface#1478)

Co-authored-by: Wei Lin <[email protected]>
Co-authored-by: Jianqian Zhou <[email protected]>
Co-authored-by: Leo Zhao <[email protected]>

* Enable num_return_sequences in beam search (huggingface#1536)

* gpt_bigcode: added internal bucketing fix (huggingface#1526)

* Update the Gaudi trainer with transformers 4.45.2 (huggingface#1398)

* Revert "add check_neural_compressor_min_version for 4 bit behavior" (huggingface#1578)

* Revert PR huggingface#1473 (huggingface#1582)

* Remove deprecated env variables

* Add sdp_on_bf16 argument to CI for run_image2text_lora_finetune and a… (huggingface#1585)

* Remove unnecessary neural compressor fix for 1.19 release (huggingface#1584)

* Make style

* Fixed spelling (huggingface#1576)

* Update docs for baichuan2 training (huggingface#1586)

* Adjust bert and roberta targets (huggingface#1588)

* Update text-gen readme for autogptq (huggingface#1589)

* Update README to Include Information on Performance Degradation and Mitigation Options (huggingface#1555)

* Fix Accuracy Calculation Issue in GPT-NeoX (huggingface#1591)

* Readme update for llama-405B (huggingface#1587)

Co-authored-by: Mohit Sinha <[email protected]>
Co-authored-by: Seunghyuk Park (shepark) <[email protected]>
Co-authored-by: regisss <[email protected]>

* Add WA flag for falcon-180b to resolve text-gen critical reset error during tests (huggingface#1590)

* Add sdp_on_bf16 option to diffusers and image/audio classicifation tests (huggingface#1592)

* Update transformers tests generation util v4.45.2 (huggingface#1441)

Co-authored-by: Gustavo <gustavo.malkomes>
Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: regisss <[email protected]>

* Update README.md (huggingface#1595)

* Limit position embeddings in inference (huggingface#1598)

Co-authored-by: Adam Stachowicz <[email protected]>

* Verify model output is provided when check_output is enabled (huggingface#1597)

* Fix scikit-learn to 1.5.2 to fix f1 evaluation crash in 1.6.0 (huggingface#1596)

Signed-off-by: Wang, Yi A <[email protected]>

* Revert common KVCache not to check token_idx (huggingface#1594)

* Update language-modeling README file (huggingface#1599)

Co-authored-by: Libin Tang <[email protected]>
Co-authored-by: regisss <[email protected]>

* Update readme for audio-classification example (huggingface#1602)

* SDPA flag update - static code analysis (huggingface#1601)

* Remove unwanted merged changes in SD pipeline

* Revert LlamaKVCache due to memory increase (huggingface#1605)

* Check rope_scaling attr (huggingface#1609)

* skip certain tests for G1 with empty param list (huggingface#1613)

* Revert "Update transformers tests generation util v4.45.2 (huggingface#1441)" (huggingface#1614)

This reverts commit 2ba520a.

* audio classification readme update (huggingface#1604)

* fix readme cmds for clip-roberta (huggingface#1603)

* fix readme cmds for clip-roberta

* comments and cleanup

* Fix run_generation test commands for TRL out usage example (huggingface#1624)

Fix run_generation example

* Add arbitrary scales (#15) (huggingface#1625)

Co-authored-by: Linoy Buchnik <[email protected]>

* Modify Qwen2 TRL command to avoid OOM.  (huggingface#1630)

Add --use_flash_attention to avoid OOM for Qwen2

* Replace the UNET custom attention processors (huggingface#1608)

Co-authored-by: Iman Gohari <[email protected]>

* Falcon Model Support (huggingface#1612)

Co-authored-by: leopck <[email protected]>
Co-authored-by: regisss <[email protected]>

* Update sdp_on_bf16 option for ST example (huggingface#1615)

* Update save lora weights for diffusers with text_encoder_2 layers (huggingface#1626)

* Fix `save_lora_weights` in `pipeline_utils.py` (huggingface#1643)

* Refactor mixtral moe block. (huggingface#1635)

* speech-recognition: downgrade datasets version (huggingface#1646)

* add sdp_on_bf16 to controlnet (huggingface#1631)

* add sdp_on_bf16 to controlnet

* Update pipeline_controlnet.py

pass sdp_on_bf16 to controlnet_pipeline

* Update text_to_image_generation.py

* Update text_to_image_generation.py

* Quick fix for quantization/custom op list loading (huggingface#1657)

Signed-off-by: Daniel Socek <[email protected]>

* Update multi-node test dockerfile (huggingface#1662)

* Fixes on OH 1.15 pre release (huggingface#1661)

Co-authored-by: regisss <[email protected]>

* Fix distributed issue for ST Trainer (huggingface#1649)

* Fix distributed issue for timm (huggingface#1653)

Co-authored-by: regisss <[email protected]>

* Added missing parameter for llama function call (huggingface#1663)

Co-authored-by: Libin Tang <[email protected]>

* Add reuse_cache for llama3-405b measurement (huggingface#1664)

* Update EFA dockerfile to SynapseAI 1.19.0 (huggingface#1665)

Co-authored-by: Libin Tang <[email protected]>

* Fix bug for GaudiMixtralAttentionLongSequence forward (huggingface#1650)

Signed-off-by: kaixuanliu <[email protected]>

* Update to SynapseAI v1.19

* Release: v1.15.0

* Fix style

* save_model - incorrect conflict resolution

* Fix style

---------

Signed-off-by: Wang, Yi A <[email protected]>
Signed-off-by: Ye, Xinyu <[email protected]>
Signed-off-by: Mengni Wang <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
Signed-off-by: Liu, Kaixuan <[email protected]>
Signed-off-by: Xin <[email protected]>
Signed-off-by: xinhe3 <[email protected]>
Signed-off-by: Daniel Huang <[email protected]>
Signed-off-by: yuanwu <[email protected]>
Signed-off-by: Haihao Xiang <[email protected]>
Signed-off-by: Matrix YAO <[email protected]>
Signed-off-by: Urszula Golowicz <[email protected]>
Signed-off-by: Focus Luo <[email protected]>
Signed-off-by: kaixuanliu <[email protected]>
Co-authored-by: Pramod Kumar <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: regisss <[email protected]>
Co-authored-by: Roi Tiefenbrunn <[email protected]>
Co-authored-by: Yan Tomsinsky <[email protected]>
Co-authored-by: Konrad Drozd <[email protected]>
Co-authored-by: Uri Livne <[email protected]>
Co-authored-by: Yeonsil Yoon <[email protected]>
Co-authored-by: Danny Semiat <[email protected]>
Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: Harish Subramony <[email protected]>
Co-authored-by: Piotr Bielak <[email protected]>
Co-authored-by: Sayantan Sarkar <[email protected]>
Co-authored-by: Harish <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Co-authored-by: ZhengHongming888 <[email protected]>
Co-authored-by: Jimin Ha <[email protected]>
Co-authored-by: Seunghyuk Park (shepark) <[email protected]>
Co-authored-by: Dmitry <[email protected]>
Co-authored-by: Soila Kavulya <[email protected]>
Co-authored-by: Sun Choi <[email protected]>
Co-authored-by: xinhe <[email protected]>
Co-authored-by: Mohit Deopujari <[email protected]>
Co-authored-by: Iman Gohari <[email protected]>
Co-authored-by: XinyuYe-Intel <[email protected]>
Co-authored-by: Vivek Goel <[email protected]>
Co-authored-by: Akihiro Takahashi <[email protected]>
Co-authored-by: Miroslav Goncharenko <[email protected]>
Co-authored-by: Wang, Mengni <[email protected]>
Co-authored-by: Daniel Socek <[email protected]>
Co-authored-by: Vidya Galli <[email protected]>
Co-authored-by: deepak-gowda-narayana <[email protected]>
Co-authored-by: Supreet Singh <[email protected]>
Co-authored-by: kaixuanliu <[email protected]>
Co-authored-by: ANSHUMAN TRIPATHY <[email protected]>
Co-authored-by: sushil dubey <[email protected]>
Co-authored-by: Luca Calabria <[email protected]>
Co-authored-by: billishyahao <[email protected]>
Co-authored-by: xinhe3 <[email protected]>
Co-authored-by: KP (Edwin) Lau <[email protected]>
Co-authored-by: Marcin Łapiński <[email protected]>
Co-authored-by: Urszula Golowicz <[email protected]>
Co-authored-by: Greg Serochi <[email protected]>
Co-authored-by: Seethong Vang <[email protected]>
Co-authored-by: Anastasia Uvarova <[email protected]>
Co-authored-by: Mohit Deopujari <[email protected]>
Co-authored-by: Chen Levkovich <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Co-authored-by: ranzhejiang <[email protected]>
Co-authored-by: Baochen Yang <[email protected]>
Co-authored-by: Huijuan Zhou <[email protected]>
Co-authored-by: Sergey Plotnikov <[email protected]>
Co-authored-by: Deepak Narayana <[email protected]>
Co-authored-by: Witold Szczurek <[email protected]>
Co-authored-by: Wei Lin <[email protected]>
Co-authored-by: lkk <[email protected]>
Co-authored-by: Chaojun Zhang <[email protected]>
Co-authored-by: Daniel Huang <[email protected]>
Co-authored-by: Yuan Wu <[email protected]>
Co-authored-by: Xiang, Haihao <[email protected]>
Co-authored-by: Jianqian Zhou <[email protected]>
Co-authored-by: Wei Lin <[email protected]>
Co-authored-by: Thanaji Rao Thakkalapelli <[email protected]>
Co-authored-by: Yao Matrix <[email protected]>
Co-authored-by: yan tomsinsky <[email protected]>
Co-authored-by: Eran Geva <[email protected]>
Co-authored-by: Alexey Belyakov <[email protected]>
Co-authored-by: Bhargav <[email protected]>
Co-authored-by: Krzysztof Wiśniewski <[email protected]>
Co-authored-by: Abhilash Majumder <[email protected]>
Co-authored-by: FocusLuo <[email protected]>
Co-authored-by: Yixiu Chen <[email protected]>
Co-authored-by: Nariman Piroozan <[email protected]>
Co-authored-by: Edward Mascarenhas <[email protected]>
Co-authored-by: Shiv Kaul <[email protected]>
Co-authored-by: bmengke <[email protected]>
Co-authored-by: Leo Zhao <[email protected]>
Co-authored-by: Mohit Sinha <[email protected]>
Co-authored-by: Harshvardhan Chauhan <[email protected]>
Co-authored-by: Gustavo Malkomes <[email protected]>
Co-authored-by: Linoy Buchnik <[email protected]>
Co-authored-by: Alexey Fadeev <[email protected]>
Co-authored-by: leopck <[email protected]>