[Bug]: ERROR 03-11 07:47:00 [engine.py:141] AttributeError: Invalid attention type encoder_only #14583

Description

@gigascake

Your current environment

The output of `python collect_env.py`:

INFO 03-11 07:49:54 [__init__.py:256] Automatically detected platform rocm.
Collecting environment information...
PyTorch version: 2.7.0.dev20250309+rocm6.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.3.42131-fa1d09cbd

OS: Fedora Linux 40 (Server Edition) (x86_64)
GCC version: (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3)
Clang version: 18.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-6.3.3 25012 e5bf7e55c91490b07c49d8960fa7983d864936c4)
CMake version: version 3.31.2
Libc version: glibc-2.39

Python version: 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:31:09) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.13.5-100.fc40.x86_64-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Instinct MI100 (gfx908:sramecc+:xnack-)
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.3.42131
MIOpen runtime version: 3.3.0
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Vendor ID: AuthenticAMD
Model name: AMD EPYC 7C13 64-Core Processor
CPU family: 25
Model: 1
Thread(s) per core: 1
Core(s) per socket: 64
Socket(s): 2
Stepping: 1
Frequency boost: enabled
CPU(s) scaling MHz: 45%
CPU max MHz: 3720.0000
CPU min MHz: 400.0000
BogoMIPS: 4000.34
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin brs arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap
Virtualization: AMD-V
L1d cache: 4 MiB (128 instances)
L1i cache: 4 MiB (128 instances)
L2 cache: 64 MiB (128 instances)
L3 cache: 512 MiB (16 instances)
NUMA node(s): 2
NUMA node0 CPU(s): 0-63
NUMA node1 CPU(s): 64-127
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; Safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] nvidia-cublas-cu12==12.4.5.8
[pip3] nvidia-cuda-cupti-cu12==12.4.127
[pip3] nvidia-cuda-nvrtc-cu12==12.4.127
[pip3] nvidia-cuda-runtime-cu12==12.4.127
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu12==11.2.1.3
[pip3] nvidia-curand-cu12==10.3.5.147
[pip3] nvidia-cusolver-cu12==11.6.1.9
[pip3] nvidia-cusparse-cu12==12.3.1.170
[pip3] nvidia-cusparselt-cu12==0.6.2
[pip3] nvidia-ml-py==12.560.30
[pip3] nvidia-nccl-cu12==2.21.5
[pip3] nvidia-nvjitlink-cu12==12.4.127
[pip3] nvidia-nvtx-cu12==12.4.127
[pip3] optree==0.14.1
[pip3] pytorch-triton-rocm==3.2.0+git4b3bb1f8
[pip3] pyzmq==26.2.1
[pip3] torch==2.7.0.dev20250309+rocm6.3
[pip3] torchaudio==2.6.0.dev20250309+rocm6.3
[pip3] torchvision==0.22.0.dev20250309+rocm6.3
[pip3] transformers==4.49.0
[pip3] triton==3.0.0
[conda] numpy 1.26.4 pypi_0 pypi
[conda] nvidia-cublas-cu12 12.4.5.8 pypi_0 pypi
[conda] nvidia-cuda-cupti-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-cuda-nvrtc-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-cuda-runtime-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-cudnn-cu12 9.1.0.70 pypi_0 pypi
[conda] nvidia-cufft-cu12 11.2.1.3 pypi_0 pypi
[conda] nvidia-curand-cu12 10.3.5.147 pypi_0 pypi
[conda] nvidia-cusolver-cu12 11.6.1.9 pypi_0 pypi
[conda] nvidia-cusparse-cu12 12.3.1.170 pypi_0 pypi
[conda] nvidia-cusparselt-cu12 0.6.2 pypi_0 pypi
[conda] nvidia-ml-py 12.560.30 pypi_0 pypi
[conda] nvidia-nccl-cu12 2.21.5 pypi_0 pypi
[conda] nvidia-nvjitlink-cu12 12.4.127 pypi_0 pypi
[conda] nvidia-nvtx-cu12 12.4.127 pypi_0 pypi
[conda] optree 0.14.1 pypi_0 pypi
[conda] pytorch-triton-rocm 3.2.0+git4b3bb1f8 pypi_0 pypi
[conda] pyzmq 26.2.1 pypi_0 pypi
[conda] torch 2.7.0.dev20250309+rocm6.3 pypi_0 pypi
[conda] torchaudio 2.6.0.dev20250309+rocm6.3 pypi_0 pypi
[conda] torchvision 0.22.0.dev20250309+rocm6.3 pypi_0 pypi
[conda] transformers 4.49.0 pypi_0 pypi
[conda] triton 3.0.0 pypi_0 pypi
ROCM Version: 6.3.42134-a9a80e791
Neuron SDK Version: N/A
vLLM Version: 0.7.4.dev340+gdc74613f.d20250310
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
============================ ROCm System Management Interface ============================
================================ Weight between two GPUs =================================
GPU0 GPU1
GPU0 0 40
GPU1 40 0

================================= Hops between two GPUs ==================================
GPU0 GPU1
GPU0 0 2
GPU1 2 0

=============================== Link Type between two GPUs ===============================
GPU0 GPU1
GPU0 0 PCIE
GPU1 PCIE 0

======================================= Numa Nodes =======================================
GPU[0] : (Topology) Numa Node: 1
GPU[0] : (Topology) Numa Affinity: 1
GPU[1] : (Topology) Numa Node: 1
GPU[1] : (Topology) Numa Affinity: 1
================================== End of ROCm SMI Log ===================================

PYTORCH_ROCM_ARCH=gfx908
VLLM_TARGET_DEVICE=rocm
LD_LIBRARY_PATH=/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/cv2/../../lib64:/usr/lib64:/usr/local/lib:/opt/rocm-6.3.3/lib:
PYTORCH_ROCM_DISABLE_HIPBLASLT=1
VLLM_USE_TRITON_FLASH_ATTN=0
NCCL_CUMEM_ENABLE=0
TORCHINDUCTOR_COMPILE_THREADS=1
CUDA_MODULE_LOADING=LAZY

🐛 Describe the bug

ERROR 03-11 07:47:00 [engine.py:141] AttributeError('Invalid attention type encoder_only')
ERROR 03-11 07:47:00 [engine.py:141] Traceback (most recent call last):
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/engine/multiprocessing/engine.py", line 139, in start
ERROR 03-11 07:47:00 [engine.py:141] self.run_engine_loop()
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/engine/multiprocessing/engine.py", line 202, in run_engine_loop
ERROR 03-11 07:47:00 [engine.py:141] request_outputs = self.engine_step()
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/engine/multiprocessing/engine.py", line 228, in engine_step
ERROR 03-11 07:47:00 [engine.py:141] raise e
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/engine/multiprocessing/engine.py", line 211, in engine_step
ERROR 03-11 07:47:00 [engine.py:141] return self.engine.step()
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/engine/llm_engine.py", line 1407, in step
ERROR 03-11 07:47:00 [engine.py:141] outputs = self.model_executor.execute_model(
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/executor/executor_base.py", line 139, in execute_model
ERROR 03-11 07:47:00 [engine.py:141] output = self.collective_rpc("execute_model",
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 03-11 07:47:00 [engine.py:141] answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/utils.py", line 2238, in run_method
ERROR 03-11 07:47:00 [engine.py:141] return func(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/worker/worker_base.py", line 420, in execute_model
ERROR 03-11 07:47:00 [engine.py:141] output = self.model_runner.execute_model(
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 03-11 07:47:00 [engine.py:141] return func(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/worker/pooling_model_runner.py", line 111, in execute_model
ERROR 03-11 07:47:00 [engine.py:141] hidden_or_intermediate_states = model_executable(
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
ERROR 03-11 07:47:00 [engine.py:141] return self._call_impl(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
ERROR 03-11 07:47:00 [engine.py:141] return forward_call(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/model_executor/models/bert.py", line 414, in forward
ERROR 03-11 07:47:00 [engine.py:141] return self.model(input_ids=input_ids,
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
ERROR 03-11 07:47:00 [engine.py:141] return self._call_impl(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
ERROR 03-11 07:47:00 [engine.py:141] return forward_call(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/model_executor/models/bert.py", line 349, in forward
ERROR 03-11 07:47:00 [engine.py:141] return self.encoder(hidden_states)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/compilation/decorators.py", line 172, in __call__
ERROR 03-11 07:47:00 [engine.py:141] return self.forward(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/model_executor/models/bert.py", line 119, in forward
ERROR 03-11 07:47:00 [engine.py:141] hidden_states = layer(hidden_states)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
ERROR 03-11 07:47:00 [engine.py:141] return self._call_impl(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
ERROR 03-11 07:47:00 [engine.py:141] return forward_call(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/model_executor/models/bert.py", line 154, in forward
ERROR 03-11 07:47:00 [engine.py:141] attn_output = self.attention(hidden_states)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
ERROR 03-11 07:47:00 [engine.py:141] return self._call_impl(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
ERROR 03-11 07:47:00 [engine.py:141] return forward_call(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/model_executor/models/bert.py", line 188, in forward
ERROR 03-11 07:47:00 [engine.py:141] self_output = self.self(hidden_states)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
ERROR 03-11 07:47:00 [engine.py:141] return self._call_impl(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
ERROR 03-11 07:47:00 [engine.py:141] return forward_call(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/model_executor/models/bert.py", line 243, in forward
ERROR 03-11 07:47:00 [engine.py:141] output = self.attn(q, k, v)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
ERROR 03-11 07:47:00 [engine.py:141] return self._call_impl(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
ERROR 03-11 07:47:00 [engine.py:141] return forward_call(*args, **kwargs)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/attention/layer.py", line 225, in forward
ERROR 03-11 07:47:00 [engine.py:141] return torch.ops.vllm.unified_attention(
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/data01/anaconda3/envs/vllm/lib/python3.12/site-packages/torch/_ops.py", line 1158, in __call__
ERROR 03-11 07:47:00 [engine.py:141] return self._op(*args, **(kwargs or {}))
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/attention/layer.py", line 331, in unified_attention
ERROR 03-11 07:47:00 [engine.py:141] return self.impl.forward(self, query, key, value, kv_cache, attn_metadata)
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/attention/backends/rocm_flash_attn.py", line 667, in forward
ERROR 03-11 07:47:00 [engine.py:141] causal_mask) = _get_seq_len_block_table_args(
ERROR 03-11 07:47:00 [engine.py:141] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-11 07:47:00 [engine.py:141] File "/home/radmin/gitlab/vllm/vllm/attention/backends/rocm_flash_attn.py", line 421, in _get_seq_len_block_table_args
ERROR 03-11 07:47:00 [engine.py:141] raise AttributeError(f"Invalid attention type {str(attn_type)}")
ERROR 03-11 07:47:00 [engine.py:141] AttributeError: Invalid attention type encoder_only
...
INFO 03-11 07:47:00 [logger.py:39] Received request embd-57e89ad988d14a9894249ce553985904-0: prompt: 'Thaa های тоlen loten diている oड्या suaલા16 ju ме in ac4%õ gia למ செல்லጉ gia למDEESTAכך sind高ronani till els sua су 20 jaunung r habari:1000larирање í është nuóła tubuha බව tillσεa är dekora های то mássang ги habari:1000e pour X可 habari:1000í های4 ke And Gι-им under pesquisa گفتπληf το viva jenαць nu vi sua untuk Si като ж sunt r inteیل nu меня vi敬 çoxité í化 keLA video پاکستان2和城 З R-имego pesquisa گفتπληf το viva მაць nu vi sua untuk Si като ж sunt r inteیل nuadi ; vi敬 çox програм2和城 З 8 եւимവര് bod” روی τοва न ke Leroep degreette 8 bona敬tima2 15 למ estasге可五eσεa های то más MV nu viство dụnge ved Gο asi ke技术ı sua aðı可 frálar Д तक和 အ кога vi可 frá keимتى wyborબကိုS למDEnet pasJAड्या apstkelA pasJAड्या apqa pesquisa گفت eनेa پاکستان pesquisa گفت e Toaint bod eő://kel ಇಲ್ಲ 그第二 Glückמיילa hafaට preocupaמייל так Також e`a還有 még को retzaο Le vilበል una triება ihana ärı可 fráe ved сува más viим uygun”ʻ nu viим歐 wybor dari keyn vi suaى frá ke NUe Wij团a viимego pesquisa گفت” fånuಲಿ Fed vi NU 문제им не” nuим歐تtigeσεa های اصلی vi suangимego pesquisa گفت e To”نLAበል ärим не e tri”ება ihan’f τοابון2 lo काम Anton部署njanı vi fărălarand vi çox perheधे videotul програмovi นาย Glücka nhưngට preocupaA suaseid үндэсний many第二 Glückמיילa hafaට preocupaמייל так ta e tri://եց всеот นาย Glücka nhưngට preocupaAEtgrip اشاره 그 นาย Glücka nhưngට preocupa دی', params: PoolingParams(additional_metadata=None), prompt_token_ids: [5971, 11, 584, 690, 1977, 459, 510, 45, 7826, 36, 50207, 1646, 9725, 2485, 1129, 3928, 23, 1030, 11267, 4340, 3529, 7183, 80472, 11959, 3529, 7183, 8399, 55669, 77432, 1276, 1395, 1900, 2628, 570, 1115, 1646, 649, 387, 33519, 449, 1690, 17508, 12, 14105, 320, 21470, 439, 1396, 315, 41050, 11, 13931, 11, 5099, 570, 5810, 11, 369, 40075, 11, 584, 690, 1005, 433, 449, 1670, 17508, 12, 14105, 13, 578, 1193, 1403, 17508, 12, 14105, 430, 584, 617, 311, 3493, 527, 1473, 9, 1595, 1379, 
31639, 5228, 45722, 420, 374, 279, 330, 7349, 1445, 3321, 1, 315, 279, 1646, 482, 602, 1770, 2637, 1268, 1690, 892, 7504, 315, 3925, 279, 30828, 4009, 5097, 439, 1988, 311, 8356, 1202, 2612, 304, 264, 4741, 1522, 627, 9, 1595, 3081, 31639, 5228, 45722, 420, 374, 279, 330, 13741, 3321, 1, 315, 279, 1646, 482, 602, 1770, 2637, 1268, 1690, 892, 7504, 315, 3938, 2819, 279, 30828, 4009, 16674, 304, 264, 4741, 1522, 382, 791, 1595, 11719, 4486, 63, 5852, 374, 1120, 1618, 311, 636, 53823, 79385, 3135, 382, 13622, 30828, 14488, 304, 423, 7183, 1397, 1521, 1403, 5137, 13, 5810, 11, 584, 690, 1005, 66160, 315, 279, 3280, 2786, 13, 1226, 527, 1457, 5644, 311, 5052, 1057, 1646, 389, 1057, 1403, 4101, 320, 1729, 7231, 264, 1160, 8649, 279, 1403, 4101, 311, 1595, 6410, 55358, 7887, 1527, 294, 7183, 8399, 1179, 452, 11855, 50207, 1747, 271, 2590, 284, 452, 11855, 50207, 1747, 5498, 31639, 5228, 28, 1187, 11, 2612, 31639, 5228, 28, 717, 11, 4288, 4486, 28, 2983, 696, 2590, 21529, 2625, 10613, 51927, 62815, 11, 5542, 722, 34263, 62815, 1145, 40446, 28, 1135, 11, 14008, 3702, 629, 10267, 596, 1457, 636, 1063, 51165, 220, 1927, 4038, 8469, 11, 369, 1057, 1403, 4101, 13, 1226, 649, 1120, 1005, 279, 1595, 20473, 63, 5811, 315, 279, 1595, 35798, 55358, 734, 311, 3371, 279, 1646, 902, 4101, 311, 18057, 13, 13516, 18007, 11, 279, 1595, 3081, 31639, 5228, 63, 1587, 539, 6089, 80799, 279, 18057, 35174, 1595, 77, 63, 315, 1595, 35798, 368, 29687, 5810, 11, 584, 16572, 279, 1646, 449, 1595, 3081, 31639, 5228, 28, 717, 63, 323, 8356, 51165, 369, 1595, 77, 28, 1927, 63, 4038, 8469, 26, 420, 374, 5042, 2884, 304, 459, 3313, 33263, 49053, 1648, 4920, 279, 16451, 320, 2940, 279, 4009, 53947, 60606, 1202, 3766, 16674, 3677, 24361, 51927, 11, 4255, 722, 34263, 284, 1646, 24706, 71951, 5941, 10613, 51927, 62815, 11, 5542, 722, 34263, 62815, 1145, 308, 28, 1927, 696, 2, 5569, 1203, 512, 24361, 51927, 11, 4255, 722, 34263, 284, 69824, 77663, 18956, 2625, 24361, 51927, 11, 4255, 722, 34263, 2526], 
lora_request: None, prompt_adapter_request: None.
CRITICAL 03-11 07:47:00 [launcher.py:116] MQLLMEngine is already dead, terminating server process
INFO: 192.168.50.87:49810 - "POST /v1/embeddings HTTP/1.1" 500 Internal Server Error
