Merged (changes from 2 commits)
vllm/model_executor/models/config.py (1 addition, 1 deletion)

@@ -406,7 +406,7 @@ def verify_and_update_config(cls, vllm_config: "VllmConfig") -> None:
 # easily by changing the way we layout chunks in the
 # mamba2 kernels.

-base_chunk_size = model_config.get_mamba_chunk_size()
+base_chunk_size = mamba_block_size or model_config.get_mamba_chunk_size()
 attn_tokens_per_mamba_state = cdiv(mamba_page_size, attn_page_size_1_token)
 chunk_size = lcm(base_chunk_size, kernel_block_alignment_size)
 attn_block_size = chunk_size * cdiv(attn_tokens_per_mamba_state, chunk_size)
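For readers without the surrounding file, a hedged sketch of what this arithmetic does. All numeric values are invented for illustration, and fallback_chunk_size stands in for model_config.get_mamba_chunk_size(); only the cdiv/lcm helpers mirror the real code:

import math

def cdiv(a: int, b: int) -> int:
    # Ceiling division, matching the cdiv helper used in the snippet above.
    return -(-a // b)

# Illustrative inputs (assumptions, not values from the PR):
mamba_block_size = 512            # explicit mamba block size; may be None
fallback_chunk_size = 256         # stand-in for model_config.get_mamba_chunk_size()
kernel_block_alignment_size = 16
mamba_page_size = 1_000_000       # bytes per mamba state page (made up)
attn_page_size_1_token = 4_096    # bytes of attention KV cache per token (made up)

# The changed line: prefer an explicit mamba block size, else the model default.
base_chunk_size = mamba_block_size or fallback_chunk_size

attn_tokens_per_mamba_state = cdiv(mamba_page_size, attn_page_size_1_token)   # 245
chunk_size = math.lcm(base_chunk_size, kernel_block_alignment_size)           # 512
# Round up so one attention block covers at least one mamba state's worth of
# tokens while staying a multiple of the chunk size.
attn_block_size = chunk_size * cdiv(attn_tokens_per_mamba_state, chunk_size)  # 512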
vllm/v1/core/sched/scheduler.py (1 addition, 1 deletion)

@@ -93,7 +93,7 @@
 )

 connector_vllm_config = copy.copy(self.vllm_config)
-connector_vllm_config.kv_cache_config = copy.copy(kv_cache_config)
+connector_vllm_config.cache_config = copy.copy(kv_cache_config)

Check failure on line 96 in vllm/v1/core/sched/scheduler.py (GitHub Actions / pre-commit):
Incompatible types in assignment (expression has type "KVCacheConfig", variable has type "CacheConfig") [assignment]
Contributor (critical):

This change appears to introduce a type error. The connector_vllm_config is an instance of VllmConfig, which has a cache_config attribute of type CacheConfig. The kv_cache_config variable is of type KVCacheConfig. These two types are not compatible.

Assigning kv_cache_config to connector_vllm_config.cache_config will likely cause AttributeError exceptions downstream in any code that expects a CacheConfig object, as their attributes are different. For example, CacheConfig has block_size and cache_dtype, while KVCacheConfig has num_blocks and kv_cache_groups.

The previous code connector_vllm_config.kv_cache_config = copy.copy(kv_cache_config) was dynamically adding an attribute, which is allowed but might have been flagged by a linter. If the connector expects a kv_cache_config attribute, the previous implementation was functionally correct. This change seems to fix a linting issue by introducing a runtime bug.

I suggest reverting this change and potentially adding a # type: ignore or # noqa to address the linting warning if that was the original problem.

Suggested change:
- connector_vllm_config.cache_config = copy.copy(kv_cache_config)
+ connector_vllm_config.kv_cache_config = copy.copy(kv_cache_config)
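To see concretely why the reviewer flags this as a runtime hazard, a toy sketch: the attribute names are taken from the comment above, but the classes themselves are invented stand-ins, not vLLM's real config types.

from dataclasses import dataclass, field

@dataclass
class CacheConfigLike:                 # stand-in for CacheConfig
    block_size: int = 16
    cache_dtype: str = "auto"

@dataclass
class KVCacheConfigLike:               # stand-in for KVCacheConfig
    num_blocks: int = 1024
    kv_cache_groups: list = field(default_factory=list)

# Mimics the effect of connector_vllm_config.cache_config = copy.copy(kv_cache_config):
cache_config = KVCacheConfigLike()

try:
    # Downstream code that expects a CacheConfig-shaped object breaks here.
    print(cache_config.block_size)
except AttributeError as exc:
    print(f"AttributeError downstream: {exc}")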

Member Author:

@KuntaiDu any ideas what's going on here?

self.connector = KVConnectorFactory.create_connector(
config=connector_vllm_config, role=KVConnectorRole.SCHEDULER
)
@@ -1335,7 +1335,7 @@
    assert len(self.kv_cache_config.kv_cache_groups) == 1
    return self.connector.request_finished(request, block_ids[0])
else:
    return self.connector.request_finished(request, block_ids)

Check failure on line 1338 in vllm/v1/core/sched/scheduler.py (GitHub Actions / pre-commit):
Argument 2 to "request_finished" of "KVConnectorBase_V1" has incompatible type "tuple[list[int], ...]"; expected "list[int]" [arg-type]
Contributor:
Suggested change:
return self.connector.request_finished(request, block_ids)  # type: ignore[attr-defined]

Should be able to just ignore the type check here; this line will not be hit in the current state (no connector implements the HMA interface).

For future reference, I think request_finished_all_groups should be called here, as it is defined in the SupportHMA interface and has the correct function signature.

Member Author:
Switched to request_finished_all_groups
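A hedged sketch of what that switch might look like. The protocol below is invented for illustration; the request_finished_all_groups signature is only assumed from the reviewer's note and the mypy error above (per-group block IDs as tuple[list[int], ...]).

from typing import Any, Protocol

class ConnectorLike(Protocol):
    # Single-group hook: takes the block IDs of one KV cache group.
    def request_finished(self, request: Any, block_ids: list[int]) -> Any: ...

    # Assumed HMA-aware hook: takes one list of block IDs per KV cache group.
    def request_finished_all_groups(
        self, request: Any, block_ids: tuple[list[int], ...]
    ) -> Any: ...

def finish_request(connector: ConnectorLike, request: Any,
                   block_ids: tuple[list[int], ...]) -> Any:
    if len(block_ids) == 1:
        # One KV cache group: the existing single-group call is fine.
        return connector.request_finished(request, block_ids[0])
    # Several groups: hand the whole tuple to the HMA-aware hook, which also
    # resolves the tuple[list[int], ...] vs list[int] mismatch mypy reports.
    return connector.request_finished_all_groups(request, block_ids)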


def _update_waiting_for_remote_kv(self, request: Request) -> bool:
"""