4 changes: 2 additions & 2 deletions vllm_omni/diffusion/models/glm_image/pipeline_glm_image.py
@@ -491,7 +491,7 @@ def generate_prior_tokens(
     condition_grid = image_grid_thw[:-1]
     prior_token_image_embed = self.vision_language_encoder.get_image_features(
         inputs["pixel_values"], condition_grid
-    )
+    ).pooler_output
     prior_token_image_embed = torch.cat(prior_token_image_embed, dim=0)
Comment on lines 491 to 495
P1: Remove torch.cat on pooled image features tensor

With the new .pooler_output access, prior_token_image_embed is a single tensor (the pooled image features). torch.cat(prior_token_image_embed, dim=0) now raises TypeError: cat() received an invalid combination of arguments because torch.cat requires a sequence of tensors, not a tensor. This will crash image-edit requests that include condition images (the only path where this block runs). Consider using the tensor directly (or wrapping it in a list only if you truly need to concatenate multiple tensors).
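The failure mode the comment describes can be sketched in isolation; the tensor shape below is illustrative, standing in for the pooled image features:

```python
import torch

# Stand-in for prior_token_image_embed after the .pooler_output change:
# a single tensor, not a sequence of tensors.
pooled = torch.randn(4, 8)

# torch.cat requires a sequence (list/tuple) of tensors, so passing the
# tensor itself raises a TypeError, as the comment warns.
try:
    torch.cat(pooled, dim=0)
except TypeError:
    pass  # "cat() received an invalid combination of arguments"

# Either use the tensor directly, or wrap it in a list only when there
# are genuinely multiple tensors to concatenate.
result = torch.cat([pooled], dim=0)
assert torch.equal(result, pooled)
```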


flat_prior_token_image_ids = self.vision_language_encoder.get_image_tokens(
prior_token_image_embed, condition_grid
@@ -859,7 +859,7 @@ def forward(self, req: OmniDiffusionRequest) -> DiffusionOutput:
     preprocessed_images = (
         None
         if isinstance(first_prompt, str)
-        else first_prompt.get("additional_information", {}).get("preprocessed_image")
+        else [first_prompt.get("additional_information", {}).get("preprocessed_image")]
     )
condition_images = (
None