Conversation

@magicheng0816
Collaborator

No description provided.

magicheng0816 force-pushed the add_constrained_decoding branch from cd1fdce to e0d7b6a on December 3, 2025 at 08:29

// Input generated_token_list: [sequence_num][generated_token_ids]
// Output: mask tensor[sequence_num,vocab_size]
virtual torch::Tensor generate_mask(
Member

There'll be a data copy when calling this function, right? If it's heavy, it should be avoided.

Collaborator Author

There is no heavy data copying. First, an initialized tensor is created on the device side, sized by the vocab size. Then a set of valid token indices is computed dynamically on the host side from the already-generated tokens, copied to the device, and used to modify the initialized tensor in place to form the mask. Only the small index set crosses the host-to-device boundary, not a full [sequence_num, vocab_size] tensor.
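The scheme described above can be sketched as follows. This is a minimal, illustrative Python version, not the PR's C++ implementation: the function name mirrors the declaration in the diff, but the "validity" rule (allow every token not yet emitted) is a hypothetical stand-in for whatever constraint the real decoder enforces, and plain lists stand in for device tensors.

```python
NEG_INF = float("-inf")

def generate_mask(generated_token_list, vocab_size):
    """generated_token_list: [sequence_num][generated_token_ids].
    Returns a [sequence_num][vocab_size] mask: 0.0 where a token is
    allowed, -inf where it is disallowed (added to logits downstream)."""
    # Step 1: tensor initialized "on the device" from the vocab size,
    # everything disallowed by default.
    mask = [[NEG_INF] * vocab_size for _ in generated_token_list]
    for seq_idx, generated in enumerate(generated_token_list):
        # Step 2 (host side): derive the valid-token index set from the
        # tokens already generated. Hypothetical rule for illustration:
        # any token not yet emitted is valid.
        valid = set(range(vocab_size)) - set(generated)
        # Step 3 (device side): in-place update of the initialized
        # tensor; only `valid` needed copying to the device.
        for token_id in valid:
            mask[seq_idx][token_id] = 0.0
    return mask
```

In the real implementation this last step would be a single scatter-style in-place write rather than a Python loop; the point is that the host-to-device transfer is limited to the index set.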

