daquexian commented Jul 25, 2023

Benchmark on a 2080 with the 1.5B model, CUDA fp16:

| Mode | main | this PR |
| --- | --- | --- |
| one mode | 0.3496 s | 0.2191 s |
| seq mode | 0.0202 s | 0.0123 s |

GPU memory usage is the same as on the main branch.
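
For reference, the speedups implied by the timings above can be worked out with a small sketch like the one below. Only the four timings come from this comment; the dictionary layout and variable names are illustrative assumptions, not part of the PR.

```python
# Timings in seconds, copied from the benchmark above: (main, this PR).
timings = {
    "one mode": (0.3496, 0.2191),
    "seq mode": (0.0202, 0.0123),
}

# Speedup = time on main / time with this PR (higher is better).
speedups = {mode: main / pr for mode, (main, pr) in timings.items()}

# Fraction of the original time saved by this PR.
reductions = {mode: 1.0 - pr / main for mode, (main, pr) in timings.items()}

for mode in timings:
    print(f"{mode}: {speedups[mode]:.2f}x faster, "
          f"{reductions[mode]:.1%} less time than main")
```

Both modes come out at roughly 1.6x faster than main, so the improvement appears to be consistent across the two measured modes.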

daquexian added 18 commits July 28, 2023 20:18
Signed-off-by: daquexian <[email protected]>