feat: Add NVIDIA triton trt-llm extension #888
Conversation
A current blocker is an error in Triton Inference Server + TRT-LLM caused by a missing space character: triton-inference-server/tensorrtllm_backend#34
What's the rationale for having both …
No, the …
tikikun left a comment:
LGTM
extensions/inference-triton-trtllm-extension/src/@types/global.d.ts (Outdated)
For #821
Integration diagram

NVIDIA triton inference server and TensorRT LLM setup
triton-inference-cluster using Helm on Kubernetes on DGX clusters
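A rough sketch of the Helm-on-Kubernetes deployment mentioned above (the chart path, release name, and namespace here are assumptions for illustration, not the exact setup used on the DGX clusters in this PR):

```shell
# Sketch only: deploying NVIDIA Triton Inference Server on Kubernetes
# with Helm. The chart location (server/deploy/gcp) and release name
# are assumptions; adapt them to the chart used for your cluster.
git clone https://github.com/triton-inference-server/server.git

# Install the chart as a release named "triton-inference-cluster".
helm install triton-inference-cluster server/deploy/gcp

# Verify that the Triton pod starts and the service is exposed.
kubectl get pods
kubectl get services
```

The release name matches the `triton-inference-cluster` mentioned above; model-repository location and GPU resource requests would normally be set via the chart's `values.yaml`.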