@WoosukKwon
Collaborator

Example usage:

  • Generating a single completion: python benchmark/benchmark_text_completion.py --dataset alpaca_opt_text_completion.pkl --model facebook/opt-13b --request-rate 1.0 --duration 3600 --n1 1.0
  • Generating two completions in parallel: python benchmark/benchmark_text_completion.py --dataset alpaca_opt_text_completion.pkl --model facebook/opt-13b --request-rate 1.0 --duration 3600 --n2 1.0
  • Generating two completions with beam search: python benchmark/benchmark_text_completion.py --dataset alpaca_opt_text_completion.pkl --model facebook/opt-13b --request-rate 1.0 --duration 3600 --n2-beam 1.0
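The three commands above share every flag except the last one. As a minimal sketch (the helper `build_command` and the mode names `single`, `parallel`, and `beam` are hypothetical, introduced only for illustration; the script path and flags are copied verbatim from the examples), the invocations could be assembled like this:

```python
import shlex

# Flags shared by all three example invocations, copied from the commands above.
BASE_CMD = [
    "python", "benchmark/benchmark_text_completion.py",
    "--dataset", "alpaca_opt_text_completion.pkl",
    "--model", "facebook/opt-13b",
    "--request-rate", "1.0",
    "--duration", "3600",
]

# The only flag that differs between the three examples.
# The keys are hypothetical labels, not names used by the benchmark script.
MODES = {
    "single": "--n1",        # single completion
    "parallel": "--n2",      # two completions in parallel
    "beam": "--n2-beam",     # two completions with beam search
}


def build_command(mode: str, value: str = "1.0") -> list[str]:
    """Return the argument list for one of the example invocations."""
    return BASE_CMD + [MODES[mode], value]


if __name__ == "__main__":
    for mode in MODES:
        # Print a copy-pasteable shell command for each mode.
        print(shlex.join(build_command(mode)))
```

The resulting list can be passed to `subprocess.run` as-is; this sketch only builds the arguments and does not run the benchmark.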

WoosukKwon requested a review from zhuohan123 on April 6, 2023 09:46
@WoosukKwon
Collaborator Author

Merging this PR to main, as we have too many branches.

WoosukKwon merged commit 84eee24 into main on Apr 12, 2023
WoosukKwon deleted the experiment branch on April 12, 2023 22:04