
Commit c7a4085

[Doc] Move examples and further reorganize user guide (vllm-project#18666)

DarkLight1337 authored and amitm02 committed

Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: amit <[email protected]>

1 parent 82b8f2c, commit c7a4085

27 files changed: +31 −42 lines

.buildkite/pyproject.toml

Lines changed: 0 additions & 5 deletions
@@ -6,11 +6,6 @@
 
 [tool.ruff]
 line-length = 88
-exclude = [
-    # External file, leaving license intact
-    "examples/other/fp8/quantizer/quantize.py",
-    "vllm/vllm_flash_attn/flash_attn_interface.pyi"
-]
 
 [tool.ruff.lint.per-file-ignores]
 "vllm/third_party/**" = ["ALL"]

.buildkite/test-pipeline.yaml

Lines changed: 1 addition & 1 deletion
@@ -246,7 +246,7 @@ steps:
   - python3 offline_inference/vision_language.py --seed 0
   - python3 offline_inference/vision_language_embedding.py --seed 0
   - python3 offline_inference/vision_language_multi_image.py --seed 0
-  - VLLM_USE_V1=0 python3 other/tensorize_vllm_model.py --model facebook/opt-125m serialize --serialized-directory /tmp/ --suffix v1 && python3 other/tensorize_vllm_model.py --model facebook/opt-125m deserialize --path-to-tensors /tmp/vllm/facebook/opt-125m/v1/model.tensors
+  - VLLM_USE_V1=0 python3 others/tensorize_vllm_model.py --model facebook/opt-125m serialize --serialized-directory /tmp/ --suffix v1 && python3 others/tensorize_vllm_model.py --model facebook/opt-125m deserialize --path-to-tensors /tmp/vllm/facebook/opt-125m/v1/model.tensors
   - python3 offline_inference/encoder_decoder.py
   - python3 offline_inference/encoder_decoder_multimodal.py --model-type whisper --seed 0
   - python3 offline_inference/basic/classify.py

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -146,7 +146,7 @@ venv.bak/
 
 # mkdocs documentation
 /site
-docs/getting_started/examples
+docs/examples
 
 # mypy
 .mypy_cache/

benchmarks/pyproject.toml

Lines changed: 0 additions & 5 deletions
@@ -6,11 +6,6 @@
 
 [tool.ruff]
 line-length = 88
-exclude = [
-    # External file, leaving license intact
-    "examples/other/fp8/quantizer/quantize.py",
-    "vllm/vllm_flash_attn/flash_attn_interface.pyi"
-]
 
 [tool.ruff.lint.per-file-ignores]
 "vllm/third_party/**" = ["ALL"]

docs/.nav.yml

Lines changed: 4 additions & 5 deletions
@@ -5,11 +5,9 @@ nav:
   - getting_started/quickstart.md
   - getting_started/installation
   - Examples:
-    - Offline Inference: getting_started/examples/offline_inference
-    - Online Serving: getting_started/examples/online_serving
-    - Others:
-      - LMCache: getting_started/examples/lmcache
-      - getting_started/examples/other/*
+    - Offline Inference: examples/offline_inference
+    - Online Serving: examples/online_serving
+    - Others: examples/others
   - Quick Links:
     - User Guide: usage/README.md
     - Developer Guide: contributing/README.md
@@ -19,6 +17,7 @@ nav:
   - Releases: https://github.com/vllm-project/vllm/releases
 - User Guide:
   - Summary: usage/README.md
+  - usage/v1_guide.md
   - General:
     - usage/*
   - Inference and Serving:

docs/configuration/README.md

Lines changed: 7 additions & 2 deletions
@@ -1,4 +1,9 @@
 # Configuration Options
 
-This section lists the most common options for running the vLLM engine.
-For a full list, refer to the [configuration][configuration] page.
+This section lists the most common options for running vLLM.
+
+There are three main levels of configuration, from highest priority to lowest priority:
+
+- [Request parameters][completions-api] and [input arguments][sampling-params]
+- [Engine arguments](./engine_args.md)
+- [Environment variables](./env_vars.md)
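A hedged sketch of how those three configuration levels layer in practice (the model name, port, and option values below are illustrative placeholders, not taken from this commit):

```shell
# Lowest priority: environment variables configure the vLLM runtime as a whole.
export VLLM_LOGGING_LEVEL=DEBUG

# Middle priority: engine arguments set server-wide defaults at launch, e.g.
#   vllm serve facebook/opt-125m --max-model-len 2048

# Highest priority: request parameters override those defaults per request, e.g.
#   curl http://localhost:8000/v1/completions \
#     -H "Content-Type: application/json" \
#     -d '{"model": "facebook/opt-125m", "prompt": "Hello", "max_tokens": 16}'
```

A setting given at a higher level wins only for its own scope: the engine argument applies to every request served by that process, while the request parameter applies to a single call.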
File renamed without changes.

docs/design/v1/metrics.md

Lines changed: 2 additions & 2 deletions
@@ -61,7 +61,7 @@ These are documented under [Inferencing and Serving -> Production Metrics](../..
 
 ### Grafana Dashboard
 
-vLLM also provides [a reference example](https://docs.vllm.ai/en/latest/getting_started/examples/prometheus_grafana.html) for how to collect and store these metrics using Prometheus and visualize them using a Grafana dashboard.
+vLLM also provides [a reference example](https://docs.vllm.ai/en/latest/examples/prometheus_grafana.html) for how to collect and store these metrics using Prometheus and visualize them using a Grafana dashboard.
 
 The subset of metrics exposed in the Grafana dashboard gives us an indication of which metrics are especially important:
 
@@ -673,7 +673,7 @@ v0 has support for OpenTelemetry tracing:
 - [OpenTelemetry blog
   post](https://opentelemetry.io/blog/2024/llm-observability/)
 - [User-facing
-  docs](https://docs.vllm.ai/en/latest/getting_started/examples/opentelemetry.html)
+  docs](https://docs.vllm.ai/en/latest/examples/opentelemetry.html)
 - [Blog
   post](https://medium.com/@ronen.schaffer/follow-the-trail-supercharging-vllm-with-opentelemetry-distributed-tracing-aa655229b46f)
 - [IBM product

docs/mkdocs/hooks/generate_examples.py

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 ROOT_DIR = Path(__file__).parent.parent.parent.parent
 ROOT_DIR_RELATIVE = '../../../../..'
 EXAMPLE_DIR = ROOT_DIR / "examples"
-EXAMPLE_DOC_DIR = ROOT_DIR / "docs/getting_started/examples"
+EXAMPLE_DOC_DIR = ROOT_DIR / "docs/examples"
 print(ROOT_DIR.resolve())
 print(EXAMPLE_DIR.resolve())
 print(EXAMPLE_DOC_DIR.resolve())
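The effect of this one-line change on the generated docs tree can be sketched as follows (the `ROOT_DIR` value here is a stand-in for illustration; the real hook derives it from `__file__`):

```python
from pathlib import Path

# Stand-in for the repository root; not the hook's real computed path.
ROOT_DIR = Path("vllm")

# Before this commit, generated example pages were written under
# docs/getting_started/examples; afterwards they live one level up.
old_doc_dir = ROOT_DIR / "docs/getting_started/examples"
new_doc_dir = ROOT_DIR / "docs/examples"

print(old_doc_dir)  # vllm/docs/getting_started/examples
print(new_doc_dir)  # vllm/docs/examples
```

This is why every `getting_started/examples/...` URL elsewhere in the commit becomes `examples/...`: the hook's output directory determines where mkdocs publishes the pages.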

docs/models/extensions/tensorizer.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ shorter Pod startup times and CPU memory usage. Tensor encryption is also suppor
 
 For more information on CoreWeave's Tensorizer, please refer to
 [CoreWeave's Tensorizer documentation](https://github.com/coreweave/tensorizer). For more information on serializing a vLLM model, as well a general usage guide to using Tensorizer with vLLM, see
-the [vLLM example script](https://docs.vllm.ai/en/latest/getting_started/examples/tensorize_vllm_model.html).
+the [vLLM example script](https://docs.vllm.ai/en/latest/examples/tensorize_vllm_model.html).
 
 !!! note
     Note that to use this feature you will need to install `tensorizer` by running `pip install vllm[tensorizer]`.
