📚 The doc issue
On the page
https://github.com/vllm-project/vllm/blob/main/docs/source/performance/optimization.md
one of the recommended options is the following:

> Increase `tensor_parallel_size`. This approach shards model weights, so each GPU has more memory available for KV cache.

The document does not mention increasing `pipeline_parallel_size`, which would also shard the model across more GPUs so that there is more memory available for KV cache.
Suggest a potential alternative/fix
> Increase `tensor_parallel_size` or `pipeline_parallel_size` (if using Multi-Node Multi-GPU). This approach shards model weights, so each GPU has more memory available for KV cache.
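For illustration, here is a minimal sketch of how either setting is passed to vLLM's offline `LLM` API (the model name and parallel sizes below are placeholders, not recommendations):

```python
from vllm import LLM

# Single node: tensor parallelism shards each layer's weights across
# 4 GPUs, leaving more per-GPU memory for KV cache.
# Multi-node: additionally set pipeline parallelism to split the model's
# layers into stages across nodes, with the same effect on per-GPU memory.
llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder model
    tensor_parallel_size=4,    # shard weights within a node
    pipeline_parallel_size=2,  # stages across 2 nodes (omit on a single node)
)
```

The same options are exposed on the serving side as the `--tensor-parallel-size` and `--pipeline-parallel-size` flags of `vllm serve`.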