From 75a0582b72b072863f27fe4666d7852c6821ac5e Mon Sep 17 00:00:00 2001
From: sergiopaniego
Date: Tue, 7 Oct 2025 14:35:02 +0200
Subject: [PATCH 1/7] Update max_length explanation for VLMs

---
 docs/source/grpo_trainer.md | 10 ++++++++--
 docs/source/rloo_trainer.md | 10 ++++++++--
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/docs/source/grpo_trainer.md b/docs/source/grpo_trainer.md
index a8d058d4194..3dce352a363 100644
--- a/docs/source/grpo_trainer.md
+++ b/docs/source/grpo_trainer.md
@@ -567,8 +567,14 @@ accelerate launch \
 
 ### Configuration Tips
 
-> [!WARNING]
-> VLM training may fail if image tokens are truncated. We highly recommend disabling truncation by setting `max_prompt_length` to `None`.
+> [!TIP]
+> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_length=None` in the [`GRPOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
+>
+> ```python
+> GRPOConfig(max_length=None, ...)
+> ```
+>
+> Only use `max_length` when you've verified that truncation won't remove image tokens for the entire dataset.
 
 - Use LoRA on vision-language projection layers
 - Enable 4-bit quantization to reduce memory usage
diff --git a/docs/source/rloo_trainer.md b/docs/source/rloo_trainer.md
index 891a0bcb0f0..814c77620f5 100644
--- a/docs/source/rloo_trainer.md
+++ b/docs/source/rloo_trainer.md
@@ -549,8 +549,14 @@ accelerate launch \
 
 ### Configuration Tips
 
-> [!WARNING]
-> VLM training may fail if image tokens are truncated. We highly recommend disabling truncation by setting `max_prompt_length` to `None`.
+> [!TIP]
+> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_length=None` in the [`RLOOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
+>
+> ```python
+> RLOOConfig(max_length=None, ...)
+> ```
+>
+> Only use `max_length` when you've verified that truncation won't remove image tokens for the entire dataset.
 
 - Use LoRA on vision-language projection layers
 - Enable 4-bit quantization to reduce memory usage

From 5634d26933bcca3efe9e5ca3851e32630cc90c32 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20Gallou=C3=A9dec?= <45557362+qgallouedec@users.noreply.github.com>
Date: Tue, 4 Nov 2025 18:01:39 -0700
Subject: [PATCH 2/7] Apply suggestion from @qgallouedec

---
 docs/source/grpo_trainer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/grpo_trainer.md b/docs/source/grpo_trainer.md
index 3dce352a363..2b7c81e00ee 100644
--- a/docs/source/grpo_trainer.md
+++ b/docs/source/grpo_trainer.md
@@ -568,7 +568,7 @@ accelerate launch \
 ### Configuration Tips
 
 > [!TIP]
-> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_length=None` in the [`GRPOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
+> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_prompt_length=None` in the [`GRPOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
 >
 > ```python
 > GRPOConfig(max_length=None, ...)

From ba5752809bb811b9e7329b41ec0c3c51b92000ae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20Gallou=C3=A9dec?= <45557362+qgallouedec@users.noreply.github.com>
Date: Tue, 4 Nov 2025 18:01:47 -0700
Subject: [PATCH 3/7] Apply suggestion from @qgallouedec

---
 docs/source/grpo_trainer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/grpo_trainer.md b/docs/source/grpo_trainer.md
index 2b7c81e00ee..e016958c966 100644
--- a/docs/source/grpo_trainer.md
+++ b/docs/source/grpo_trainer.md
@@ -574,7 +574,7 @@ accelerate launch \
 > GRPOConfig(max_length=None, ...)
 > ```
 >
-> Only use `max_length` when you've verified that truncation won't remove image tokens for the entire dataset.
+> Only use `max_prompt_length` when you've verified that truncation won't remove image tokens for the entire dataset.
 
 - Use LoRA on vision-language projection layers
 - Enable 4-bit quantization to reduce memory usage

From 5a776a8a4a14774e922b1816ba18e07a557e2cd6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20Gallou=C3=A9dec?= <45557362+qgallouedec@users.noreply.github.com>
Date: Tue, 4 Nov 2025 18:01:53 -0700
Subject: [PATCH 4/7] Apply suggestion from @qgallouedec

---
 docs/source/grpo_trainer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/grpo_trainer.md b/docs/source/grpo_trainer.md
index e016958c966..8d458e66e27 100644
--- a/docs/source/grpo_trainer.md
+++ b/docs/source/grpo_trainer.md
@@ -571,7 +571,7 @@ accelerate launch \
 > For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_prompt_length=None` in the [`GRPOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
 >
 > ```python
-> GRPOConfig(max_length=None, ...)
+> GRPOConfig(max_prompt_length=None, ...)
 > ```
 >
 > Only use `max_prompt_length` when you've verified that truncation won't remove image tokens for the entire dataset.

From 3beb1bf723cfdaa829be661335484e9f08709db6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20Gallou=C3=A9dec?= <45557362+qgallouedec@users.noreply.github.com>
Date: Tue, 4 Nov 2025 18:02:06 -0700
Subject: [PATCH 5/7] Apply suggestion from @qgallouedec

---
 docs/source/rloo_trainer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/rloo_trainer.md b/docs/source/rloo_trainer.md
index 814c77620f5..0e11b01dae3 100644
--- a/docs/source/rloo_trainer.md
+++ b/docs/source/rloo_trainer.md
@@ -550,7 +550,7 @@ accelerate launch \
 ### Configuration Tips
 
 > [!TIP]
-> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_length=None` in the [`RLOOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
+> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_prompt_length=None` in the [`RLOOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
 >
 > ```python
 > RLOOConfig(max_length=None, ...)

From 556009885a0a8ac83e7b488259b85f12a4cc33e1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20Gallou=C3=A9dec?= <45557362+qgallouedec@users.noreply.github.com>
Date: Tue, 4 Nov 2025 18:02:12 -0700
Subject: [PATCH 6/7] Apply suggestion from @qgallouedec

---
 docs/source/rloo_trainer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/rloo_trainer.md b/docs/source/rloo_trainer.md
index 0e11b01dae3..29f8331f7b1 100644
--- a/docs/source/rloo_trainer.md
+++ b/docs/source/rloo_trainer.md
@@ -553,7 +553,7 @@ accelerate launch \
 > For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_prompt_length=None` in the [`RLOOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
 >
 > ```python
-> RLOOConfig(max_length=None, ...)
+> RLOOConfig(max_prompt_length=None, ...)
 > ```
 >
 > Only use `max_length` when you've verified that truncation won't remove image tokens for the entire dataset.

From 24272073a61c39e5d3c7c0f2b59d83b1712716bb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20Gallou=C3=A9dec?= <45557362+qgallouedec@users.noreply.github.com>
Date: Tue, 4 Nov 2025 18:02:17 -0700
Subject: [PATCH 7/7] Apply suggestion from @qgallouedec

---
 docs/source/rloo_trainer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/rloo_trainer.md b/docs/source/rloo_trainer.md
index 29f8331f7b1..9ab999b59cb 100644
--- a/docs/source/rloo_trainer.md
+++ b/docs/source/rloo_trainer.md
@@ -556,7 +556,7 @@ accelerate launch \
 > RLOOConfig(max_prompt_length=None, ...)
 > ```
 >
-> Only use `max_length` when you've verified that truncation won't remove image tokens for the entire dataset.
+> Only use `max_prompt_length` when you've verified that truncation won't remove image tokens for the entire dataset.
 
 - Use LoRA on vision-language projection layers
 - Enable 4-bit quantization to reduce memory usage
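
Taken together, the series converges on `max_prompt_length=None` in both the prose and the code snippets of the GRPO and RLOO docs. For context, here is a minimal sketch of a GRPO run with prompt truncation disabled, as the final documentation recommends for VLMs. The model checkpoint, dataset name, and reward function below are placeholders for illustration, not part of the patches; a real VLM dataset must carry images alongside the prompts.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Placeholder reward function (GRPO reward funcs receive the generated
# completions and return one score per completion).
def reward_brevity(completions, **kwargs):
    return [-float(len(c)) for c in completions]

# Placeholder dataset; a VLM run needs image inputs with each prompt.
dataset = load_dataset("your-org/your-vlm-dataset", split="train")

training_args = GRPOConfig(
    output_dir="vlm-grpo",
    max_prompt_length=None,  # disable prompt truncation so image tokens stay intact
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-VL-3B-Instruct",  # placeholder VLM checkpoint
    reward_funcs=reward_brevity,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

The same shape applies to the RLOO side of the series, substituting `RLOOConfig` and `RLOOTrainer`.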