# Agent Lightning⚡
Agent Lightning is a flexible framework that enables you to optimize any AI agent.
## ⚡ Core Features
- Turn your agent into an optimizable beast with **ZERO CODE CHANGE** (almost)! 💤
- Build with **ANY** agent framework (LangChain, OpenAI Agent SDK, AutoGen, CrewAI, ...); or even WITHOUT agent framework (Python OpenAI). You name it! 🤖
- **Selectively** optimize one or more agents in a multi-agent system. 🎯
- Embraces Reinforcement Learning, Automatic Prompt Optimization and more **algorithms**. 🤗
## ⚡ News
- 7/26/2025 [We discovered an approach to train any AI agent with RL, with (almost) zero code changes.](https://www.reddit.com/r/LocalLLaMA/comments/1m9m670/we_discovered_an_approach_to_train_any_ai_agent/) Reddit.
- 6/6/2025 [Agent Lightning - Microsoft Research](https://www.microsoft.com/en-us/research/project/agent-lightning/) Project page.
## ⚡ Installation
First, let's get your environment set up. We'll be using `/path/to/agentlightning` to refer to the directory containing this README file.
### 1. Set Up Your Environment
We strongly recommend creating a new virtual environment to avoid conflicts with other packages. You can use either `conda` or `venv`. **Python 3.10 or later** is recommended.
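For example, a minimal `venv`-based setup (conda works just as well; the environment name `.venv` is an arbitrary choice):

```shell
# Create and activate an isolated environment for Agent Lightning
python3 -m venv .venv
source .venv/bin/activate
python --version   # verify this reports Python 3.10 or later
```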
### 2. Install Core Training Dependencies (Optional)
If you are running RL with Agent Lightning, the next step is to install the essential packages: `PyTorch`, `FlashAttention`, `vLLM`, and `VERL`. The following versions and installation order have been tested and are confirmed to work.
See `scripts/setup_stable_gpu.sh` for a full installation script.
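For orientation, the order looks roughly like the sketch below; treat `scripts/setup_stable_gpu.sh` as the source of truth for exact, tested versions (the commands here are unpinned approximations):

```shell
# Rough install order only -- scripts/setup_stable_gpu.sh has the tested pins
pip install torch                                       # PyTorch first (match your CUDA version)
pip install flash-attn --no-build-isolation             # FlashAttention builds against the installed torch
pip install vllm                                        # vLLM for serving rollouts
pip install git+https://github.com/volcengine/verl.git  # VERL from the main branch
```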
### 3. Install Agent Lightning
45
43
46
44
Now, you're ready to install Agent Lightning itself.
```bash
pip install agentlightning
```
### 4. Install Agent Frameworks (Optional)
If you plan to use other agent frameworks, you can install them with the following commands. If you don't need these, feel free to skip this step.
We recommend doing this as the final step to avoid dependency versions being overwritten by mistake.
```bash
# AutoGen (Recommended to install first)
pip install "autogen-agentchat" "autogen-ext[openai]"

# LiteLLM
pip install "litellm[proxy]"
# MCP
pip install mcp
# UV
pip install uv
# OpenAI Agents
pip install openai-agents

# LangChain
pip install langgraph "langchain[openai]"

# SQL-related dependencies
pip install sqlparse nltk
```
Don't worry if dependency conflicts arise during this step. Follow the installation order above and the conflicts generally do not matter.
## ⚡ Examples
For more detailed examples, please see the `examples` folder:
1. [calc_x](examples/calc_x): An agent built with AutoGen that uses a calculator tool, trained on the Calc-X dataset with Reinforcement Learning.
2. [spider](examples/spider): A write-check-rewrite looped agent built with LangGraph that executes SQL; the write and rewrite steps are selectively optimized on the Spider dataset with Reinforcement Learning.
3. [apo](examples/apo): An example of customizing an optimization algorithm: Automatic Prompt Optimization.
## ⚡ Important Caveats
1. **AgentOps Integration**: Agent Lightning uses [AgentOps](https://github.com/AgentOps-AI/agentops) for agent tracking by default. If you're already using AgentOps in your own code, disable our managed AgentOps client by modifying the `tracer` parameter of the trainer.
2. **Debugging Traces**: If you encounter issues with tracing, you can visualize the trace tree using `tracer.last_trace().visualize("tree_graph")`. Note that this API is experimental and may change in future releases.
3. **Launching the Server and Agents**: Currently, the training server and agent clients must be launched in separate processes. You can open two terminal windows or run one of them in the background. The launching order generally doesn't matter.
4. **Environment Variables**: The environment variables and working directory at the time of `ray init` matter. If you run into "file not found" errors, try restarting Ray from your current working directory.
5. **Handling Timeouts**: The training server may hang if samples fail or time out on the agent side. To prevent this, we recommend setting limits on the prompt and response lengths, as overlong generations are the most common cause of failures.
6. **VERL Failures**: Save checkpoints frequently, as VERL with vLLM may sometimes run out of memory. If you encounter a VERL failure, you can resume training from the last checkpoint.
## ⚡ Architecture
Currently, Agent Lightning is built around a **training server** and one or multiple **agents**.
* The **server** manages the training data, prepares samples for the agents, and provides the LLM endpoint.
* **Agents** retrieve samples from the server, process them (which may involve interacting with the LLM), and send the results back. These results, or "trajectories," are lists of prompts and responses from the LLM.
* The **server** then collects these trajectories and computes the losses to optimize the language models.
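The loop above can be sketched in plain Python. This is an illustrative toy, not the real Agent Lightning API: the class and method names here are invented for exposition, and the "LLM" is a stub.

```python
from dataclasses import dataclass
import queue

@dataclass
class Trajectory:
    """One finished rollout: the prompt/response pairs plus a reward."""
    sample_id: int
    turns: list  # [(prompt, response), ...]
    reward: float = 0.0

class TrainingServer:
    """Toy stand-in for the training server: hands out samples, collects results."""
    def __init__(self, samples):
        self.tasks = queue.Queue()
        for i, sample in enumerate(samples):
            self.tasks.put((i, sample))
        self.results = []

    def next_sample(self):
        return None if self.tasks.empty() else self.tasks.get()

    def submit(self, trajectory):
        # The real server would feed this into an RL loss / optimizer step.
        self.results.append(trajectory)

def run_agent(server, llm):
    """Toy agent loop: pull a sample, query the LLM, report the trajectory."""
    while (task := server.next_sample()) is not None:
        sample_id, prompt = task
        response = llm(prompt)  # may involve tools or multiple turns
        server.submit(Trajectory(sample_id, [(prompt, response)], reward=1.0))

server = TrainingServer(["2 + 2 = ?", "3 * 3 = ?"])
run_agent(server, llm=lambda prompt: "stub answer")
print(len(server.results))  # → 2
```

Conceptually, swapping the lambda for a real LLM call and the fixed reward for a task-specific score is what the actual server/agent pair does over HTTP.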
Before submitting a pull request, run the pre-commit checks:

```bash
pre-commit run --all-files --show-diff-on-failure --color=always
```
## ⚡ Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [[email protected]](mailto:[email protected]) with any additional questions or comments.
## ⚡ Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
## ⚡ Responsible AI
This project has been evaluated and certified to comply with the Microsoft Responsible AI Standard. The team will continue to monitor and maintain the repository, addressing any severe issues, including potential harms, if they arise.
## ⚡ License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
---

**`examples/calc_x/README.md`** (excerpt):

This example requires a single node with one GPU of at least 40GB memory.
2. Start ray: `bash ../../scripts/restart_ray.sh`. To use Wandb, set the `WANDB_API_KEY` environment variable before starting ray.
3. Run the agent: `python calc_agent.py`. It automatically launches 4 agent workers by default.
4. In another terminal, launch the training server: `bash train.sh`.
## Common Issues
1. The agent client will hang indefinitely if the environment is not properly configured. Check whether `uv` and `mcp` are properly installed. Use `tests/test_mcp_calculator.py` to verify the installation.
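One way to check those dependencies up front (run from the repository root; `tests/test_mcp_calculator.py` is the script mentioned above):

```shell
# Fail fast with a clear message instead of hanging at agent startup
command -v uv >/dev/null || echo "missing: uv (pip install uv)"
python -c "import mcp" 2>/dev/null || echo "missing: mcp (pip install mcp)"
python tests/test_mcp_calculator.py  # end-to-end check of the MCP calculator
```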