# Agent Lightning⚡
Agent Lightning is a flexible framework that enables you to optimize any AI agent.
## ⚡ Core Features
- Turn your agent into an optimizable beast with **ZERO CODE CHANGE** (almost)! 💤
- Build with **ANY** agent framework (LangChain, OpenAI Agent SDK, AutoGen, CrewAI, ...); or even WITHOUT agent framework (Python OpenAI). You name it! 🤖
- **Selectively** optimize one or more agents in a multi-agent system. 🎯
- Embraces Reinforcement Learning, Automatic Prompt Optimization and more **algorithms**. 🤗
## ⚡ News
- 7/26/2025 [We discovered an approach to train any AI agent with RL, with (almost) zero code changes.](https://www.reddit.com/r/LocalLLaMA/comments/1m9m670/we_discovered_an_approach_to_train_any_ai_agent/) Reddit.
- 6/6/2025 [Agent Lightning - Microsoft Research](https://www.microsoft.com/en-us/research/project/agent-lightning/) Project page.
## ⚡ Installation
First, let's get your environment set up. We'll be using `/path/to/agentlightning` to refer to the directory containing this README file.
### 1. Set Up Your Environment
We strongly recommend creating a new virtual environment to avoid conflicts with other packages. You can use either `conda` or `venv`. **Python 3.10 or later** is recommended.
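For example, a minimal `venv`-based setup (conda works just as well; the environment name `.venv` is an arbitrary choice):

```shell
# Create and activate an isolated environment for Agent Lightning
python3 -m venv .venv
source .venv/bin/activate
python --version   # verify this reports Python 3.10 or later
```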
### 2. Install Core Training Dependencies (Optional)
If you are running RL with Agent Lightning, the next step is to install the essential packages: `PyTorch`, `FlashAttention`, `vLLM`, and `VERL`. The following versions and installation order have been tested and are confirmed to work.
See `scripts/setup_stable_gpu.sh` for a full installation script.
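For orientation, the order looks roughly like the sketch below; treat `scripts/setup_stable_gpu.sh` as the source of truth for exact, tested versions (the commands here are unpinned approximations):

```shell
# Rough install order only -- scripts/setup_stable_gpu.sh has the tested pins
pip install torch                                       # PyTorch first (match your CUDA version)
pip install flash-attn --no-build-isolation             # FlashAttention builds against the installed torch
pip install vllm                                        # vLLM for serving rollouts
pip install git+https://github.com/volcengine/verl.git  # VERL from the main branch
```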
### 3. Install Agent Lightning
45
43
46
44
Now, you're ready to install Agent Lightning itself.
```bash
pip install agentlightning
```
### 4. Install Agent Frameworks (Optional)
If you plan to use other agent frameworks, you can install them with the following commands. If you don't need these, feel free to skip this step.
We recommend doing this as the final step to avoid dependency versions being overwritten by mistake.
```bash
# AutoGen (Recommended to install first)
pip install "autogen-agentchat" "autogen-ext[openai]"

# LiteLLM
pip install "litellm[proxy]"
# MCP
pip install mcp
# UV
pip install uv
# OpenAI Agents
pip install openai-agents

# LangChain
pip install langgraph "langchain[openai]"

# SQL-related dependencies
pip install sqlparse nltk
```
Don't worry if dependency conflicts arise during this step. Follow the installation order above and the conflicts generally do not matter.
## ⚡ Examples
For more detailed examples, please see the `examples` folder:
1. [calc_x](examples/calc_x): An agent built with AutoGen that uses a calculator tool, trained on the Calc-X dataset with Reinforcement Learning.
2. [spider](examples/spider): A write-check-rewrite looped agent built with LangGraph that executes SQL; the write and rewrite steps are selectively optimized on the Spider dataset with Reinforcement Learning.
3. [apo](examples/apo): An example of customizing an optimization algorithm: Automatic Prompt Optimization.
## ⚡ Important Caveats
1. **AgentOps Integration**: Agent Lightning uses [AgentOps](https://github.com/AgentOps-AI/agentops) for agent tracking by default. If you're already using AgentOps in your own code, disable our managed AgentOps client by modifying the `tracer` parameter of the trainer.
2. **Debugging Traces**: If you encounter issues with tracing, you can visualize the trace tree using `tracer.last_trace().visualize("tree_graph")`. Note that this API is experimental and may change in future releases.
3. **Launching the Server and Agents**: Currently, the training server and agent clients must be launched in separate processes. You can open two terminal windows or run one of them in the background. The launching order generally doesn't matter.
4. **Environment Variables**: The environment variables and working directory at the time of `ray init` matter. If you run into "file not found" errors, try restarting Ray from your current working directory.
5. **Handling Timeouts**: The training server may hang if samples fail or time out on the agent side. To prevent this, we recommend setting limits on the prompt and response lengths, as overlong generations are the most common cause of failures.
6. **VERL Failures**: Save checkpoints frequently, as VERL with vLLM may sometimes run out of memory. If you encounter a VERL failure, you can resume training from the last checkpoint.
## ⚡ Architecture
Currently, Agent Lightning is built around a **training server** and one or multiple **agents**.
* The **server** manages the training data, prepares samples for the agents, and provides the LLM endpoint.
* **Agents** retrieve samples from the server, process them (which may involve interacting with the LLM), and send the results back. These results, or "trajectories," are lists of prompts and responses from the LLM.
* The **server** then collects these trajectories and computes the losses to optimize the language models.
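The loop above can be sketched in plain Python. This is an illustrative toy, not the real Agent Lightning API: the class and method names here are invented for exposition, and the "LLM" is a stub.

```python
from dataclasses import dataclass
import queue

@dataclass
class Trajectory:
    """One finished rollout: the prompt/response pairs plus a reward."""
    sample_id: int
    turns: list  # [(prompt, response), ...]
    reward: float = 0.0

class TrainingServer:
    """Toy stand-in for the training server: hands out samples, collects results."""
    def __init__(self, samples):
        self.tasks = queue.Queue()
        for i, sample in enumerate(samples):
            self.tasks.put((i, sample))
        self.results = []

    def next_sample(self):
        return None if self.tasks.empty() else self.tasks.get()

    def submit(self, trajectory):
        # The real server would feed this into an RL loss / optimizer step.
        self.results.append(trajectory)

def run_agent(server, llm):
    """Toy agent loop: pull a sample, query the LLM, report the trajectory."""
    while (task := server.next_sample()) is not None:
        sample_id, prompt = task
        response = llm(prompt)  # may involve tools or multiple turns
        server.submit(Trajectory(sample_id, [(prompt, response)], reward=1.0))

server = TrainingServer(["2 + 2 = ?", "3 * 3 = ?"])
run_agent(server, llm=lambda prompt: "stub answer")
print(len(server.results))  # → 2
```

Conceptually, swapping the lambda for a real LLM call and the fixed reward for a task-specific score is what the actual server/agent pair does over HTTP.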
Before submitting a pull request, run the pre-commit checks:

```bash
pre-commit run --all-files --show-diff-on-failure --color=always
```
## ⚡ Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [[email protected]](mailto:[email protected]) with any additional questions or comments.
## ⚡ Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
## ⚡ Responsible AI
This project has been evaluated and certified to comply with the Microsoft Responsible AI Standard. The team will continue to monitor and maintain the repository, addressing any severe issues, including potential harms, if they arise.
## ⚡ License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
---

**`examples/calc_x/README.md`** (excerpt):

This example requires a single node with one GPU of at least 40GB memory.
2. Start ray: `bash ../../scripts/restart_ray.sh`. To use Wandb, set the `WANDB_API_KEY` environment variable before starting ray.
3. Run the agent: `python calc_agent.py`. It automatically launches 4 agent workers by default.
4. In another terminal, launch the training server: `bash train.sh`.
## Common Issues
1. The agent client will hang indefinitely if the environment is not properly configured. Check whether `uv` and `mcp` are properly installed. Use `tests/test_mcp_calculator.py` to verify the installation.
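One way to check those dependencies up front (run from the repository root; `tests/test_mcp_calculator.py` is the script mentioned above):

```shell
# Fail fast with a clear message instead of hanging at agent startup
command -v uv >/dev/null || echo "missing: uv (pip install uv)"
python -c "import mcp" 2>/dev/null || echo "missing: mcp (pip install mcp)"
python tests/test_mcp_calculator.py  # end-to-end check of the MCP calculator
```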