# AVP – Agents Share Thoughts, Not Text

[](https://pypi.org/project/avp/)
[](https://github.com/VectorArc/avp-python/actions/workflows/ci.yml)
[](LICENSE)
[](https://python.org)
[](https://github.com/VectorArc/avp-spec)

When LLM agents hand off work as text, the next agent re-processes everything from scratch. AVP transfers the actual computation – KV-cache, hidden states, attention – so the receiving agent picks up where the sender left off. 46-78% fewer tokens, 2-4x faster. Sometimes more accurate than text. Built on [LatentMAS](https://arxiv.org/abs/2511.20639).

```bash
pip install avp
```

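A schematic sketch of the handoff described above – not the real AVP API. `Agent`, `LatentContext`, and the string-based "cache" below are illustrative stand-ins; real AVP contexts carry tensors (KV-cache, hidden states), and `think()`/`generate()` here only mimic the shape of the exchange:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins: a real latent payload holds tensors, not strings.
@dataclass
class LatentContext:
    """Simulated latent payload handed from sender to receiver."""
    kv_cache: list = field(default_factory=list)

class Agent:
    def __init__(self, name):
        self.name = name

    def think(self, prompt):
        # Sender: run the prompt once and keep the computation.
        ctx = LatentContext()
        ctx.kv_cache.append(f"{self.name} processed: {prompt}")
        return ctx

    def generate(self, prompt, context=None):
        # Receiver: reuse the sender's computation instead of
        # re-processing its text output from scratch.
        prior = context.kv_cache if context else []
        return f"{self.name} answers '{prompt}' using {len(prior)} cached step(s)"

sender = Agent("planner")
receiver = Agent("solver")
context = sender.think("Solve: 24 * 17 + 3")
answer = receiver.generate("Give the final number.", context=context)
print(answer)  # → solver answers 'Give the final number.' using 1 cached step(s)
```

The framework-facing surface stays the same (a prompt goes in, text comes out); what changes is that the receiver starts from the sender's computation rather than its prose.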
| Sender | Receiver | Math | Code |
|--------|----------|------|------|
| Qwen 7B | Llama 3B | 74.5% | 47.0% |
| Llama 3B | Qwen 7B | **90.0%** | **79.3%** |

A small 3B model sharing its reasoning lifts a 7B solver to 90% on math and 79.3% on code. The projection is vocabulary-mediated – no learned parameters, no training data, works across model families.

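To picture what "vocabulary-mediated" means, here is a toy sketch – not AVP code, and every matrix and the four-token vocabulary are invented for illustration. The sender scores its hidden state against the shared vocabulary with its own unembedding rows; the receiver re-embeds the top-scoring tokens with its own embedding rows. The shared vocabulary is the interchange format, so nothing has to be learned:

```python
# Toy vocabulary-mediated projection: hidden state -> shared vocab -> new embedding.
# All numbers below are made up; real models use large learned matrices.

VOCAB = ["add", "multiply", "carry", "answer"]

# Sender's unembedding rows and receiver's embedding rows (one per vocab token).
SENDER_UNEMBED = {
    "add":      [1.0, 0.0, 0.0],
    "multiply": [0.0, 1.0, 0.0],
    "carry":    [0.0, 0.0, 1.0],
    "answer":   [0.5, 0.5, 0.0],
}
RECEIVER_EMBED = {
    "add":      [0.2, 0.1],
    "multiply": [0.9, 0.3],
    "carry":    [0.1, 0.8],
    "answer":   [0.5, 0.5],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(hidden, top_k=2):
    """Score every shared-vocab token against the sender's hidden state,
    then average the receiver's embeddings of the top-k tokens."""
    scores = {tok: dot(hidden, row) for tok, row in SENDER_UNEMBED.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:top_k]
    dim = len(next(iter(RECEIVER_EMBED.values())))
    avg = [sum(RECEIVER_EMBED[t][i] for t in top) / top_k for i in range(dim)]
    return top, avg

tokens, receiver_state = project([0.1, 0.9, 0.0])  # hidden state leaning "multiply"
print(tokens)  # → ['multiply', 'answer']
```

Because both sides only need to agree on the token vocabulary, the same trick works across model families of different sizes.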
Full results: **[Benchmarks](docs/BENCHMARKS.md)** – 8 benchmarks, 5 models, 2 families, reproducible.

## How It Works

Replace `llm.invoke()` with `avp.generate()`. Your framework sees text in, text out.

| Framework | Integration |
|-----------|-------------|
| **CrewAI** | `BaseLLM.call()` override |
| **PydanticAI** | `FunctionModel` callback |
| **LlamaIndex** | `CustomLLM.complete()` override |
| **A2A / MCP** | Complementary – AVP handles tensor transfer, they handle routing |
| **HuggingFace** | Full latent pipeline (KV-cache + hidden states) |

See **[Framework Integration Guide](docs/FRAMEWORK_INTEGRATION.md)** for working examples.

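The "text in, text out" integration style might look like the following sketch. Everything here is hypothetical – `LatentChannel`, `make_llm_callable`, and `fake_generate` are invented names, not AVP's API. The point is that the framework only ever sees strings while latent context travels out of band:

```python
# Hypothetical drop-in wrapper: the framework calls a plain text->text
# function, while latent context is threaded through a side channel.

class LatentChannel:
    """Out-of-band store for latent context keyed by conversation id."""
    def __init__(self):
        self._store = {}

    def put(self, conv_id, context):
        self._store[conv_id] = context

    def get(self, conv_id):
        return self._store.get(conv_id)

def make_llm_callable(generate_fn, channel, conv_id):
    """Return a text-in/text-out callable a framework can use directly.
    `generate_fn(prompt, context)` stands in for an AVP-style generate."""
    def call(prompt: str) -> str:
        context = channel.get(conv_id)
        text, new_context = generate_fn(prompt, context)
        channel.put(conv_id, new_context)
        return text
    return call

# Demo generate_fn: counts how many turns of latent context it has seen.
def fake_generate(prompt, context):
    turns = (context or 0) + 1
    return f"reply to '{prompt}' (turn {turns})", turns

llm = make_llm_callable(fake_generate, LatentChannel(), conv_id="demo")
print(llm("plan the task"))  # → reply to 'plan the task' (turn 1)
print(llm("now solve it"))   # → reply to 'now solve it' (turn 2)
```

The wrapper is the whole adaptation surface: each framework row above is just a different place to hang this kind of callable.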
<details>
<summary><strong>vLLM</strong></summary>

**Latent transfer is not supported on vLLM yet.** The latent pipeline (`think()`/`generate()` with context) requires HuggingFace Transformers. `VLLMConnector` exists for text-only generation and model identity – it will error if you pass latent context. vLLM latent support is on the roadmap.

</details>

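A caller that must run on both backends can guard before passing latent context. The `supports_latent` flag and both connector classes below are invented for illustration – the only behavior taken from the text above is that a text-only backend errors when handed latent context:

```python
# Hypothetical capability guard: fall back to text when the backend
# cannot accept latent context (as with vLLM today).

class TextOnlyConnector:
    supports_latent = False  # invented flag, not a real AVP attribute
    def generate(self, prompt, context=None):
        if context is not None:
            raise ValueError("latent context not supported on this backend")
        return f"text-only answer to: {prompt}"

class LatentConnector:
    supports_latent = True
    def generate(self, prompt, context=None):
        tag = "with latent context" if context is not None else "without context"
        return f"latent answer to: {prompt} ({tag})"

def safe_generate(connector, prompt, context=None):
    """Pass latent context only when the connector advertises support;
    otherwise drop the latent payload and degrade to plain text."""
    if context is not None and not getattr(connector, "supports_latent", False):
        context = None  # degrade gracefully rather than crash
    return connector.generate(prompt, context=context)

print(safe_generate(TextOnlyConnector(), "2+2", context={"kv": "..."}))
print(safe_generate(LatentConnector(), "2+2", context={"kv": "..."}))
```

Whether to drop the payload silently or raise is a policy choice; raising is safer when the latent context is the whole point of the call.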
## Documentation

- **[AVP Specification](https://github.com/VectorArc/avp-spec)** – Binary format, handshake, transport
- **[Benchmarks](docs/BENCHMARKS.md)** – 8 benchmarks, 5 models, 2 families
- **[Framework Integration](docs/FRAMEWORK_INTEGRATION.md)** – LangGraph, CrewAI, PydanticAI, LlamaIndex
- **[Examples](examples/)** – Quickstart, cross-model, and agent demos
- **[CHANGELOG](CHANGELOG.md)**

## License

Apache 2.0 – see [LICENSE](LICENSE)