Merged
Commits
40 commits
369d5dc
add support for remote server
alexsin368 May 1, 2025
0f6191d
add steps to enable remote server
alexsin368 May 2, 2025
71f1608
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 2, 2025
bbcda06
remove use_remote_service
alexsin368 May 2, 2025
03f81c6
Merge branch 'agent-remote-service' of https://github.com/alexsin368/…
alexsin368 May 3, 2025
45cf931
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2025
35c67df
Merge branch 'main' into agent-remote-service
yinghu5 May 12, 2025
836c0dd
Merge branch 'main' into agent-remote-service
alexsin368 May 13, 2025
4899f79
add OpenAI models instructions, fix format of commands
alexsin368 May 14, 2025
101d133
Merge branch 'main' into agent-remote-service
alexsin368 May 15, 2025
b7c4acf
simplify ChatOpenAI instantiation
alexsin368 May 15, 2025
6586657
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2025
d288734
Revert "simplify ChatOpenAI instantiation"
alexsin368 May 15, 2025
848368f
add back check and logic for llm_engine, set openai_key argument
alexsin368 May 15, 2025
7d01d77
Merge branch 'agent-remote-service' of https://github.com/alexsin368/…
alexsin368 May 15, 2025
53aaaa5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2025
a70201f
Provide ARCH option for lvm-video-llama image build (#1630)
ZePan110 Apr 29, 2025
212e612
Add sglang microservice for supporting llama4 model (#1640)
lvliang-intel Apr 30, 2025
5fc478e
Remove invalid codeowner. (#1642)
ZePan110 Apr 30, 2025
1fe684c
add support for remote server
alexsin368 May 1, 2025
bd68f54
add steps to enable remote server
alexsin368 May 2, 2025
23f1f56
remove use_remote_service
alexsin368 May 2, 2025
d1d2ac1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 2, 2025
a9d9ad7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2025
1a1ff02
bug fix for chunk_size and overlap cause error in dataprep ingestion …
MSCetin37 May 2, 2025
11a79ff
MariaDB Vector integrations for retriever & dataprep services (#1645)
RazvanLiviuVarzaru May 6, 2025
5e1656f
update PR reviewers (#1651)
chensuyue May 7, 2025
69fea0d
Expand test matrix, find all tests use 3rd party Dockerfiles (#1676)
chensuyue May 7, 2025
388c264
fix the typo of README.md Comp (#1679)
yinghu5 May 10, 2025
3b42858
Fix request handle timeout issue (#1687)
lvliang-intel May 12, 2025
928e0f7
FEAT: Enable OPEA microservices to start as MCP servers (#1635)
Spycsh May 13, 2025
9be8f9f
Fix huggingface_hub API upgrade issue (#1691)
lvliang-intel May 13, 2025
0ffa6a6
add OpenAI models instructions, fix format of commands
alexsin368 May 14, 2025
f83070c
Fix dataprep opensearch ingest issue (#1697)
lvliang-intel May 14, 2025
72bc23b
Fix embedding issue with ArangoDB due to deprecated HuggingFace API (…
lvliang-intel May 14, 2025
b2d93ff
simplify ChatOpenAI instantiation
alexsin368 May 15, 2025
78001b0
Revert "simplify ChatOpenAI instantiation"
alexsin368 May 15, 2025
1f4b746
add back check and logic for llm_engine, set openai_key argument
alexsin368 May 15, 2025
79e0407
Merge branch 'agent-remote-service' of https://github.com/alexsin368/…
alexsin368 May 15, 2025
45376b9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2025
28 changes: 27 additions & 1 deletion comps/agent/src/README.md
Original file line number Diff line number Diff line change
@@ -82,7 +82,7 @@ for line in resp.iter_lines(decode_unicode=True):

**Note**:

1. Currently only `reract_llama` agent is enabled for assistants APIs.
1. Currently only `react_llama` agent is enabled for assistants APIs.
2. Not all keywords of OpenAI APIs are supported yet.

### 1.5 Agent memory
@@ -110,6 +110,32 @@ Examples of python code for multi-turn conversations using agent memory:

To run the two examples above, first launch the agent microservice using [this docker compose yaml](../../../tests/agent/reactllama.yaml).

### 1.6 Run LLMs from OpenAI

To run a model from OpenAI, set the environment variable `OPENAI_API_KEY`:

```bash
export OPENAI_API_KEY=<openai-api-key>
```

This variable also needs to be passed to the `docker run` command, or included in a YAML file when running `docker compose`.
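A minimal sketch (not part of this PR; the helper name is illustrative) of the check this implies: the key must be visible to the agent process inside the container, however it was injected.

```python
import os

# Fail fast if OPENAI_API_KEY is not visible to the process, mirroring what
# the agent expects when it instantiates an OpenAI-backed chat model.
def require_openai_key() -> str:
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key

os.environ["OPENAI_API_KEY"] = "sk-placeholder"  # illustrative value only
print(require_openai_key())
```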

### 1.7 Run LLMs with OpenAI-compatible APIs on Remote Servers

To run the text generation portion with LLMs deployed on a remote server, set the following environment variables:

```bash
export api_key=<openai-api-key>
export model=<model-card>
export LLM_ENDPOINT_URL=<inference-endpoint>
```

These variables also need to be passed to the `docker run` command, or included in a YAML file when running `docker compose`.

#### Notes

- Do not include the `/v1` suffix in `LLM_ENDPOINT_URL`; the agent appends it when constructing the OpenAI-compatible endpoint.
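The note above can be sketched as follows (an illustrative helper, not code from this PR): the agent appends `/v1` itself, so the exported `LLM_ENDPOINT_URL` must not already contain it.

```python
# Illustrative: given a bare inference endpoint, produce the OpenAI-compatible
# base URL the agent actually calls.
def openai_base_url(llm_endpoint_url: str) -> str:
    return llm_endpoint_url.rstrip("/") + "/v1"

print(openai_base_url("http://remote-host:8080"))  # http://remote-host:8080/v1
```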

## 🚀2. Start Agent Microservice

### 2.1 Build docker image for agent microservice
3 changes: 3 additions & 0 deletions comps/agent/src/integrations/config.py
@@ -17,6 +17,9 @@
if os.environ.get("llm_endpoint_url") is not None:
env_config += ["--llm_endpoint_url", os.environ["llm_endpoint_url"]]

if os.environ.get("api_key") is not None:
env_config += ["--api_key", os.environ["api_key"]]

if os.environ.get("llm_engine") is not None:
env_config += ["--llm_engine", os.environ["llm_engine"]]

25 changes: 15 additions & 10 deletions comps/agent/src/integrations/utils.py
@@ -7,6 +7,8 @@

from .config import env_config

LLM_ENDPOINT_URL_DEFAULT = "http://localhost:8080"


def format_date(date):
# input m/dd/yyyy hr:min
@@ -57,18 +59,20 @@ def setup_chat_model(args):
"streaming": args.stream,
}
if args.llm_engine == "vllm" or args.llm_engine == "tgi":
openai_endpoint = f"{args.llm_endpoint_url}/v1"
llm = ChatOpenAI(
openai_api_key="EMPTY",
openai_api_base=openai_endpoint,
model_name=args.model,
request_timeout=args.timeout,
**params,
)
openai_key = "EMPTY"
elif args.llm_engine == "openai":
llm = ChatOpenAI(model_name=args.model, request_timeout=args.timeout, **params)
openai_key = args.api_key
else:
raise ValueError("llm_engine must be vllm, tgi or openai")
raise ValueError("llm_engine must be vllm, tgi, or openai")

    openai_endpoint = None if args.llm_endpoint_url == LLM_ENDPOINT_URL_DEFAULT else args.llm_endpoint_url + "/v1"
llm = ChatOpenAI(
openai_api_key=openai_key,
openai_api_base=openai_endpoint,
model_name=args.model,
request_timeout=args.timeout,
**params,
)
return llm
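The merged `setup_chat_model` logic above can be exercised in isolation (a sketch with no langchain dependency; the helper name is illustrative, and string equality `==` is used to compare against the default endpoint): `vllm`/`tgi` use a dummy key with the remote endpoint, while `openai` uses the real key and leaves the base URL at the library default (`None`).

```python
LLM_ENDPOINT_URL_DEFAULT = "http://localhost:8080"

# Resolve (api key, base URL) the same way the merged code does before it
# passes them to ChatOpenAI.
def resolve_openai_params(llm_engine, api_key, llm_endpoint_url):
    if llm_engine in ("vllm", "tgi"):
        openai_key = "EMPTY"
    elif llm_engine == "openai":
        openai_key = api_key
    else:
        raise ValueError("llm_engine must be vllm, tgi, or openai")
    endpoint = None if llm_endpoint_url == LLM_ENDPOINT_URL_DEFAULT else llm_endpoint_url + "/v1"
    return openai_key, endpoint

print(resolve_openai_params("vllm", None, "http://remote-host:8080"))
```

With the default endpoint and `llm_engine=openai`, the endpoint resolves to `None`, so the client falls back to OpenAI's hosted API.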


@@ -162,6 +166,7 @@ def get_args():
parser.add_argument("--model", type=str, default="meta-llama/Meta-Llama-3-8B-Instruct")
parser.add_argument("--llm_engine", type=str, default="tgi")
parser.add_argument("--llm_endpoint_url", type=str, default="http://localhost:8080")
parser.add_argument("--api_key", type=str, default=None, help="API key to access remote server")
parser.add_argument("--max_new_tokens", type=int, default=1024)
parser.add_argument("--top_k", type=int, default=10)
parser.add_argument("--top_p", type=float, default=0.95)