1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -5,6 +5,7 @@
/AgentQnA/ abolfazl.shahbazi@intel.com kaokao.lv@intel.com minmin.hou@intel.com xinyu.ye@intel.com
/AudioQnA/ sihan.chen@intel.com wenjiao.yue@intel.com
/AvatarChatbot/ chun.tao@intel.com kaokao.lv@intel.com xinyu.ye@intel.com
/BrowserUseAgent/ letong.han@intel.com yi.a.yao@intel.com
/ChatQnA/ liang1.lv@intel.com letong.han@intel.com
/CodeGen/ liang1.lv@intel.com qing.yao@intel.com
/CodeTrans/ sihan.chen@intel.com letong.han@intel.com
21 changes: 21 additions & 0 deletions BrowserUseAgent/Dockerfile
@@ -0,0 +1,21 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

ARG IMAGE_REPO=opea
ARG BASE_TAG=latest
FROM $IMAGE_REPO/comps-base:$BASE_TAG

USER root

COPY ./requirements.txt $HOME/requirements.txt
COPY ./browser_use_agent.py $HOME/browser_use_agent.py

ARG uvpip='uv pip install --system --no-cache-dir'
RUN uv pip install --system --upgrade pip setuptools uv && \
    $uvpip pytest-playwright && \
    playwright install chromium --with-deps --no-shell && \
    $uvpip -r requirements.txt && \
    $uvpip posthog==5.4.0

USER user
ENTRYPOINT ["python", "browser_use_agent.py"]
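The `ARG uvpip='uv pip install --system --no-cache-dir'` line above stores a whole command prefix in a build variable so each `RUN` step can reuse it. A standalone sketch of the same shell mechanism, with `echo` standing in for `uv` so it runs anywhere:

```shell
# A command prefix kept in a variable and expanded per call,
# mirroring the Dockerfile's `$uvpip <package>` usage.
# (`echo` is a stand-in here; the real Dockerfile invokes `uv`.)
uvpip='echo uv pip install --system --no-cache-dir'
$uvpip pytest-playwright
$uvpip -r requirements.txt
```

Because the variable is expanded unquoted, the shell word-splits it back into the original command plus its per-call arguments.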
18 changes: 18 additions & 0 deletions BrowserUseAgent/README.md
@@ -0,0 +1,18 @@
# Browser-use Agent Application

The Browser-use Agent empowers anyone to automate repetitive web tasks. It drives a web browser to visit websites, interact with pages, and extract data. The application is powered by [browser-use](https://github.com/browser-use/browser-use) and the OPEA LLM serving microservice.

## Deployment Options

The table below lists the currently available deployment options. Each option documents in detail how to deploy this example on the selected hardware.

| Category | Deployment Option | Description |
| ---------------------- | ---------------------- | ----------------------------------------------------------------- |
| On-premise Deployments | Docker Compose (Gaudi) | [Deployment on Gaudi](./docker_compose/intel/hpu/gaudi/README.md) |

## Validated Configurations

| **Deploy Method** | **LLM Engine** | **LLM Model** | **Hardware** |
| ----------------- | -------------- | ---------------------------- | ------------ |
| Docker Compose | vLLM | Qwen/Qwen2.5-VL-32B-Instruct | Intel Gaudi |
| Docker Compose | vLLM | Qwen/Qwen2.5-VL-72B-Instruct | Intel Gaudi |
90 changes: 90 additions & 0 deletions BrowserUseAgent/browser_use_agent.py
@@ -0,0 +1,90 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


import os

from browser_use import Agent, BrowserProfile
from comps import opea_microservices, register_microservice
from comps.cores.telemetry.opea_telemetry import opea_telemetry
from fastapi import Request
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, SecretStr

LLM = None
BROWSER_PROFILE = None
LLM_ENDPOINT = os.getenv("LLM_ENDPOINT", "http://0.0.0.0:8008")
LLM_MODEL = os.getenv("LLM_MODEL", "Qwen/Qwen2.5-VL-32B-Instruct")


def initiate_llm_and_browser(llm_endpoint: str, model: str, secret_key: str = "sk-xxxxxx"):
    # Initialize global LLM and BrowserProfile if not already initialized
    global LLM, BROWSER_PROFILE
    if not LLM:
        LLM = ChatOpenAI(base_url=f"{llm_endpoint}/v1", model=model, api_key=SecretStr(secret_key), temperature=0.1)
    if not BROWSER_PROFILE:
        BROWSER_PROFILE = BrowserProfile(
            headless=True,
            chromium_sandbox=False,
        )
    return LLM, BROWSER_PROFILE


class BrowserUseRequest(BaseModel):
    task_prompt: str
    use_vision: bool = True
    secret_key: str = "sk-xxxxxx"
    llm_endpoint: str = LLM_ENDPOINT
    llm_model: str = LLM_MODEL
    agent_max_steps: int = 10


class BrowserUseResponse(BaseModel):
    is_success: bool = False
    model: str
    task_prompt: str
    use_vision: bool
    agent_researched_urls: list[str] = []
    agent_actions: list[str] = []
    agent_durations: float
    agent_steps: int
    final_result: str


@register_microservice(
    name="opea_service@browser_use_agent",
    endpoint="/v1/browser_use_agent",
    host="0.0.0.0",
    port=8022,
)
@opea_telemetry
async def run(request: Request):
    data = await request.json()
    chat_request = BrowserUseRequest.model_validate(data)
    llm, browser_profile = initiate_llm_and_browser(
        llm_endpoint=chat_request.llm_endpoint, model=chat_request.llm_model, secret_key=chat_request.secret_key
    )
    agent = Agent(
        task=chat_request.task_prompt,
        llm=llm,
        use_vision=chat_request.use_vision,
        enable_memory=False,
        browser_profile=browser_profile,
    )
    history = await agent.run(max_steps=chat_request.agent_max_steps)

    return BrowserUseResponse(
        is_success=history.is_successful() if history.is_successful() is not None else False,
        model=chat_request.llm_model,
        task_prompt=chat_request.task_prompt,
        use_vision=chat_request.use_vision,
        agent_researched_urls=history.urls(),
        agent_actions=history.action_names(),
        agent_durations=round(history.total_duration_seconds(), 3),
        agent_steps=history.number_of_steps(),
        final_result=history.final_result() if history.is_successful() else f"Task failed: {history.errors()}",
    )


if __name__ == "__main__":
    opea_microservices["opea_service@browser_use_agent"].start()
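A minimal client sketch for the endpoint above. The payload fields mirror `BrowserUseRequest` (omitted fields fall back to the service defaults); the prompt text is illustrative, and the commented-out request assumes the service is reachable on its default port 8022:

```python
import json

# Payload mirroring the BrowserUseRequest model; secret_key, llm_endpoint,
# and llm_model are omitted so the service-side defaults apply.
payload = {
    "task_prompt": "Open https://example.com and report the page heading.",
    "use_vision": True,
    "agent_max_steps": 10,
}
body = json.dumps(payload)
print(body)

# With the service running (requires network access):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8022/v1/browser_use_agent",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```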
94 changes: 94 additions & 0 deletions BrowserUseAgent/docker_compose/intel/hpu/gaudi/README.md
@@ -0,0 +1,94 @@
# Example BrowserUseAgent deployments on an Intel® Gaudi® Platform

This example covers the single-node on-premises deployment of the BrowserUseAgent example using OPEA components. It begins with a Quick Start section and then documents how to modify deployments, leverage new models, and configure the number of allocated devices.

**Note** This example requires access to a properly installed Intel® Gaudi® platform with a functional Docker service configured to use the habanalabs-container-runtime. Please consult the [Intel® Gaudi® software Installation Guide](https://docs.habana.ai/en/v1.20.1/Installation_Guide/Driver_Installation.html) for more information.

## Quick Start Deployment

This section describes how to quickly deploy and test the BrowserUseAgent service manually on an Intel® Gaudi® platform. The basic steps are:

1. [Access the Code](#access-the-code)
2. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
3. [Configure the Deployment Environment](#configure-the-deployment-environment)
4. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
5. [Check the Deployment Status](#check-the-deployment-status)
6. [Test the Pipeline](#test-the-pipeline)
7. [Cleanup the Deployment](#cleanup-the-deployment)

### Access the Code

Clone the GenAIExample repository and access the BrowserUseAgent Intel® Gaudi® platform Docker Compose files and supporting scripts:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/BrowserUseAgent/docker_compose/intel/hpu/gaudi/
```

Checkout a released version, such as v1.5:

```bash
git checkout v1.5
```

### Generate a HuggingFace Access Token

Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).

### Configure the Deployment Environment

To set up the environment variables for deploying the BrowserUseAgent services, source the _set_env.sh_ script in this directory:

```bash
source ./set_env.sh
```

The _set_env.sh_ script exports the required and optional environment variables used to configure the BrowserUseAgent services. For any variable that is not already set, the script falls back to a default value. Check that the resulting values fit your deployment environment.
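The fallback behavior relies on Bash's `${VAR-default}` parameter expansion, which the script uses internally; a quick standalone illustration (the variable name mirrors the script, the values are examples):

```shell
# ${VAR-default} expands to the default only when VAR is unset,
# so anything exported before sourcing set_env.sh takes priority.
unset NUM_CARDS
echo "NUM_CARDS=${NUM_CARDS-4}"   # unset -> the default is used

export NUM_CARDS=8
echo "NUM_CARDS=${NUM_CARDS-4}"   # already set -> the exported value is kept
```

This is why exporting, say, `LLM_MODEL_ID` before `source ./set_env.sh` overrides the script's defaults.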

### Deploy the Services Using Docker Compose

To deploy the BrowserUseAgent services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
docker compose up -d
```

The BrowserUseAgent Docker images are automatically downloaded from the OPEA registry and deployed on the Intel® Gaudi® platform.

### Check the Deployment Status

After running `docker compose`, check that all of the launched containers have started:

```bash
docker ps -a
```

For the default deployment, the following two containers should have started:

```
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
96cb590c749c opea/browser-use-agent:latest "python browser_use_…" 9 seconds ago Up 8 seconds 0.0.0.0:8022->8022/tcp, :::8022->8022/tcp browser-use-agent-server
8072e1c33a4b opea/vllm-gaudi:latest "python3 -m vllm.ent…" 9 seconds ago Up 8 seconds (health: starting) 0.0.0.0:8008->80/tcp, [::]:8008->80/tcp vllm-gaudi-server
```

### Test the Pipeline

If you don't have existing websites to test, follow the [guide](./../../../../tests/webarena/README.md) to deploy one in your local environment.

Once the BrowserUseAgent services are running, test the pipeline using the following command:

```bash
curl -X POST http://${host_ip}:${BROWSER_USE_AGENT_PORT}/v1/browser_use_agent \
-H "Content-Type: application/json" \
    -d '{"task_prompt": "Navigate to http://10.7.4.57:8083/admin and log in with the credentials: username: admin, password: admin1234. Then, find out what the top-2 best-selling products in 2022 are."}'
```

- Note: update the `task_prompt` to match the evaluation question relevant to your configured website.

### Cleanup the Deployment

To stop the containers associated with the deployment, execute the following command:

```bash
docker compose -f compose.yaml down
```
50 changes: 50 additions & 0 deletions BrowserUseAgent/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -0,0 +1,50 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

x-common-environment: &common-env
  no_proxy: ${no_proxy}
  http_proxy: ${http_proxy}
  https_proxy: ${https_proxy}

services:
  vllm-gaudi-server:
    image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-1.22.0}
    container_name: vllm-gaudi-server
    ports:
      - ${LLM_ENDPOINT_PORT:-8008}:80
    volumes:
      - "${DATA_PATH:-./data}:/data"
    environment:
      <<: *common-env
      HF_TOKEN: ${HF_TOKEN}
      HF_HOME: /data
      HABANA_VISIBLE_DEVICES: all
      OMPI_MCA_btl_vader_single_copy_mechanism: none
      LLM_MODEL_ID: ${LLM_MODEL_ID}
      VLLM_TORCH_PROFILER_DIR: "/mnt"
      VLLM_SKIP_WARMUP: "true"
      PT_HPU_ENABLE_LAZY_COLLECTIVES: "true"
    runtime: habana
    cap_add:
      - SYS_NICE
    ipc: host
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:80/health || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 150
    command: --model $LLM_MODEL_ID --tensor-parallel-size $NUM_CARDS --host 0.0.0.0 --port 80 --max-seq-len-to-capture $MAX_TOTAL_TOKENS

  browser-use-agent-server:
    image: ${REGISTRY:-opea}/browser-use-agent:${TAG:-latest}
    container_name: browser-use-agent-server
    depends_on:
      - vllm-gaudi-server
    ports:
      - ${BROWSER_USE_AGENT_PORT:-8022}:8022
    environment:
      <<: *common-env
      LLM_ENDPOINT: ${LLM_ENDPOINT-http://0.0.0.0:8008}
      LLM_MODEL: ${LLM_MODEL_ID-Qwen/Qwen2.5-VL-32B-Instruct}
    ipc: host
38 changes: 38 additions & 0 deletions BrowserUseAgent/docker_compose/intel/hpu/gaudi/set_env.sh
@@ -0,0 +1,38 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Navigate to the parent directory and source the environment
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd)

pushd "$SCRIPT_DIR/../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null

# Function to check if a variable is set
check_var() {
    if [ "$#" -ne 1 ]; then
        echo "Error: Usage: check_var <ENV_VARIABLE_NAME>" >&2
        return 2
    fi

    local var_name="$1"
    if [ -n "${!var_name}" ]; then
        # Variable value is non-empty
        return 0
    else
        # Variable is unset or set to an empty string
        return 1
    fi
}

check_var "HF_TOKEN" || echo "Warning: HF_TOKEN is not set; downloads of gated Hugging Face models may fail." >&2
export ip_address=$(hostname -I | awk '{print $1}')

export LLM_ENDPOINT_PORT="${LLM_ENDPOINT_PORT:-8008}"
export LLM_ENDPOINT="http://${ip_address}:${LLM_ENDPOINT_PORT}"
export DATA_PATH="${DATA_PATH-"./data"}"
export LLM_MODEL_ID="${LLM_MODEL_ID-"Qwen/Qwen2.5-VL-32B-Instruct"}"
export MAX_TOTAL_TOKENS="${MAX_TOTAL_TOKENS-12288}"
export NUM_CARDS="${NUM_CARDS-4}"
22 changes: 22 additions & 0 deletions BrowserUseAgent/docker_image_build/build.yaml
@@ -0,0 +1,22 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  browser-use-agent:
    build:
      args:
        IMAGE_REPO: ${REGISTRY}
        BASE_TAG: ${TAG}
        http_proxy: ${http_proxy}
        https_proxy: ${https_proxy}
        no_proxy: ${no_proxy}
      context: ../
      dockerfile: ./Dockerfile
    image: ${REGISTRY:-opea}/browser-use-agent:${TAG:-latest}

  vllm-gaudi:
    build:
      context: vllm-fork
      dockerfile: ./docker/Dockerfile.hpu
    extends: browser-use-agent
    image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-latest}
1 change: 1 addition & 0 deletions BrowserUseAgent/requirements.txt
@@ -0,0 +1 @@
browser-use==0.3.2