Add new DocIndexRetriever example #405
Merged
Commits
efa8275 Add DocIndexRetriever example (xuechendi)
6a8d9f9 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
3a1e208 update test file name (xuechendi)
484a033 add local_build in tests (xuechendi)
ec64b28 Merge branch 'main' into DocIndexRetriever (XuhuiRen)
8544e22 rebase to main and remove dependency to PR314 in UT (xuechendi)
ee56907 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
6d1b11d Fix code scan issue (xuechendi)
0464e99 2nd fix (xuechendi)
1eb8c4d add git_status and remove xeon UT (xuechendi)
809ea1e update UT (xuechendi)
770e7b2 update port since 8000 is not available (xuechendi)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| # DocRetriever Application | ||
|
|
||
| DocRetriever is a widely adopted use case for matching a user query against a set of free-text records. It is an essential part of a RAG system: it bridges the knowledge gap by dynamically fetching relevant information from external sources, so that generated responses remain factual and current. At the core of this architecture are vector databases, which enable efficient semantic retrieval. These databases store data as vectors, allowing a RAG system to quickly find the most pertinent documents or data points based on semantic similarity. | ||
|
|
||
| ## We provide DocRetriever with different deployment infrastructure | ||
|
|
||
| - [docker xeon version](docker/xeon/) => minimum endpoints, easy to set up | ||
| - [docker gaudi version](docker/gaudi/) => with extra tei_gaudi endpoint, faster | ||
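The idea of retrieving by semantic similarity, described in the README text above, can be illustrated with a small sketch. It is not part of this pull request: the three-dimensional vectors and document ids are hypothetical (real embedding models emit hundreds of dimensions), and a vector database would replace the linear scan with an index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k document ids whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings, for illustration only.
index = {
    "doc-opea": [0.9, 0.1, 0.0],
    "doc-redis": [0.1, 0.9, 0.0],
    "doc-cooking": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]
print(top_k(query, index))  # ['doc-opea', 'doc-redis']
```

The query vector points in nearly the same direction as `doc-opea`, so that document ranks first regardless of vector length, which is why cosine similarity is a common choice for comparing embeddings.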
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| FROM python:3.11-slim | ||
|
|
||
| COPY GenAIComps /home/user/GenAIComps | ||
|
|
||
| RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \ | ||
| libgl1-mesa-glx \ | ||
| libjemalloc-dev \ | ||
| vim \ | ||
| git | ||
|
|
||
| RUN useradd -m -s /bin/bash user && \ | ||
| mkdir -p /home/user && \ | ||
| chown -R user /home/user/ | ||
|
|
||
| RUN cd /home/user/ | ||
|
|
||
| RUN cd /home/user/GenAIComps && pip install --no-cache-dir --upgrade pip && \ | ||
| pip install -r /home/user/GenAIComps/requirements.txt | ||
|
|
||
| COPY GenAIExamples/DocIndexRetriever/docker/retrieval_tool.py /home/user/retrieval_tool.py | ||
|
|
||
| ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps | ||
|
|
||
| USER user | ||
|
|
||
| WORKDIR /home/user | ||
|
|
||
| ENTRYPOINT ["python", "retrieval_tool.py"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,126 @@ | ||
| # DocRetriever Application | ||
|
|
||
| DocRetriever is a widely adopted use case for matching a user query against a set of free-text records. It is an essential part of a RAG system: it bridges the knowledge gap by dynamically fetching relevant information from external sources, so that generated responses remain factual and current. At the core of this architecture are vector databases, which enable efficient semantic retrieval. These databases store data as vectors, allowing a RAG system to quickly find the most pertinent documents or data points based on semantic similarity. | ||
|
|
||
| ### 1. Build images for the necessary microservices (this step will not be needed once the Docker images are released) | ||
|
|
||
| - Embedding TEI Image | ||
|
|
||
| ```bash | ||
| git clone https://github.com/opea-project/GenAIComps.git | ||
| cd GenAIComps | ||
| docker build -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/langchain/docker/Dockerfile . | ||
| ``` | ||
|
|
||
| - Retriever Vector store Image | ||
|
|
||
| ```bash | ||
| docker build -t opea/retriever-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/langchain/redis/docker/Dockerfile . | ||
| ``` | ||
|
|
||
| - Rerank TEI Image | ||
|
|
||
| ```bash | ||
| docker build -t opea/reranking-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/tei/docker/Dockerfile . | ||
| ``` | ||
|
|
||
| - Dataprep Image | ||
|
|
||
| ```bash | ||
| docker build -t opea/dataprep-on-ray-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/langchain_ray/docker/Dockerfile . | ||
| ``` | ||
|
|
||
| ### 2. Build Images for MegaService | ||
|
|
||
| ```bash | ||
| cd .. | ||
| git clone https://github.com/opea-project/GenAIExamples.git | ||
| docker build --no-cache -t opea/doc-index-retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f GenAIExamples/DocIndexRetriever/docker/Dockerfile . | ||
| ``` | ||
|
|
||
| ### 3. Start all the service Docker containers | ||
|
|
||
| ```bash | ||
| export host_ip="YOUR IP ADDR" | ||
| export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token} | ||
| export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" | ||
| export RERANK_MODEL_ID="BAAI/bge-reranker-base" | ||
| export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:8090" | ||
| export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808" | ||
| export TGI_LLM_ENDPOINT="http://${host_ip}:8008" | ||
| export REDIS_URL="redis://${host_ip}:6379" | ||
| export INDEX_NAME="rag-redis" | ||
| export MEGA_SERVICE_HOST_IP=${host_ip} | ||
| export EMBEDDING_SERVICE_HOST_IP=${host_ip} | ||
| export RETRIEVER_SERVICE_HOST_IP=${host_ip} | ||
| export RERANK_SERVICE_HOST_IP=${host_ip} | ||
| export LLM_SERVICE_HOST_IP=${host_ip} | ||
| export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8889/v1/retrievaltool" | ||
| export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep" | ||
| export llm_hardware='xeon' #xeon, xpu, gaudi | ||
| cd GenAIExamples/DocIndexRetriever/docker/${llm_hardware}/ | ||
| docker compose -f docker-compose.yaml up -d | ||
| ``` | ||
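Before running `docker compose`, it can help to confirm that the variables exported above are actually set, since an empty value is passed silently into the containers. The following is a minimal sketch, not part of this pull request; the choice of which variables count as required is an assumption, and `missing_vars` is a hypothetical helper name.

```python
import os

# Names copied from the export block above; treating all of them as
# required is an assumption made for this sketch.
REQUIRED_VARS = [
    "host_ip",
    "HUGGINGFACEHUB_API_TOKEN",
    "EMBEDDING_MODEL_ID",
    "RERANK_MODEL_ID",
    "TEI_EMBEDDING_ENDPOINT",
    "TEI_RERANKING_ENDPOINT",
    "REDIS_URL",
    "INDEX_NAME",
]


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are unset or empty in the mapping `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Unset variables:", ", ".join(missing))
    else:
        print("All required variables are set.")
```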
|
|
||
| ### 4. Validation | ||
|
|
||
| Add Knowledge Base via HTTP Links: | ||
|
|
||
| ```bash | ||
| curl -X POST "http://${host_ip}:6007/v1/dataprep" \ | ||
| -H "Content-Type: multipart/form-data" \ | ||
| -F 'link_list=["https://opea.dev"]' | ||
|
|
||
| # expected output | ||
| {"status":200,"message":"Data preparation succeeded"} | ||
| ``` | ||
|
|
||
| Retrieve from the knowledge base: | ||
|
|
||
| ```bash | ||
| curl http://${host_ip}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{ | ||
| "text": "Explain the OPEA project?" | ||
| }' | ||
|
|
||
| # expected output | ||
| {"id":"354e62c703caac8c547b3061433ec5e8","reranked_docs":[{"id":"06d5a5cefc06cf9a9e0b5fa74a9f233c","text":"Close SearchsearchMenu WikiNewsCommunity Daysx-twitter linkedin github searchStreamlining implementation of enterprise-grade Generative AIEfficiently integrate secure, performant, and cost-effective Generative AI workflows into business value.TODAYOPEA..."}],"initial_query":"Explain the OPEA project?"} | ||
| ``` | ||
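The retrieval call above can also be issued from Python. The sketch below is not part of this pull request: it only builds the request and parses a response shaped like the expected output shown above, using the standard library. The helper names and the example address `192.0.2.10` are hypothetical.

```python
import json
import urllib.request


def build_retrieval_request(host_ip, text, port=8889):
    """Build (but do not send) the POST request from the curl example above."""
    url = f"http://{host_ip}:{port}/v1/retrievaltool"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def reranked_texts(response_body):
    """Extract document texts from a response shaped like the expected output above."""
    payload = json.loads(response_body)
    return [doc["text"] for doc in payload.get("reranked_docs", [])]


# Sending the request needs the running stack, so it is left commented out:
# request = build_retrieval_request("192.0.2.10", "Explain the OPEA project?")
# with urllib.request.urlopen(request) as resp:
#     print(reranked_texts(resp.read()))
```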
|
|
||
| ### 5. Troubleshooting | ||
|
|
||
| 1. Check that all containers are alive | ||
|
|
||
| ```bash | ||
| # redis vector store | ||
| docker container logs redis-vector-db | ||
| # dataprep to redis microservice, input document files | ||
| docker container logs dataprep-redis-server | ||
|
|
||
| # embedding microservice | ||
| curl http://${host_ip}:6000/v1/embeddings \ | ||
| -X POST \ | ||
| -d '{"text":"Explain the OPEA project"}' \ | ||
| -H 'Content-Type: application/json' > query | ||
| docker container logs embedding-tei-server | ||
|
|
||
| # if you used tei-gaudi | ||
| docker container logs tei-embedding-gaudi-server | ||
|
|
||
| # retriever microservice, input embedding output docs | ||
| curl http://${host_ip}:7000/v1/retrieval \ | ||
| -X POST \ | ||
| -d @query \ | ||
| -H 'Content-Type: application/json' > rerank_query | ||
| docker container logs retriever-redis-server | ||
|
|
||
|
|
||
| # reranking microservice | ||
| curl http://${host_ip}:8000/v1/reranking \ | ||
| -X POST \ | ||
| -d @rerank_query \ | ||
| -H 'Content-Type: application/json' > output | ||
| docker container logs reranking-tei-server | ||
|
|
||
| # megaservice gateway | ||
| docker container logs doc-index-retriever-server | ||
| ``` |
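The curl commands above pipe each stage's output into the next one through the files `query` and `rerank_query`. The same chain can be sketched in Python, passing each response directly to the next stage. This is an illustration, not part of this pull request: the ports and paths come from the commands above, while `run_chain` and `http_post_json` are hypothetical helper names, and the transport is injectable so the chain can be exercised without the running services.

```python
import json
import urllib.request

# Ports and paths are taken from the curl commands above.
STAGES = [
    ("embedding", 6000, "/v1/embeddings"),
    ("retriever", 7000, "/v1/retrieval"),
    ("reranking", 8000, "/v1/reranking"),
]


def http_post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def run_chain(host_ip, text, post=http_post_json):
    """Feed each stage's response into the next stage, as the curl commands do via files."""
    payload = {"text": text}
    for _name, port, path in STAGES:
        payload = post(f"http://{host_ip}:{port}{path}", payload)
    return payload
```

If a stage fails, calling `post` for that stage alone and then reading the matching `docker container logs` output narrows the problem to one microservice.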
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,125 @@ | ||
|
|
||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| version: "3.8" | ||
|
|
||
| services: | ||
| redis-vector-db: | ||
| image: redis/redis-stack:7.2.0-v9 | ||
| container_name: redis-vector-db | ||
| ports: | ||
| - "6379:6379" | ||
| - "8001:8001" | ||
| dataprep-redis-service: | ||
| image: opea/dataprep-on-ray-redis:latest | ||
| container_name: dataprep-redis-server | ||
| depends_on: | ||
| - redis-vector-db | ||
| ports: | ||
| - "6007:6007" | ||
| environment: | ||
| no_proxy: ${no_proxy} | ||
| http_proxy: ${http_proxy} | ||
| https_proxy: ${https_proxy} | ||
| REDIS_URL: ${REDIS_URL} | ||
| INDEX_NAME: ${INDEX_NAME} | ||
| tei-embedding-service: | ||
| image: ghcr.io/huggingface/tei-gaudi:latest | ||
| container_name: tei-embedding-gaudi-server | ||
| ports: | ||
| - "8090:80" | ||
| volumes: | ||
| - "./data:/data" | ||
| runtime: habana | ||
| cap_add: | ||
| - SYS_NICE | ||
| ipc: host | ||
| environment: | ||
| no_proxy: ${no_proxy} | ||
| http_proxy: ${http_proxy} | ||
| https_proxy: ${https_proxy} | ||
| HABANA_VISIBLE_DEVICES: all | ||
| OMPI_MCA_btl_vader_single_copy_mechanism: none | ||
| MAX_WARMUP_SEQUENCE_LENGTH: 512 | ||
| command: --model-id ${EMBEDDING_MODEL_ID} | ||
| embedding: | ||
| image: opea/embedding-tei:latest | ||
| container_name: embedding-tei-server | ||
| ports: | ||
| - "6000:6000" | ||
| ipc: host | ||
| depends_on: | ||
| - tei-embedding-service | ||
| environment: | ||
| no_proxy: ${no_proxy} | ||
| http_proxy: ${http_proxy} | ||
| https_proxy: ${https_proxy} | ||
| TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} | ||
| LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY} | ||
| LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2} | ||
| LANGCHAIN_PROJECT: "opea-embedding-service" | ||
| restart: unless-stopped | ||
| retriever: | ||
| image: opea/retriever-redis:latest | ||
| container_name: retriever-redis-server | ||
| depends_on: | ||
| - redis-vector-db | ||
| ports: | ||
| - "7000:7000" | ||
| ipc: host | ||
| environment: | ||
| no_proxy: ${no_proxy} | ||
| http_proxy: ${http_proxy} | ||
| https_proxy: ${https_proxy} | ||
| REDIS_URL: ${REDIS_URL} | ||
| INDEX_NAME: ${INDEX_NAME} | ||
| LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY} | ||
| LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2} | ||
| LANGCHAIN_PROJECT: "opea-retriever-service" | ||
| restart: unless-stopped | ||
| reranking: | ||
| image: opea/reranking-tei:latest | ||
| container_name: reranking-tei-server | ||
| ports: | ||
| - "8000:8000" | ||
| ipc: host | ||
| entrypoint: python local_reranking.py | ||
| environment: | ||
| no_proxy: ${no_proxy} | ||
| http_proxy: ${http_proxy} | ||
| https_proxy: ${https_proxy} | ||
| TEI_RERANKING_ENDPOINT: ${TEI_RERANKING_ENDPOINT} | ||
| HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} | ||
| HF_HUB_DISABLE_PROGRESS_BARS: 1 | ||
| HF_HUB_ENABLE_HF_TRANSFER: 0 | ||
| LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY} | ||
| LANGCHAIN_TRACING_V2: ${LANGCHAIN_TRACING_V2} | ||
| LANGCHAIN_PROJECT: "opea-reranking-service" | ||
| restart: unless-stopped | ||
| doc-index-retriever-server: | ||
| image: opea/doc-index-retriever:latest | ||
| container_name: doc-index-retriever-server | ||
| depends_on: | ||
| - redis-vector-db | ||
| - tei-embedding-service | ||
| - embedding | ||
| - retriever | ||
| - reranking | ||
| ports: | ||
| - "8889:8889" | ||
| environment: | ||
| - no_proxy=${no_proxy} | ||
| - https_proxy=${https_proxy} | ||
| - http_proxy=${http_proxy} | ||
| - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP} | ||
| - EMBEDDING_SERVICE_HOST_IP=${EMBEDDING_SERVICE_HOST_IP} | ||
| - RETRIEVER_SERVICE_HOST_IP=${RETRIEVER_SERVICE_HOST_IP} | ||
| - RERANK_SERVICE_HOST_IP=${RERANK_SERVICE_HOST_IP} | ||
| - LLM_SERVICE_HOST_IP=${LLM_SERVICE_HOST_IP} | ||
| ipc: host | ||
| restart: always | ||
|
|
||
| networks: | ||
| default: | ||
| driver: bridge |
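The compose file above publishes several fixed host ports, and the last commit in this pull request mentions that port 8000 was not available in one environment. A quick check for ports that are already taken can be sketched as follows. This is not part of the pull request; `port_is_free` and `busy_ports` are hypothetical helper names, and the check only reflects the moment it runs.

```python
import socket

# Host ports published by the compose file above.
HOST_PORTS = [6379, 8001, 6007, 8090, 6000, 7000, 8000, 8889]


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def busy_ports(ports=HOST_PORTS, host="0.0.0.0"):
    """Return the ports from `ports` that cannot be bound right now."""
    return [port for port in ports if not port_is_free(port, host)]


if __name__ == "__main__":
    print("Ports already in use:", busy_ports())
```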
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| import asyncio | ||
| import os | ||
|
|
||
| from comps import MicroService, RetrievalToolGateway, ServiceOrchestrator, ServiceType | ||
|
|
||
| # Ports are cast to int because os.getenv returns a string whenever the variable is set | ||
| MEGA_SERVICE_HOST_IP = os.getenv("MEGA_SERVICE_HOST_IP", "0.0.0.0") | ||
| MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8889)) | ||
| EMBEDDING_SERVICE_HOST_IP = os.getenv("EMBEDDING_SERVICE_HOST_IP", "0.0.0.0") | ||
| EMBEDDING_SERVICE_PORT = int(os.getenv("EMBEDDING_SERVICE_PORT", 6000)) | ||
| RETRIEVER_SERVICE_HOST_IP = os.getenv("RETRIEVER_SERVICE_HOST_IP", "0.0.0.0") | ||
| RETRIEVER_SERVICE_PORT = int(os.getenv("RETRIEVER_SERVICE_PORT", 7000)) | ||
| RERANK_SERVICE_HOST_IP = os.getenv("RERANK_SERVICE_HOST_IP", "0.0.0.0") | ||
| RERANK_SERVICE_PORT = int(os.getenv("RERANK_SERVICE_PORT", 8000)) | ||
|
|
||
|
|
||
| class RetrievalToolService: | ||
| def __init__(self, host="0.0.0.0", port=8000): | ||
| self.host = host | ||
| self.port = port | ||
| self.megaservice = ServiceOrchestrator() | ||
|
|
||
| def add_remote_service(self): | ||
| embedding = MicroService( | ||
| name="embedding", | ||
| host=EMBEDDING_SERVICE_HOST_IP, | ||
| port=EMBEDDING_SERVICE_PORT, | ||
| endpoint="/v1/embeddings", | ||
| use_remote_service=True, | ||
| service_type=ServiceType.EMBEDDING, | ||
| ) | ||
| retriever = MicroService( | ||
| name="retriever", | ||
| host=RETRIEVER_SERVICE_HOST_IP, | ||
| port=RETRIEVER_SERVICE_PORT, | ||
| endpoint="/v1/retrieval", | ||
| use_remote_service=True, | ||
| service_type=ServiceType.RETRIEVER, | ||
| ) | ||
| rerank = MicroService( | ||
| name="rerank", | ||
| host=RERANK_SERVICE_HOST_IP, | ||
| port=RERANK_SERVICE_PORT, | ||
| endpoint="/v1/reranking", | ||
| use_remote_service=True, | ||
| service_type=ServiceType.RERANK, | ||
| ) | ||
|
|
||
| self.megaservice.add(embedding).add(retriever).add(rerank) | ||
| self.megaservice.flow_to(embedding, retriever) | ||
| self.megaservice.flow_to(retriever, rerank) | ||
| self.gateway = RetrievalToolGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| chatqna = RetrievalToolService(host=MEGA_SERVICE_HOST_IP, port=MEGA_SERVICE_PORT) | ||
| chatqna.add_remote_service() |
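The `add(...)` and `flow_to(...)` calls in the file above wire three services into a chain: embedding, then retriever, then rerank. The toy class below illustrates that wiring pattern only. It is not the `ServiceOrchestrator` from `comps` and makes no claim about how that class works internally; `TinyOrchestrator` and the three stub functions are hypothetical, and the sketch handles chains and trees of steps, not joins.

```python
class TinyOrchestrator:
    """Toy stand-in for an orchestrator: named steps connected by directed edges."""

    def __init__(self):
        self.steps = {}  # step name -> callable
        self.edges = {}  # step name -> list of downstream step names

    def add(self, name, func):
        self.steps[name] = func
        self.edges.setdefault(name, [])
        return self  # returning self allows chained add() calls

    def flow_to(self, upstream, downstream):
        self.edges[upstream].append(downstream)

    def run(self, payload):
        """Run root steps on `payload`, then pass each result to its downstream steps."""
        downstream = {d for targets in self.edges.values() for d in targets}
        queue = [(name, payload) for name in self.steps if name not in downstream]
        results = {}
        while queue:
            name, data = queue.pop(0)
            results[name] = self.steps[name](data)
            queue.extend((nxt, results[name]) for nxt in self.edges[name])
        return results


# Stub stages standing in for the remote microservices.
def embed(query):
    return {"text": query, "embedding": [float(len(query))]}


def retrieve(doc):
    return {"query": doc["text"], "docs": ["doc-b", "doc-a"]}


def rerank(hit):
    return {"query": hit["query"], "reranked_docs": sorted(hit["docs"])}


pipeline = TinyOrchestrator()
pipeline.add("embedding", embed).add("retriever", retrieve).add("rerank", rerank)
pipeline.flow_to("embedding", "retriever")
pipeline.flow_to("retriever", "rerank")
results = pipeline.run("Explain the OPEA project?")
print(results["rerank"])
```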