cld2labs/HybridSearch #74
Open
arpannookala-12 wants to merge 20 commits into opea-project:main from cld2labs:cld2labs/HybridSearch
Changes from 9 commits
Commits
824732b Add HybridSearch sample solution (arpannookala-12)
30c6b7b Fix README repo URL, model config, and add required models section (arpannookala-12)
d7c6ae9 Fix docker compose command and add per-service log instructions (arpannookala-12)
e1336f3 Add per-model APISIX gateway endpoint support (arpannookala-12)
c7cb649 Fix reranker endpoint for Gaudi TEI and improve reranker-configuratio… (arpannookala-12)
962d962 Support dual reranker backends (Keycloak/APISIX + GenAI Gateway) (arpannookala-12)
e312590 Scope reranker config to GenAI Gateway only and simplify payload (arpannookala-12)
e992dd7 Narrow reranker config scope to GenAI Gateway + Xeon and note Keycloa… (arpannookala-12)
96f460e Add INFERENCE_BACKEND flag to support Gaudi TEI and Xeon vLLM (arpannookala-12)
c3e92ef Add INFERENCE_BACKEND note to README model config section (arpannookala-12)
40b1272 Fix LLM /v1 path for Keycloak+Gaudi: LLM is always vLLM, not TEI (arpannookala-12)
5524187 Fix reranker batching and token overflow for large document uploads (arpannookala-12)
828721e Document Xeon + Keycloak model endpoints with -vllmcpu suffix (arpannookala-12)
1fc68ca Address PR review comments: embedding batch size, payload routing, do… (arpannookala-12)
392acfb Fix reranker-configuration.md BASE_URL: revert /v1 from base URL (arpannookala-12)
33f85a1 Add .venv-dataset to bandit exclude_dirs in .bandit config (arpannookala-12)
7c2aaba Add SDLE security scan workflow for HybridSearch (arpannookala-12)
a6de873 Revert "Add .venv-dataset to bandit exclude_dirs in .bandit config" (arpannookala-12)
892fb8e Move code-scans.yaml to repo root .github/workflows (arpannookala-12)
a373f75 Remove code-scans.yaml after security scans passed (arpannookala-12)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| DEPLOYMENT_PHASE=production | ||
| SYSTEM_MODE=document | ||
| | |
| # Local URL Endpoint (only needed for non-public domains) | ||
| # If using a local domain like api.example.com mapped to localhost: | ||
| # Set this to: api.example.com (domain without https://) | ||
| # If using a public domain, set any placeholder value like: not-needed | ||
| LOCAL_URL_ENDPOINT=not-needed | ||
| | |
| # Service Ports | ||
| GATEWAY_PORT=8000 | ||
| EMBEDDING_PORT=8001 | ||
| RETRIEVAL_PORT=8002 | ||
| LLM_PORT=8003 | ||
| INGESTION_PORT=8004 | ||
| UI_PORT=8501 | ||
| | |
| # Inference Gateway Configuration | ||
| # GENAI_GATEWAY_URL: Base URL to your inference gateway (without /v1 suffix) | ||
| # - For GenAI Gateway: https://genai-gateway.example.com | ||
| # - For APISIX Gateway: https://apisix-gateway.example.com | ||
| GENAI_GATEWAY_URL=https://api.example.com | ||
| | |
| # GENAI_API_KEY: Authentication token/API key for the inference gateway | ||
| # - For GenAI Gateway: Your GenAI Gateway API key (litellm_master_key from vault.yml) | ||
| # To generate: use the generate-vault-secrets.sh script | ||
| # - For APISIX Gateway: Your APISIX authentication token | ||
| # To generate: use the generate-token.sh script (Keycloak client credentials) | ||
| GENAI_API_KEY=your-pre-generated-token-here | ||
| | |
| # Model Configuration | ||
| # IMPORTANT: Use the full model names as they appear in your inference service | ||
| # Check available models: curl https://your-gateway-url/v1/models -H "Authorization: Bearer your-token" | ||
| EMBEDDING_MODEL_ENDPOINT=BAAI/bge-base-en-v1.5 | ||
| EMBEDDING_MODEL_NAME=BAAI/bge-base-en-v1.5 | ||
| RERANKER_MODEL_ENDPOINT=BAAI/bge-reranker-base | ||
| RERANKER_MODEL_NAME=BAAI/bge-reranker-base | ||
| LLM_MODEL_ENDPOINT=Qwen/Qwen3-4B-Instruct-2507 | ||
| LLM_MODEL_NAME=Qwen/Qwen3-4B-Instruct-2507 | ||
| | |
| # Inference Backend Type | ||
| # Set to "tei" for Gaudi hardware (TEI serves at /embeddings and /rerank — no /v1 prefix) | ||
| # Set to "vllm" for Xeon hardware (vLLM serves at /v1/embeddings and /v1/rerank) | ||
| INFERENCE_BACKEND=vllm | ||
| | |
| # APISIX Gateway Per-Model Endpoints | ||
| # When using APISIX, uncomment and set these to the full URL with APISIX route path. | ||
| # Each model has its own APISIX route. Use exact route paths from your deployment. | ||
| # Example routes: /bge-base-en-v1.5/*, /bge-reranker-base/*, /Qwen3-4B-Instruct-2507/* | ||
| # EMBEDDING_API_ENDPOINT=https://api.example.com/bge-base-en-v1.5 | ||
| # RERANKER_API_ENDPOINT=https://api.example.com/bge-reranker-base | ||
| # LLM_API_ENDPOINT=https://api.example.com/Qwen3-4B-Instruct-2507 | ||
| | |
| # Retrieval Configuration | ||
| USE_RERANKING=true | ||
| | |
| TOP_K_DENSE=100 | ||
| TOP_K_SPARSE=100 | ||
| TOP_K_FUSION=50 | ||
| TOP_K_RERANK=10 | ||
| RRF_K=60 | ||
| | |
| # Ingestion Configuration | ||
| CHUNK_SIZE=512 | ||
| CHUNK_OVERLAP=50 | ||
| MAX_FILE_SIZE_MB=100 | ||
| SUPPORTED_FORMATS=pdf,docx,xlsx,ppt,txt | ||
| | |
| # UI Configuration | ||
| UI_TITLE=InsightMapper Lite | ||
| UI_PAGE_ICON= | ||
| UI_LAYOUT=wide | ||
| | |
| # Logging | ||
| LOG_LEVEL=INFO | ||
| | |
| # SSL Verification Settings | ||
| # Set to false only for dev with self-signed certs | ||
| VERIFY_SSL=true | ||
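
The gateway, backend, and per-model endpoint settings in the file above interact, so a small sketch may help reviewers check the intended routing. The helper below is hypothetical and is not taken from this PR: the function name `resolve_endpoints` and the returned keys are made up for illustration, and how the services in the PR actually assemble URLs is not visible in this diff. It assumes that a per-model `*_API_ENDPOINT` value, when set, replaces `GENAI_GATEWAY_URL` as the base, that `INFERENCE_BACKEND` only affects the embedding and rerank paths, and that the LLM always uses a `/v1` path (per the commit message for 40b1272). The `/v1/chat/completions` route is an assumption based on the OpenAI-compatible API that vLLM exposes.

```python
from typing import Dict, Mapping


def resolve_endpoints(env: Mapping[str, str]) -> Dict[str, str]:
    """Build request URLs from settings like those in the sample env file.

    Hypothetical helper for illustration only; not code from this PR.
    """
    backend = env.get("INFERENCE_BACKEND", "vllm").strip().lower()
    if backend not in ("tei", "vllm"):
        raise ValueError(
            f"INFERENCE_BACKEND must be 'tei' or 'vllm', got {backend!r}"
        )

    # The env file says the gateway URL is given without a /v1 suffix.
    gateway = env["GENAI_GATEWAY_URL"].rstrip("/")

    def base_for(override_key: str) -> str:
        # Assumption: a per-model APISIX endpoint, when set and non-empty,
        # replaces the shared gateway URL for that model.
        return (env.get(override_key) or gateway).rstrip("/")

    # Per the env file comments: TEI serves /embeddings and /rerank with no
    # prefix, while vLLM serves /v1/embeddings and /v1/rerank.
    prefix = "" if backend == "tei" else "/v1"

    return {
        "embeddings": f"{base_for('EMBEDDING_API_ENDPOINT')}{prefix}/embeddings",
        "rerank": f"{base_for('RERANKER_API_ENDPOINT')}{prefix}/rerank",
        # Assumption: the LLM is always served by vLLM, so it keeps /v1
        # regardless of INFERENCE_BACKEND.
        "chat": f"{base_for('LLM_API_ENDPOINT')}/v1/chat/completions",
    }


# Example with the placeholder values from the env file and a TEI backend.
example = {
    "GENAI_GATEWAY_URL": "https://api.example.com",
    "INFERENCE_BACKEND": "tei",
    "RERANKER_API_ENDPOINT": "https://api.example.com/bge-reranker-base",
}
for name, url in resolve_endpoints(example).items():
    print(f"{name}: {url}")
```

Keeping the prefix decision in one place means a switch between Gaudi (TEI) and Xeon (vLLM) touches a single variable, which matches how the env file describes `INFERENCE_BACKEND`.
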
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Git attributes for hybrid-search project | ||
| *.py text eol=lf | ||
| *.md text eol=lf | ||
| *.txt text eol=lf | ||
| *.yml text eol=lf | ||
| *.yaml text eol=lf | ||
| *.json text eol=lf | ||
| *.toml text eol=lf | ||
| *.sh text eol=lf | ||
| Dockerfile text eol=lf | ||
| .env* text eol=lf | ||
| *.pkl binary | ||
| *.bin binary | ||
| *.db binary | ||
| *.pdf binary |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| # Python | ||
| __pycache__/ | ||
| *.py[cod] | ||
| *$py.class | ||
| *.so | ||
| .Python | ||
| build/ | ||
| develop-eggs/ | ||
| dist/ | ||
| downloads/ | ||
| eggs/ | ||
| .eggs/ | ||
| lib/ | ||
| lib64/ | ||
| parts/ | ||
| sdist/ | ||
| var/ | ||
| wheels/ | ||
| *.egg-info/ | ||
| .installed.cfg | ||
| *.egg | ||
| MANIFEST | ||
| *.pyc | ||
| | |
| # Virtual Environment | ||
| venv/ | ||
| ENV/ | ||
| env/ | ||
| .venv | ||
| | |
| # Environment variables | ||
| .env | ||
| .env.local | ||
| .env.*.local | ||
| *.bak | ||
| .env.bak | ||
| | |
| # IDE | ||
| .vscode/ | ||
| .idea/ | ||
| *.swp | ||
| *.swo | ||
| *~ | ||
| .DS_Store | ||
| | |
| # Data directories | ||
| data/documents/* | ||
| data/indexes/* | ||
| data/*.db | ||
| data/*.sqlite | ||
| !data/documents/.gitkeep | ||
| !data/indexes/.gitkeep | ||
| | |
| # Ingestion service data (documents, indexes, metadata) | ||
| api/ingestion/data/documents/* | ||
| api/ingestion/data/indexes/* | ||
| api/ingestion/data/*.db | ||
| api/ingestion/data/*.sqlite | ||
| !api/ingestion/data/documents/.gitkeep | ||
| !api/ingestion/data/indexes/.gitkeep | ||
| | |
| # Model weights | ||
| models/ | ||
| *.bin | ||
| *.pt | ||
| *.pth | ||
| *.onnx | ||
| *.safetensors | ||
| | |
| # Logs | ||
| logs/ | ||
| *.log | ||
| | |
| # Jupyter Notebook | ||
| .ipynb_checkpoints | ||
| | |
| # Docker | ||
| *.tar | ||
| .dockerignore | ||
| | |
| # Testing | ||
| .pytest_cache/ | ||
| .coverage | ||
| htmlcov/ | ||
| .tox/ | ||
| .hypothesis/ | ||
| | |
| # Monitoring | ||
| monitoring/data/ | ||
| | |
| # Temporary files | ||
| tmp/ | ||
| temp/ | ||
| *.tmp | ||
| | |
| # API Keys (extra safety) | ||
| **/api_key* | ||
| **/secret* | ||
| **/*secret* |