20 commits
824732b
Add HybridSearch sample solution
arpannookala-12 Mar 10, 2026
30c6b7b
Fix README repo URL, model config, and add required models section
arpannookala-12 Mar 27, 2026
d7c6ae9
Fix docker compose command and add per-service log instructions
arpannookala-12 Mar 27, 2026
e1336f3
Add per-model APISIX gateway endpoint support
arpannookala-12 Mar 27, 2026
c7cb649
Fix reranker endpoint for Gaudi TEI and improve reranker-configuratio…
arpannookala-12 Apr 1, 2026
962d962
Support dual reranker backends (Keycloak/APISIX + GenAI Gateway)
arpannookala-12 Apr 2, 2026
e312590
Scope reranker config to GenAI Gateway only and simplify payload
arpannookala-12 Apr 3, 2026
e992dd7
Narrow reranker config scope to GenAI Gateway + Xeon and note Keycloa…
arpannookala-12 Apr 3, 2026
96f460e
Add INFERENCE_BACKEND flag to support Gaudi TEI and Xeon vLLM
arpannookala-12 Apr 3, 2026
c3e92ef
Add INFERENCE_BACKEND note to README model config section
arpannookala-12 Apr 3, 2026
40b1272
Fix LLM /v1 path for Keycloak+Gaudi: LLM is always vLLM, not TEI
arpannookala-12 Apr 3, 2026
5524187
Fix reranker batching and token overflow for large document uploads
arpannookala-12 Apr 6, 2026
828721e
Document Xeon + Keycloak model endpoints with -vllmcpu suffix
arpannookala-12 Apr 6, 2026
1fc68ca
Address PR review comments: embedding batch size, payload routing, do…
arpannookala-12 Apr 7, 2026
392acfb
Fix reranker-configuration.md BASE_URL: revert /v1 from base URL
arpannookala-12 Apr 7, 2026
33f85a1
Add .venv-dataset to bandit exclude_dirs in .bandit config
arpannookala-12 Apr 7, 2026
7c2aaba
Add SDLE security scan workflow for HybridSearch
arpannookala-12 Apr 7, 2026
a6de873
Revert "Add .venv-dataset to bandit exclude_dirs in .bandit config"
arpannookala-12 Apr 7, 2026
892fb8e
Move code-scans.yaml to repo root .github/workflows
arpannookala-12 Apr 7, 2026
a373f75
Remove code-scans.yaml after security scans passed
arpannookala-12 Apr 7, 2026
78 changes: 78 additions & 0 deletions sample_solutions/HybridSearch/.env.example
@@ -0,0 +1,78 @@
DEPLOYMENT_PHASE=production
SYSTEM_MODE=document

# Local URL Endpoint (only needed for non-public domains)
# If using a local domain like api.example.com mapped to localhost:
# Set this to: api.example.com (domain without https://)
# If using a public domain, set any placeholder value like: not-needed
LOCAL_URL_ENDPOINT=not-needed

# Service Ports
GATEWAY_PORT=8000
EMBEDDING_PORT=8001
RETRIEVAL_PORT=8002
LLM_PORT=8003
INGESTION_PORT=8004
UI_PORT=8501

# Inference Gateway Configuration
# GENAI_GATEWAY_URL: Base URL to your inference gateway (without /v1 suffix)
# - For GenAI Gateway: https://genai-gateway.example.com
# - For APISIX Gateway: https://apisix-gateway.example.com
GENAI_GATEWAY_URL=https://api.example.com

# GENAI_API_KEY: Authentication token/API key for the inference gateway
# - For GenAI Gateway: Your GenAI Gateway API key (litellm_master_key from vault.yml)
# To generate: use the generate-vault-secrets.sh script
# - For APISIX Gateway: Your APISIX authentication token
# To generate: use the generate-token.sh script (Keycloak client credentials)
GENAI_API_KEY=your-pre-generated-token-here

# Model Configuration
# IMPORTANT: Use the full model names as they appear in your inference service
# Check available models: curl https://your-gateway-url/v1/models -H "Authorization: Bearer your-token"
EMBEDDING_MODEL_ENDPOINT=BAAI/bge-base-en-v1.5
EMBEDDING_MODEL_NAME=BAAI/bge-base-en-v1.5
RERANKER_MODEL_ENDPOINT=BAAI/bge-reranker-base
RERANKER_MODEL_NAME=BAAI/bge-reranker-base
LLM_MODEL_ENDPOINT=Qwen/Qwen3-4B-Instruct-2507
LLM_MODEL_NAME=Qwen/Qwen3-4B-Instruct-2507

# Inference Backend Type
# Set to "tei" for Gaudi hardware (TEI serves at /embeddings and /rerank — no /v1 prefix)
# Set to "vllm" for Xeon hardware (vLLM serves at /v1/embeddings and /v1/rerank)
INFERENCE_BACKEND=vllm

# APISIX Gateway Per-Model Endpoints
# When using APISIX, uncomment and set these to the full URL with APISIX route path.
# Each model has its own APISIX route. Use exact route paths from your deployment.
# Example routes: /bge-base-en-v1.5/*, /bge-reranker-base/*, /Qwen3-4B-Instruct-2507/*
# EMBEDDING_API_ENDPOINT=https://api.example.com/bge-base-en-v1.5
# RERANKER_API_ENDPOINT=https://api.example.com/bge-reranker-base
# LLM_API_ENDPOINT=https://api.example.com/Qwen3-4B-Instruct-2507

# Retrieval Configuration
USE_RERANKING=true
TOP_K_DENSE=100
TOP_K_SPARSE=100
TOP_K_FUSION=50
TOP_K_RERANK=10
RRF_K=60

# Ingestion Configuration
CHUNK_SIZE=512
CHUNK_OVERLAP=50
MAX_FILE_SIZE_MB=100
SUPPORTED_FORMATS=pdf,docx,xlsx,ppt,txt

# UI Configuration
UI_TITLE=InsightMapper Lite
UI_PAGE_ICON=
UI_LAYOUT=wide

# Logging
LOG_LEVEL=INFO

# SSL Verification Settings
# Set to false only for dev with self-signed certs
VERIFY_SSL=true
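
Note on the endpoint settings in `.env.example`: the comments describe three interacting rules (the gateway base URL carries no `/v1` suffix, TEI and vLLM expose different paths, and an APISIX per-model endpoint replaces the base URL). The sketch below shows one way a service could combine them. It is not taken from the PR's code: `resolve_url`, `TASK_PATHS`, and `LLM_PATH` are hypothetical names, and the `/v1/chat/completions` path is an assumption based on vLLM's OpenAI-compatible server and on commit 40b1272 ("LLM is always vLLM, not TEI").

```python
import os

# Hypothetical lookup table (not from the PR): request path per backend,
# following the INFERENCE_BACKEND comments in .env.example.
TASK_PATHS = {
    "tei": {"embedding": "/embeddings", "reranker": "/rerank"},
    "vllm": {"embedding": "/v1/embeddings", "reranker": "/v1/rerank"},
}

# Assumption: the LLM is always served by vLLM, so it keeps the /v1 prefix
# regardless of INFERENCE_BACKEND.
LLM_PATH = "/v1/chat/completions"


def resolve_url(task, env=None):
    """Build the full request URL for 'embedding', 'reranker', or 'llm'."""
    env = os.environ if env is None else env

    backend = env.get("INFERENCE_BACKEND", "vllm").lower()
    if backend not in TASK_PATHS:
        raise ValueError(f"unsupported INFERENCE_BACKEND: {backend!r}")

    # A per-model APISIX endpoint (e.g. RERANKER_API_ENDPOINT), when set,
    # takes the place of the shared gateway base URL.
    base = env.get(f"{task.upper()}_API_ENDPOINT") or env["GENAI_GATEWAY_URL"]
    base = base.rstrip("/")  # tolerate a trailing slash in the configured URL

    path = LLM_PATH if task == "llm" else TASK_PATHS[backend][task]
    return base + path
```

Under these assumptions, `resolve_url("embedding", {"GENAI_GATEWAY_URL": "https://api.example.com", "INFERENCE_BACKEND": "tei"})` yields `https://api.example.com/embeddings`, while the same call with `INFERENCE_BACKEND=vllm` yields `https://api.example.com/v1/embeddings`. Keeping `/v1` out of the base URL is what lets a single setting serve both backends.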
15 changes: 15 additions & 0 deletions sample_solutions/HybridSearch/.gitattributes
@@ -0,0 +1,15 @@
# Git attributes for hybrid-search project
*.py text eol=lf
*.md text eol=lf
*.txt text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
*.json text eol=lf
*.toml text eol=lf
*.sh text eol=lf
Dockerfile text eol=lf
.env* text eol=lf
*.pkl binary
*.bin binary
*.db binary
*.pdf binary
99 changes: 99 additions & 0 deletions sample_solutions/HybridSearch/.gitignore
@@ -0,0 +1,99 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
*.pyc

# Virtual Environment
venv/
ENV/
env/
.venv

# Environment variables
.env
.env.local
.env.*.local
*.bak
.env.bak

# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Data directories
data/documents/*
data/indexes/*
data/*.db
data/*.sqlite
!data/documents/.gitkeep
!data/indexes/.gitkeep

# Ingestion service data (documents, indexes, metadata)
api/ingestion/data/documents/*
api/ingestion/data/indexes/*
api/ingestion/data/*.db
api/ingestion/data/*.sqlite
!api/ingestion/data/documents/.gitkeep
!api/ingestion/data/indexes/.gitkeep

# Model weights
models/
*.bin
*.pt
*.pth
*.onnx
*.safetensors

# Logs
logs/
*.log

# Jupyter Notebook
.ipynb_checkpoints

# Docker
*.tar
.dockerignore

# Testing
.pytest_cache/
.coverage
htmlcov/
.tox/
.hypothesis/

# Monitoring
monitoring/data/

# Temporary files
tmp/
temp/
*.tmp

# API Keys (extra safety)
**/api_key*
**/secret*
**/*secret*