Merged
Changes from 6 commits
Commits
55 commits
d87c288
enable streaming of multiple chunks
Jun 20, 2025
e5432bb
removed whitespace
Jun 20, 2025
a13e1d4
readme update
Jun 20, 2025
42b57fb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 20, 2025
e04febb
readme fix
Jun 20, 2025
ce89a31
Merge branch 'denvr_chat' of
Jun 20, 2025
4329ea2
added docsum compose remote
Jun 23, 2025
faf4810
docsum readme update
Jun 24, 2025
e640a71
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 24, 2025
6df5d64
codegen changes
Jun 26, 2025
53fc0f1
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jun 26, 2025
e33cf84
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 26, 2025
f125973
removed comments
Jun 26, 2025
7a0cdf7
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jun 26, 2025
96ac978
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 26, 2025
3908dc3
updated codegen readme
Jun 27, 2025
3a323bb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 27, 2025
4f9e097
modify env variable
Jun 30, 2025
c349e5f
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jun 30, 2025
1d9c25f
remove dependancy
Jun 30, 2025
7c59b54
added env variable
Jul 1, 2025
bff287b
updated env variable names
Jul 2, 2025
0882a24
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 2, 2025
7d391b4
modified compose
Jul 3, 2025
cb9f50c
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jul 3, 2025
227fba1
modify variable for endpoint
Jul 3, 2025
03968c3
prod suite ui update
Jul 3, 2025
6e8a0c4
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Jul 3, 2025
dfee2e7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 3, 2025
1a7889a
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Jul 18, 2025
ea19ce4
ui changes
Jul 24, 2025
43b4d5c
ui changes remove extra files
Jul 24, 2025
6811299
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 24, 2025
6fddf20
docsum ui bug fix
Jul 24, 2025
4993856
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Jul 24, 2025
e42b881
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jul 24, 2025
e34dd5b
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Jul 28, 2025
30ce8d6
added test
Jul 29, 2025
02fe611
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2025
d2b9871
test scripts
Jul 30, 2025
341a7a6
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jul 30, 2025
0940a40
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 30, 2025
e686789
ui bug fix
Jul 30, 2025
955dc1e
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Jul 30, 2025
4a2981e
fix typo
Jul 30, 2025
e05de6b
refactor code
Jul 31, 2025
bb58621
removed test scripts
Jul 31, 2025
7757e79
rollback compose.yaml fix
Jul 31, 2025
c0f1e3f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 31, 2025
2bc3d1c
rollback setenv issue
Jul 31, 2025
497dc62
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Jul 31, 2025
4abde42
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Aug 1, 2025
d541880
update readme
Aug 1, 2025
c987f17
Merge branch 'denvr_chat' of https://github.com/srinarayan-srikanthan…
Aug 1, 2025
6d2ac74
Merge branch 'main' into denvr_chat
srinarayan-srikanthan Aug 4, 2025
36 changes: 17 additions & 19 deletions ChatQnA/chatqna.py
@@ -175,25 +175,23 @@ def align_generator(self, gen, **kwargs):
     # b'data:{"id":"","object":"text_completion","created":1725530204,"model":"meta-llama/Meta-Llama-3-8B-Instruct","system_fingerprint":"2.0.1-native","choices":[{"index":0,"delta":{"role":"assistant","content":"?"},"logprobs":null,"finish_reason":null}]}\n\n'
     for line in gen:
         line = line.decode("utf-8")
-        start = line.find("{")
-        end = line.rfind("}") + 1
-
-        json_str = line[start:end]
-        try:
-            # sometimes yield empty chunk, do a fallback here
-            json_data = json.loads(json_str)
-            if "ops" in json_data and "op" in json_data["ops"][0]:
-                if "value" in json_data["ops"][0] and isinstance(json_data["ops"][0]["value"], str):
-                    yield f"data: {repr(json_data['ops'][0]['value'].encode('utf-8'))}\n\n"
-                else:
-                    pass
-            elif (
-                json_data["choices"][0]["finish_reason"] != "eos_token"
-                and "content" in json_data["choices"][0]["delta"]
-            ):
-                yield f"data: {repr(json_data['choices'][0]['delta']['content'].encode('utf-8'))}\n\n"
-        except Exception as e:
-            yield f"data: {repr(json_str.encode('utf-8'))}\n\n"
+        chunks = [chunk.strip() for chunk in line.split("\n\n") if chunk.strip()]
+        for line in chunks:
+            start = line.find("{")
+            end = line.rfind("}") + 1
+            json_str = line[start:end]
+            try:
+                # sometimes yield empty chunk, do a fallback here
+                json_data = json.loads(json_str)
+                if "ops" in json_data and "op" in json_data["ops"][0]:
+                    if "value" in json_data["ops"][0] and isinstance(json_data["ops"][0]["value"], str):
+                        yield f"data: {repr(json_data['ops'][0]['value'].encode('utf-8'))}\n\n"
+                    else:
+                        pass
+                elif "content" in json_data["choices"][0]["delta"]:
+                    yield f"data: {repr(json_data['choices'][0]['delta']['content'].encode('utf-8'))}\n\n"
+            except Exception as e:
+                yield f"data: {repr(json_str.encode('utf-8'))}\n\n"
     yield "data: [DONE]\n\n"

Review comment on lines +179 to +182, Copilot AI, Jun 20, 2025:

[nitpick] Consider renaming the inner loop variable (e.g., to 'chunk') to avoid shadowing the outer 'line' variable, which can improve code clarity.
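The change above handles a single network read that carries more than one server-sent event. A minimal standalone sketch of that idea follows, using the `chunk` name the review comment suggests. The helper names `iter_sse_payloads` and `contents` are hypothetical and not part of `chatqna.py`, and unlike the PR code this sketch skips unparseable fragments instead of echoing them back.

```python
import json


def iter_sse_payloads(raw: bytes):
    """Yield one parsed JSON object per SSE event found in a raw read."""
    text = raw.decode("utf-8")
    # One read may hold several events, each terminated by a blank line.
    for chunk in (part.strip() for part in text.split("\n\n")):
        if not chunk:
            continue
        start = chunk.find("{")
        end = chunk.rfind("}") + 1
        if start == -1 or end == 0:
            continue  # e.g. "data: [DONE]" carries no JSON object
        try:
            yield json.loads(chunk[start:end])
        except json.JSONDecodeError:
            continue  # this sketch drops malformed fragments


def contents(raw: bytes):
    """Collect delta contents from OpenAI-style streaming events."""
    out = []
    for event in iter_sse_payloads(raw):
        delta = event["choices"][0]["delta"]
        if "content" in delta:
            out.append(delta["content"])
    return out
```

Because the inner loop variable is named `chunk`, the outer read is never shadowed, which is the clarity point raised in the review.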


12 changes: 12 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/xeon/README.md
@@ -73,6 +73,17 @@ CPU example with Open Telemetry feature:
docker compose -f compose.yaml -f compose.telemetry.yaml up -d
```

To deploy ChatQnA services with remote endpoints, set the environment variables below and start the stack with `compose_remote.yaml`.

**Note**: Set `REMOTE_ENDPOINT` to the base URL only. For example, if the full endpoint is `https://api.inference.denvrdata.com/v1/chat/completions`, set `REMOTE_ENDPOINT` to `https://api.inference.denvrdata.com`.

```bash
export REMOTE_ENDPOINT=<endpoint-url>
export LLM_MODEL_ID=<model-id>
export OPENAI_API_KEY=<API-KEY>
docker compose -f compose_remote.yaml up -d
```
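As the note says, `REMOTE_ENDPOINT` holds only the scheme and host of the provider URL. One possible way to derive it from a full chat completions URL is sketched here; the helper name `base_endpoint` is hypothetical, and the sketch assumes the service appends the `/v1/chat/completions` path itself, as the note implies.

```python
from urllib.parse import urlsplit


def base_endpoint(url: str) -> str:
    """Return scheme://host[:port] for a full API URL, dropping path and query."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


full_url = "https://api.inference.denvrdata.com/v1/chat/completions"
print(base_endpoint(full_url))
```

The printed value is what would be exported as `REMOTE_ENDPOINT`.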

**Note**: Developers should build the Docker images from source when:

- Developing off the git main branch (as the container's ports in the repo may be different from the published docker image).
@@ -147,6 +158,7 @@ In the context of deploying a ChatQnA pipeline on an Intel® Xeon® platform, we
| File | Description |
| ------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework and redis as vector database |
| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLM serving and Redis as the vector database |
| [compose_milvus.yaml](./compose_milvus.yaml) | Uses Milvus as the vector database. All other configurations remain the same as the default |
| [compose_pinecone.yaml](./compose_pinecone.yaml) | Uses Pinecone as the vector database. All other configurations remain the same as the default. For more details, refer to [README_pinecone.md](./README_pinecone.md). |
| [compose_qdrant.yaml](./compose_qdrant.yaml) | Uses Qdrant as the vector database. All other configurations remain the same as the default. For more details, refer to [README_qdrant.md](./README_qdrant.md). |
@@ -102,7 +102,7 @@ services:
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=${REMOTE_ENDPOINT}
-      - OPENAI_API_KEY= ${OPENAI_API_KEY}
+      - OPENAI_API_KEY=${OPENAI_API_KEY}
       - LLM_SERVER_PORT=80
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
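The one-character fix to `OPENAI_API_KEY` matters if the entry is split on the first `=` without trimming, because the stray space then becomes part of the key. The sketch below only models that splitting rule in Python as an assumption; it does not call Docker Compose, and `parse_env_entry` and the sample key are hypothetical.

```python
def parse_env_entry(entry: str) -> tuple[str, str]:
    """Split a KEY=VALUE entry on the first '=' and keep whitespace as-is."""
    key, _, value = entry.partition("=")
    return key, value


# Before the fix: the value starts with a space.
_, bad_value = parse_env_entry("OPENAI_API_KEY= sk-example")
# After the fix: the value is the key alone.
_, good_value = parse_env_entry("OPENAI_API_KEY=sk-example")

print(repr(f"Bearer {bad_value}"))
print(repr(f"Bearer {good_value}"))
```

Under that assumption, the first header carries two spaces after `Bearer`, which a remote endpoint may reject as an invalid credential.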