Adds audio querying to MultimodalQ&A gateway by mhbuehler · Pull Request #974 · opea-project/GenAIComps

mhbuehler · 2024-12-04T19:00:25Z

Description

Adds ASR endpoint, speech audio processing, prompt construction, and return of decoded audio in response metadata. This goes with GenAIExamples PR: opea-project/GenAIExamples#1225.

Issues

Part of the MultimodalQnA Audio & Image Enhancements RFC

Type of change

New feature (non-breaking change which adds new functionality)

Dependencies

N/A

Tests

Automated tests were added to GenAIExamples.

* added in audio dict creation
* separated audio from prompt
* added ASR endpoint
* removed ASR endpoints from mm embedding
* edited return logic, fixed function call
* added megaservice to elif
* reworked helper func
* Append audio to prompt
* Reworked handle messages, added metadata
* Moved dictionary logic to right place
* changed logic to rely on message len
* list --> empty str

---------

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>

for more information, see https://pre-commit.ci

Signed-off-by: okhleif-IL <omar.khleif@intel.com>

Fixed role bug where enumeration was wrong

codecov · 2024-12-04T23:23:46Z

Codecov Report

Attention: Patch coverage is 66.66667% with 25 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| comps/cores/mega/gateway.py | 66.66% | 25 Missing ⚠️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| comps/cores/mega/gateway.py | `31.43% <66.66%> (+3.29%)` | ⬆️ |
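The two numbers in the report are enough to recover the size of the patch: 25 missing lines at 66.67% patch coverage implies roughly 75 coverable lines, 50 of them exercised by tests. A minimal sketch of that arithmetic follows; the function name is illustrative and is not part of Codecov.

```python
def lines_from_patch_coverage(missing: int, coverage_pct: float) -> tuple[int, int]:
    """Return (total_lines, covered_lines) implied by a missing-line count
    and a patch coverage percentage."""
    # missing = total * (1 - coverage), so total = missing / (1 - coverage)
    total = round(missing / (1 - coverage_pct / 100))
    return total, total - missing


total, covered = lines_from_patch_coverage(25, 66.66667)
print(total, covered)  # 75 50
```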

comps/cores/mega/gateway.py

ashahba

LGTM!

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>

Adds unit test coverage for audio query

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>

Fix port number placement

mkbhanda

Looks good to me!

mkbhanda · 2024-12-06T19:13:28Z

comps/cores/mega/gateway.py

            return prompt

+    def convert_audio_to_text(self, audio):
+        # translate audio to text by passing in dictionary to ASR


Quirky comment! "Dictionary" is a data type here, but it can be confused with the English word "dictionary" (word meanings).
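One way to sidestep the ambiguity is to describe what is being sent rather than its Python type. The sketch below is only an illustration: `build_asr_payload` is a hypothetical helper, and it assumes the audio arrives as an already-encoded string. Only the `byte_str` key is taken from the diff in this PR.

```python
import json


def build_asr_payload(encoded_audio: str) -> str:
    """Serialize encoded audio into the JSON request body for the ASR service.

    Hypothetical helper for illustration; only the "byte_str" key comes
    from the PR diff.
    """
    # The ASR request body carries the audio under the "byte_str" key.
    return json.dumps({"byte_str": encoded_audio})
```

A docstring phrased as "request body" or "payload" leaves no room for the word-meanings reading of "dictionary".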

mkbhanda · 2024-12-06T19:15:23Z

comps/cores/mega/gateway.py

+        else:
+            input_dict = {"byte_str": audio[0]}
+
+        response = requests.post(self.asr_endpoint, data=json.dumps(input_dict), proxies={"http": None})


Should proxies be read from some environment variable for a more general solution?

Why is this setting proxies in the first place? Shouldn't those be set well before this point?
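For reference, `requests` already consults the conventional `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` variables when no `proxies` argument is passed, so the simplest answer to both questions may be to drop the argument and configure the environment instead. If the gateway wanted to resolve proxies explicitly, a rough sketch could look like the following. Both helpers are hypothetical, use only the standard library, and implement a simplified `NO_PROXY` match (exact host or domain suffix), not the full rules `requests` applies.

```python
import os
from typing import Mapping, Optional


def proxies_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a requests-style proxies dict from conventional proxy variables.

    Hypothetical helper, not part of the PR. Lowercase names win over
    uppercase ones when both are set.
    """
    env = os.environ if env is None else env
    proxies = {}
    for scheme in ("http", "https"):
        value = env.get(f"{scheme}_proxy") or env.get(f"{scheme.upper()}_PROXY")
        if value:
            proxies[scheme] = value
    return proxies


def host_in_no_proxy(host: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if host matches an entry in no_proxy / NO_PROXY.

    Simplified matching: "*", an exact host name, or a domain suffix.
    """
    env = os.environ if env is None else env
    raw = env.get("no_proxy") or env.get("NO_PROXY") or ""
    entries = [e.strip().lstrip(".") for e in raw.split(",") if e.strip()]
    return any(e == "*" or host == e or host.endswith("." + e) for e in entries)
```

With helpers like these, the call site could pass `proxies={}` for hosts listed in `NO_PROXY` and `proxies_from_env()` otherwise, rather than hard-coding `{"http": None}`.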

eero-t · 2024-12-09T12:23:50Z

tests/cores/mega/test_multimodalqna_gateway.py

 import requests
 from fastapi import Request

+os.environ["ASR_SERVICE_PORT"] = "8086"


Why does this override the environment instead of taking the value from it?
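One possible way to address this, sketched under the assumption that 8086 is only meant as a fallback, is to set the variable only when the caller has not already exported it. `ensure_asr_port` and `DEFAULT_ASR_PORT` are hypothetical names used for illustration.

```python
import os
from typing import MutableMapping

DEFAULT_ASR_PORT = "8086"  # the value hard-coded in the PR's test


def ensure_asr_port(env: MutableMapping[str, str] = os.environ) -> int:
    """Return the ASR port, keeping any value that is already set.

    setdefault only writes the fallback when ASR_SERVICE_PORT is absent,
    so a value exported by the caller is respected.
    """
    env.setdefault("ASR_SERVICE_PORT", DEFAULT_ASR_PORT)
    return int(env["ASR_SERVICE_PORT"])
```

Called with no arguments at the top of the test module, this leaves a CI-provided port untouched and falls back to 8086 for local runs.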

ashahba · 2024-12-10T01:04:54Z

Labeling this as WIP since we may not need it after all.

ashahba · 2024-12-10T23:41:31Z

Looking at the new changes in GenAIExamples, it seems we don't need this PR after all, since opea-project/GenAIExamples#1225 is self-contained.

mhbuehler requested a review from lvliang-intel as a code owner December 4, 2024 19:00

mhbuehler and others added 3 commits December 4, 2024 11:01

Merge branch 'main' into mmqna-audio-query

e1e5fde

[pre-commit.ci] auto fixes from pre-commit.com hooks

70c54e1

for more information, see https://pre-commit.ci

fixed role bug where i never was > 0

6a71843

Signed-off-by: okhleif-IL <omar.khleif@intel.com>

mhbuehler mentioned this pull request Dec 4, 2024

Adds audio querying to MultimodalQ&A Example opea-project/GenAIExamples#1225

Merged

1 task

okhleif-10 and others added 3 commits December 4, 2024 15:14

removed whitespace

615459b

Signed-off-by: okhleif-IL <omar.khleif@intel.com>

Merge pull request #13 from mhbuehler/omar/role-debug

1753473

Fixed role bug where enumeration was wrong

[pre-commit.ci] auto fixes from pre-commit.com hooks

dcafe8d

for more information, see https://pre-commit.ci

ashahba added WIP r1.2 labels Dec 4, 2024

ashahba added this to the v1.2 milestone Dec 4, 2024

ashahba reviewed Dec 5, 2024

comps/cores/mega/gateway.py

ashahba approved these changes Dec 5, 2024

mhbuehler and others added 8 commits December 5, 2024 14:45

Adds unit test coverage for audio query

fa47959

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>

Port number fix

37826be

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>

Formatting

40d34db

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>

Merge pull request #14 from mhbuehler/melanie/add_test_coverage

6f2a753

Adds unit test coverage for audio query

[pre-commit.ci] auto fixes from pre-commit.com hooks

a665c3c

for more information, see https://pre-commit.ci

Merge branch 'main' into mmqna-audio-query

4a5c8ea

Fixed place where port number is set

d9ab567

Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>

Merge pull request #15 from mhbuehler/melanie/port_placement

75b135f

Fix port number placement

ashahba removed the WIP label Dec 6, 2024

mkbhanda approved these changes Dec 6, 2024

eero-t reviewed Dec 9, 2024

okhleif-10 mentioned this pull request Dec 9, 2024

Moved Audio Query Gateway changes to multimodalqna.py mhbuehler/GenAIExamples#29

Merged

4 tasks

ashahba added the WIP label Dec 10, 2024

ashahba closed this Dec 10, 2024

mhbuehler wants to merge 15 commits into opea-project:main from mhbuehler:mmqna-audio-query

5 participants
