Add image query support to the backend microservices#12
Add image query support to the backend microservices#12dmsuehir merged 39 commits intommqna-image-queryfrom
Conversation
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com> * added in audio dict creation Signed-off-by: okhleif-IL <omar.khleif@intel.com> * separated audio from prompt Signed-off-by: okhleif-IL <omar.khleif@intel.com> * added ASR endpoint Signed-off-by: okhleif-IL <omar.khleif@intel.com> * removed ASR endpoints from mm embedding Signed-off-by: okhleif-IL <omar.khleif@intel.com> * edited return logic, fixed function call Signed-off-by: okhleif-IL <omar.khleif@intel.com> * added megaservice to elif Signed-off-by: okhleif-IL <omar.khleif@intel.com> * reworked helper func Signed-off-by: okhleif-IL <omar.khleif@intel.com> * Append audio to prompt Signed-off-by: okhleif-IL <omar.khleif@intel.com> * Reworked handle messages, added metadata Signed-off-by: okhleif-IL <omar.khleif@intel.com> * Moved dictionary logic to right place Signed-off-by: okhleif-IL <omar.khleif@intel.com> * changed logic to rely on message len Signed-off-by: okhleif-IL <omar.khleif@intel.com> * list --> empty str Signed-off-by: okhleif-IL <omar.khleif@intel.com> --------- Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com> Signed-off-by: okhleif-IL <omar.khleif@intel.com> Signed-off-by: dmsuehir <dina.s.jones@intel.com>
…ina/image_query Signed-off-by: dmsuehir <dina.s.jones@intel.com>
for more information, see https://pre-commit.ci
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: okhleif-IL <omar.khleif@intel.com>
Fixed role bug where enumeration was wrong
for more information, see https://pre-commit.ci
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
…nto dina/image_query Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
| else: | ||
| print("request is from user.") | ||
| text = req_dict["prompt"] | ||
| text = f"<image>\nUSER: {text}\nASSISTANT:" |
There was a problem hiding this comment.
It used to be that the LVM microservice would always add <image>\n and USER: to the beginning of the prompt that was created by the gateway, however I had to change that since images can now to scattered throughout the conversation (we can no longer just attach the single image to the first message in the conversation). I updated this mock of the LVM microservice to do something similar to what the actual LVM service does where it checks how many image tags already exist in the prompt, and adds extras if they are needed. The reason why we might need extras are when the retriever also gets an image from the vector store.
| ) | ||
| # print(result_dict) | ||
| self.assertEqual(result_dict[self.lvm.name]["text"], "<image>\nUSER: chao, \nASSISTANT:") | ||
| self.assertEqual(result_dict[self.lvm.name]["text"], "USER: <image>\nchao, \nASSISTANT:") |
There was a problem hiding this comment.
Previously, the prompts had <image>\nUSER: but the prompt format documented by huggingface is the other way around USER: <image>\n and also since we will now have images interleaved in the conversation, I changed it to match the HF docs.
| formats than the default LLaVA 1.5 model. | ||
| """ | ||
|
|
||
| # Models to test and their expected prompts |
There was a problem hiding this comment.
Normally I would've parameterized this test with the pytest decorator, however it seems like that won't work in this case because of them subclassing unittest.IsolatedAsyncioTestCase. Apparently there's another library called parameterized that could do it, but the unittest container that this is run in doesn't have that dependency and it's easier to avoid adding a third party dependency (stackoverflow reference for the issue).
So, instead I just have lists of the models and their expected prompt format, and I'm looping those to do the checks.
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Adds unit test coverage for audio query
for more information, see https://pre-commit.ci
Signed-off-by: Melanie Buehler <melanie.h.buehler@intel.com>
Fix port number placement
okhleif-10
left a comment
There was a problem hiding this comment.
LGTM, do all tests pass?
| # Multimodal RAG QnA With Videos has not yet accepts image as input during QnA. | ||
| num_messages = len(data["messages"]) if isinstance(data["messages"], list) else 1 | ||
|
|
||
| # Multimodal RAG QnA With Videos has not yet accepts image as input during QnA. |
There was a problem hiding this comment.
Can you remove this comment or update it for accuracy?
| self.assertEqual(len(b64_types["image"]), 2) | ||
| finally: | ||
| test_gateway.stop() | ||
|
|
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
…mples Signed-off-by: dmsuehir <dina.s.jones@intel.com>
|
I rebased and synced this branch with |
|
|
||
| if echo "$CONTENT" | grep -q "retrieved_docs"; then | ||
| echo "[ retriever ] Content has retrieved_docs as expected." | ||
| if echo "$CONTENT" | grep -q "retrieved_docs"; then |
There was a problem hiding this comment.
Is this grep supposed to look for img_b64_str?
There was a problem hiding this comment.
Yes, good point. I will fix this.
Signed-off-by: dmsuehir <dina.s.jones@intel.com>
| logflag = os.getenv("LOGFLAG", False) | ||
|
|
||
| # The maximum number of images that should be sent to the LVM | ||
| max_images = int(os.getenv("MAX_IMAGES", 1)) |
There was a problem hiding this comment.
In this line, is 1 being set manually? Or is it a default?
There was a problem hiding this comment.
This means that if MAX_IMAGES is unset, it will default to 1
Description
Updates the following microservices:
Issues
https://github.com/opea-project/docs/blob/main/community/rfcs/24-10-02-GenAIExamples-001-Image_and_Audio_Support_in_MultimodalQnA.md
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
No new dependencies
Tests
Tests have been updated