new: DataprepRequest model #1525
Looking for some assistance regarding the CI failures: what changes are we missing? It seems as if the models are not being recognized as valid input.
|
Hi @aMahanna, thanks for your great contribution!
from typing import Annotated, List, Optional, Union
from fastapi import Depends, File, Form, HTTPException, UploadFile

class DataprepRequest:
    def __init__(
        self,
        db_type: str = Form(None),
        files: Optional[Union[UploadFile, List[UploadFile]]] = File(None),
        link_list: Optional[str] = Form(None),
        xx: str = Form(None),  # placeholder name for any further shared field
    ):
        self.db_type = db_type
        self.files = files
        self.link_list = link_list
        self.xx = xx

Redis class should be like:
class RedisDataprepRequest(DataprepRequest):
    def __init__(
        self,
        index_name: Optional[str] = Form(None),  # Redis-specific field
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.index_name = index_name

# Neo4jDataprepRequest is assumed to be defined like RedisDataprepRequest.
async def ingest_files(
    base: Annotated[Optional[DataprepRequest], Depends()] = None,
    redis: Annotated[Optional[RedisDataprepRequest], Depends()] = None,
    neo4j: Annotated[Optional[Neo4jDataprepRequest], Depends()] = None,
):
    # Use the first request object that was provided.
    request = redis or neo4j or base
    if request is None:
        raise HTTPException(400, detail="Invalid request")
    # DO STH...
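The Redis subclass above forwards the shared fields through `**kwargs`. A framework that reads a dependency's parameters from its call signature, as FastAPI does for `Depends()`, cannot see fields that are hidden that way. The sketch below uses only the standard library to show the difference between the two styles. The class names `RedisWithKwargs` and `RedisExplicit` and the helper `visible_fields` are hypothetical stand-ins, not code from this PR.

```python
import inspect
from typing import Optional


class DataprepRequest:
    """Stand-in base request holding two of the shared fields."""

    def __init__(self, files: Optional[list] = None, link_list: Optional[str] = None):
        self.files = files
        self.link_list = link_list


class RedisWithKwargs(DataprepRequest):
    """Hypothetical subclass that forwards the shared fields through **kwargs."""

    def __init__(self, index_name: Optional[str] = None, **kwargs):
        super().__init__(**kwargs)
        self.index_name = index_name


class RedisExplicit(DataprepRequest):
    """Hypothetical subclass that repeats the shared fields as named parameters."""

    def __init__(
        self,
        files: Optional[list] = None,
        link_list: Optional[str] = None,
        index_name: Optional[str] = None,
    ):
        super().__init__(files=files, link_list=link_list)
        self.index_name = index_name


def visible_fields(cls) -> list:
    """Return the named parameters that signature inspection can discover."""
    params = inspect.signature(cls).parameters.values()
    return [p.name for p in params if p.kind is not inspect.Parameter.VAR_KEYWORD]


print(visible_fields(RedisWithKwargs))
print(visible_fields(RedisExplicit))
```

Only the explicit variant exposes `files` and `link_list` by name, which may be why the commit history of this PR includes an attempt to replace `kwargs` with explicit parameters.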
|
Thanks for your reply. I'm confused why this is required for Dataprep if the Retriever supports the usage of the following:
@register_microservice(
    name="opea_service@retrievers",
    service_type=ServiceType.RETRIEVER,
    endpoint="/v1/retrieval",
    host="0.0.0.0",
    port=7000,
)
@register_statistics(names=["opea_service@retrievers"])
async def retrieve_docs(
    input: Union[EmbedDoc, EmbedMultimodalDoc, RetrievalRequest, ChatCompletionRequest]
) -> Union[SearchedDoc, SearchedMultimodalDoc, RetrievalResponse, ChatCompletionRequest]:
    ...  # function body not included in the comment
Can you please clarify?
|
@aMahanna from the links in @letonghan's answer, I think it is a parsing issue with the different request types. Because dataprep does file ingestion, it uses multipart/form-data, while the retriever uses application/json.
curl -X POST curl http://${your_ip}:7000/v1/retrieval
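To make the difference between the two content types concrete, here is a minimal standard-library sketch. It is not FastAPI's parser and not code from this PR. It encodes one text field as a multipart/form-data body, decodes it again, and shows that a JSON parser rejects that body. The boundary string, field names, values, and helper names are all hypothetical.

```python
import json
from email import message_from_bytes

BOUNDARY = "example-boundary"  # hypothetical boundary string


def build_multipart(fields: dict) -> bytes:
    """Encode text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{BOUNDARY}",
            f'Content-Disposition: form-data; name="{name}"',
            "",
            value,
        ]
    lines.append(f"--{BOUNDARY}--")
    return "\r\n".join(lines).encode() + b"\r\n"


def parse_multipart(body: bytes) -> dict:
    """Decode a multipart/form-data body back into its text fields."""
    header = f"Content-Type: multipart/form-data; boundary={BOUNDARY}\r\n\r\n".encode()
    message = message_from_bytes(header + body)
    return {
        part.get_param("name", header="content-disposition"): part.get_payload()
        for part in message.get_payload()
    }


def is_json(body: bytes) -> bool:
    """Report whether a JSON parser accepts the body."""
    try:
        json.loads(body)
        return True
    except ValueError:
        return False


form_body = build_multipart({"link_list": '["https://example.com"]'})
json_body = json.dumps({"text": "hello"}).encode()

print(parse_multipart(form_body))
print(is_json(form_body), is_json(json_body))
```

A handler that expects a JSON body therefore cannot validate a file-upload request against a model the same way, which matches the explanation given in the comment above.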
|
Ah, understood. Thank you @rbrugaro and @letonghan. Addressed here: 7eabcfa
Follow-up question to @letonghan:
This has been added, but what exactly would be the purpose of now having a i.e is
|
@aMahanna you are right, the
Signed-off-by: letonghan <letong.han@intel.com>
|
Thank you! I don't have permission to merge, so feel free to do so.
|
@aMahanna let me revert the merge. I just noticed Qdrant and Milvus didn't pass? I thought I saw everything passing.
No, everything passed on the second retry: https://github.com/opea-project/GenAIComps/actions/runs/14517733895. The failures you see are the results from the first round.
* new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: opea-project#1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: opea-project#1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. * Fix dataprep request class issue of Redis (#1) * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: opea-project#1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. 
* fix dataprep request class of redis Signed-off-by: letonghan <letong.han@intel.com> * revert change in redis.py Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: Anthony Mahanna <anthony.mahanna@arangodb.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Anthony Mahanna <43019056+aMahanna@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert: `DataprepRequest` for multimodal * revert: `DataprepRequest` for multimodal (PT2) * fix: conditionally fetch unique `DataprepRequest` attributes * fix bugs in dataprep util script Signed-off-by: letonghan <letong.han@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert change of pgvector Signed-off-by: letonghan <letong.han@intel.com> * fix indices bug for redis Signed-off-by: letonghan <letong.han@intel.com> * minor fix for redis Signed-off-by: letonghan <letong.han@intel.com> * ingest file into rag_redis_test Signed-off-by: letonghan <letong.han@intel.com> * update indice name Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Co-authored-by: letonghan <letong.han@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com>
* Fix image build issue (#1553) Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Unified default port number for the same service in text2graph and text2sql (#1554) Signed-off-by: Yao, Qing <qing.yao@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: `OpeaArangoDataprep` (#2) * new: `third_parties/arangodb` * new: `OpeaArangoDataprep` * cleanup * fix: `vllm` instead of `tgi` * fix: dataprep compsoe * cleanup Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: `OpeaArangoRetriever` (#3) * new: `OpeaArangoRetriever` * cleanup Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: deps Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix typo: `test_retrievers_arango.sh` Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * updated retriever-arango compose file Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * correction Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * add json-repair to dataprep-arango requirements Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Fix network error, change WORKPATH Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * extra time for health check retriever Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * extended retriever healthcheck 90secs Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * correction Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Update arangodb.py Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Removing hugging face token requirement from test file Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Update test_dataprep_arango with network tests and additional logs Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Running CI after docker rate limit Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Base case remove 
HF_token, no additional tests Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Adding VLLM check and logs, currently VLLM not working in CI/CD Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: compose.yaml Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * update: arangodb healthcheck Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: retriever test Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: typo Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * rem: unused vars Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: indent Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * temp: swap vllm healthcheck with sleep Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: typo Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: component name typo Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: support `EmbedDoc` for retriever Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: `getattr` Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: CURL command Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert 6061484 Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Update xtune file and change DDP paramter (#1552) Signed-off-by: jilongwa <jilong.wang@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * add N/A option (#1561) Signed-off-by: ZhangJianyu <zhang.jianyu@outlook.com> Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Test latest gaudi docker container (#1477) Update base gaudi container into the latest version, docker pull vault.habana.ai/gaudi-docker/1.20.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest, 
https://docs.habana.ai/en/latest/Installation_Guide/Additional_Installation/Docker_Installation.html#use-intel-gaudi-containers Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix audioqna male voice setting (#1559) Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * added error handling for lvm (#1556) Signed-off-by: okhleif-IL <omar.khleif@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * enable mysql db for sql agent (#1431) Signed-off-by: cheehook <chee.hoo.kok@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Enlarge DocSum prompt buffer (#1567) * Enlarge DocSum prompt buffer Follow PR #1471 Signed-off-by: XinyaoWa <xinyao.wang@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Update vLLM parameter max-seq-len-to-capture (#1565) Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: lint Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: missing import Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: healtcheck for dataprep-arangodb Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * update: arangodb readmes Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: test_dataprep_arango.sh Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: test_dataprep_arango.sh (PT2) Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: test_dataprep_arango.sh (PT3) Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * update: test_dataprep_arango.sh Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: whitespace Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Remove Transformers versions from requirements.txt file (#1547) * Remove Transformers 
versions from requirements.txt file Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Remove index_names from files for dataprep-get request (#1569) * remove index_names from files fot get request Signed-off-by: Mustafa <mustafa.cetin@intel.com> * update the tests Signed-off-by: Mustafa <mustafa.cetin@intel.com> * update the tests Signed-off-by: Mustafa <mustafa.cetin@intel.com> * update the tests Signed-off-by: Mustafa <mustafa.cetin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add validation check for 'all' as an index_name Signed-off-by: Mustafa <mustafa.cetin@intel.com> * fix for readme file Signed-off-by: Mustafa <mustafa.cetin@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mustafa <mustafa.cetin@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Upgrade Optimum Habana version to fix security check issue (#1571) Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Make llamaguard compatible with both TGI and vLLM (#1581) Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Fix Dockerfile error and add CI test for IPEX (#1585) * Fix Dockerfile error and add CI teat Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Reduce multilang tts docker image size (#1587) * fix audioqna male voice setting * reduce multilang tts docker image size Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * unset OPENAI_KEY in CI test (#1586) 
Signed-off-by: Rita Brugarolas <rita.brugarolas.brufau@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Add AWS Credentials for CD test (#1588) * Fix CD test issue Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * update: shorten ingest_dataprep.txt Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: a4d943e Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: `DataprepRequest` model (#1525) * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: #1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: #1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. 
* Fix dataprep request class issue of Redis (#1) * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: #1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. * fix dataprep request class of redis Signed-off-by: letonghan <letong.han@intel.com> * revert change in redis.py Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: Anthony Mahanna <anthony.mahanna@arangodb.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Anthony Mahanna <43019056+aMahanna@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert: `DataprepRequest` for multimodal * revert: `DataprepRequest` for multimodal (PT2) * fix: conditionally fetch unique `DataprepRequest` attributes * fix bugs in dataprep util script Signed-off-by: letonghan <letong.han@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert change of pgvector Signed-off-by: letonghan <letong.han@intel.com> * fix indices bug for redis Signed-off-by: letonghan <letong.han@intel.com> * minor fix for redis Signed-off-by: letonghan <letong.han@intel.com> * ingest file into rag_redis_test Signed-off-by: letonghan <letong.han@intel.com> * update indice name Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: 
letonghan <letong.han@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Co-authored-by: letonghan <letong.han@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: bc4445c Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: d17f6aa Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Revert "new: `DataprepRequest` model (#1525)" (#1592) This reverts commit 88947ab. Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * add hyperlinks Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: 4eb9ec4f Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: ArangoDBDataprepRequest Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: lint Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: delete_files Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * remove: env mutation Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: move openai key env var to top of file Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> --------- Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> Signed-off-by: Yao, Qing <qing.yao@intel.com> Signed-off-by: jilongwa <jilong.wang@intel.com> Signed-off-by: ZhangJianyu <zhang.jianyu@outlook.com> Signed-off-by: okhleif-IL <omar.khleif@intel.com> Signed-off-by: cheehook <chee.hoo.kok@intel.com> Signed-off-by: XinyaoWa <xinyao.wang@intel.com> Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Signed-off-by: Mustafa <mustafa.cetin@intel.com> Signed-off-by: Rita Brugarolas <rita.brugarolas.brufau@intel.com> Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: letonghan 
<letong.han@intel.com> Co-authored-by: chen, suyue <suyue.chen@intel.com> Co-authored-by: Yao Qing <Qing.Yao@intel.com> Co-authored-by: lasyasn <lasyan640@gmail.com> Co-authored-by: Ajay Kallepalli <ajay.r.kallepalli@gmail.com> Co-authored-by: jilongW <109333127+jilongW@users.noreply.github.com> Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com> Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com> Co-authored-by: Spycsh <39623753+Spycsh@users.noreply.github.com> Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Co-authored-by: Omar Khleif <omar.khleif@intel.com> Co-authored-by: cheehook <chee.hoo.kok@intel.com> Co-authored-by: XinyaoWa <xinyao.wang@intel.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Co-authored-by: Mustafa <109312699+MSCetin37@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: rbrugaro <rita.brugarolas.brufau@intel.com> Co-authored-by: ZePan110 <ze.pan@intel.com> Co-authored-by: letonghan <letong.han@intel.com>
* fix dataprep request class of redis Signed-off-by: letonghan <letong.han@intel.com> * revert change in redis.py Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: Anthony Mahanna <anthony.mahanna@arangodb.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Anthony Mahanna <43019056+aMahanna@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert: `DataprepRequest` for multimodal * revert: `DataprepRequest` for multimodal (PT2) * fix: conditionally fetch unique `DataprepRequest` attributes * fix bugs in dataprep util script Signed-off-by: letonghan <letong.han@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert change of pgvector Signed-off-by: letonghan <letong.han@intel.com> * fix indices bug for redis Signed-off-by: letonghan <letong.han@intel.com> * minor fix for redis Signed-off-by: letonghan <letong.han@intel.com> * ingest file into rag_redis_test Signed-off-by: letonghan <letong.han@intel.com> * update indice name Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Co-authored-by: letonghan <letong.han@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: bc4445c Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: d17f6aa Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * Revert "new: `DataprepRequest` model (opea-project#1525)" (opea-project#1592) This reverts commit 88947ab. 
Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * add hyperlinks Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * revert: 4eb9ec4f Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * new: ArangoDBDataprepRequest Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: lint Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * cleanup: delete_files Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * remove: env mutation Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> * fix: move openai key env var to top of file Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> --------- Signed-off-by: chensuyue <suyue.chen@intel.com> Signed-off-by: Anthony Mahanna <anthony.mahanna@arangodb.com> Signed-off-by: Yao, Qing <qing.yao@intel.com> Signed-off-by: jilongwa <jilong.wang@intel.com> Signed-off-by: ZhangJianyu <zhang.jianyu@outlook.com> Signed-off-by: okhleif-IL <omar.khleif@intel.com> Signed-off-by: cheehook <chee.hoo.kok@intel.com> Signed-off-by: XinyaoWa <xinyao.wang@intel.com> Signed-off-by: lvliang-intel <liang1.lv@intel.com> Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Signed-off-by: Mustafa <mustafa.cetin@intel.com> Signed-off-by: Rita Brugarolas <rita.brugarolas.brufau@intel.com> Signed-off-by: ZePan110 <ze.pan@intel.com> Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: chen, suyue <suyue.chen@intel.com> Co-authored-by: Yao Qing <Qing.Yao@intel.com> Co-authored-by: lasyasn <lasyan640@gmail.com> Co-authored-by: Ajay Kallepalli <ajay.r.kallepalli@gmail.com> Co-authored-by: jilongW <109333127+jilongW@users.noreply.github.com> Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com> Co-authored-by: ZhangJianyu <zhang.jianyu@outlook.com> Co-authored-by: Spycsh <39623753+Spycsh@users.noreply.github.com> Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Co-authored-by: Omar Khleif <omar.khleif@intel.com> 
Co-authored-by: cheehook <chee.hoo.kok@intel.com> Co-authored-by: XinyaoWa <xinyao.wang@intel.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Co-authored-by: Mustafa <109312699+MSCetin37@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: rbrugaro <rita.brugarolas.brufau@intel.com> Co-authored-by: ZePan110 <ze.pan@intel.com> Co-authored-by: letonghan <letong.han@intel.com> Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
* new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: opea-project#1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: opea-project#1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. * Fix dataprep request class issue of Redis (opea-project#1) * new: `DataprepRequest` * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: docstrings * rem: `ingest_from_graphDB` * new: dep injection * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix: verbose `input` processing * attempt: replace `kwargs` with params * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rem: `db_type` ref: opea-project#1525 (comment) * attempt: require `base` * Revert "attempt: require `base`" This reverts commit 620ca6b. 
* fix dataprep request class of redis Signed-off-by: letonghan <letong.han@intel.com> * revert change in redis.py Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: Anthony Mahanna <anthony.mahanna@arangodb.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Anthony Mahanna <43019056+aMahanna@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert: `DataprepRequest` for multimodal * revert: `DataprepRequest` for multimodal (PT2) * fix: conditionally fetch unique `DataprepRequest` attributes * fix bugs in dataprep util script Signed-off-by: letonghan <letong.han@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert change of pgvector Signed-off-by: letonghan <letong.han@intel.com> * fix indices bug for redis Signed-off-by: letonghan <letong.han@intel.com> * minor fix for redis Signed-off-by: letonghan <letong.han@intel.com> * ingest file into rag_redis_test Signed-off-by: letonghan <letong.han@intel.com> * update indice name Signed-off-by: letonghan <letong.han@intel.com> --------- Signed-off-by: letonghan <letong.han@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Liang Lv <liang1.lv@intel.com> Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com> Co-authored-by: letonghan <letong.han@intel.com>

Description
Introduces the ability to customize Dataprep input parameters by subclassing the
`DataprepRequest` pydantic model. This avoids having to introduce parameters unique to one or two Dataprep integrations across all Dataprep providers.
Similar discussion & solution here: #1466 (comment)
Feedback welcomed.
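As a rough illustration of the subclassing idea described above, here is a minimal sketch in plain Python. It is not the code from this PR: the real classes are declared with FastAPI `Form(...)`/`File(...)` defaults, whereas this sketch uses ordinary keyword arguments so it runs with the standard library alone. The `ingest` helper and the field values are hypothetical, and `files` is stood in for by a list of file names.

```python
from typing import List, Optional


class DataprepRequest:
    """Base request: parameters shared by every Dataprep provider (illustrative subset)."""

    def __init__(
        self,
        files: Optional[List[str]] = None,  # stand-in for uploaded files (assumption)
        link_list: Optional[str] = None,
    ):
        self.files = files
        self.link_list = link_list


class RedisDataprepRequest(DataprepRequest):
    """Provider-specific request: adds a parameter only the Redis integration needs."""

    def __init__(self, index_name: Optional[str] = None, **kwargs):
        super().__init__(**kwargs)
        self.index_name = index_name


def ingest(request: DataprepRequest) -> dict:
    """Hypothetical handler: reads shared fields directly and
    provider-specific fields only when the request carries them."""
    summary = {"files": request.files, "link_list": request.link_list}
    # getattr with a default keeps the handler usable with the base class too.
    index_name = getattr(request, "index_name", None)
    if index_name is not None:
        summary["index_name"] = index_name
    return summary


if __name__ == "__main__":
    base = DataprepRequest(link_list='["https://example.com"]')
    redis = RedisDataprepRequest(index_name="rag_redis_test", files=["a.pdf"])
    print(ingest(base))
    print(ingest(redis))
```

With this shape, a parameter such as `index_name` lives only on the provider's subclass, and the shared handler code does not need to change when another provider adds its own fields.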
Issues
#1516
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
N/A
Tests
N/A