
Fix embedding issue with ArangoDB due to deprecated HuggingFace API #1694

Merged
lvliang-intel merged 4 commits into opea-project:main from lvliang-intel:fix_dataprep_arango
May 14, 2025


@lvliang-intel
Collaborator

Description

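Fix the embedding failure in the ArangoDB dataprep integration caused by the deprecated HuggingFaceHubEmbeddings class. With the deprecated class, ingestion fails with "'InferenceClient' object has no attribute 'post'", as the service log shows: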

/home/user/comps/dataprep/src/integrations/arangodb.py:203: LangChainDeprecationWarning: The class HuggingFaceHubEmbeddings was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the :class:~langchain-huggingface package and should be used instead. To use it run pip install -U :class:~langchain-huggingface and import as from :class:~langchain_huggingface import HuggingFaceEndpointEmbeddings``.
self.embeddings = HuggingFaceHubEmbeddings(
[2025-05-13 15:04:59,871] [ INFO] - Base service - CORS is enabled.
[2025-05-13 15:04:59,872] [ INFO] - Base service - Setting up HTTP server
[2025-05-13 15:04:59,873] [ INFO] - Base service - Uvicorn server setup on port 5000
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
[2025-05-13 15:04:59,883] [ INFO] - Base service - HTTP server setup successful
[2025-05-13 15:04:59,885] [ INFO] - opea_dataprep_microservice - OPEA Dataprep Microservice is starting...
[2025-05-13 15:05:40,686] [ INFO] - opea_dataprep_microservice - [ ingest ] Base mode
[2025-05-13 15:21:10,160] [ ERROR] - opea_dataprep_microservice - Error during dataprep ingest invocation: 500: Failed to ingest ./uploaded_files/ingest_dataprep.txt into ArangoDB: 'InferenceClient' object has no attribute 'post'

  • exit 1
    Error: Process completed with exit code 1.
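
For illustration only, the migration recommended by the deprecation warning could look roughly like the sketch below. It is not taken from this PR's diff. The environment variable name `TEI_EMBEDDING_ENDPOINT` and the helper names `resolve_endpoint` and `build_embeddings` are assumptions made for the example. Only `HuggingFaceEndpointEmbeddings` from the `langchain-huggingface` package is named in the log above.

```python
import os
from typing import Mapping, Optional


def resolve_endpoint(env: Mapping[str, str] = os.environ) -> str:
    """Read the embedding endpoint URL from the environment.

    The variable name TEI_EMBEDDING_ENDPOINT is an assumption for this sketch.
    """
    url = env.get("TEI_EMBEDDING_ENDPOINT", "").strip()
    if not url:
        raise ValueError("TEI_EMBEDDING_ENDPOINT is not set")
    # Drop a trailing slash so the URL has one canonical form.
    return url.rstrip("/")


def build_embeddings(endpoint_url: str, token: Optional[str] = None):
    """Create the replacement embeddings client (hypothetical helper).

    Uses HuggingFaceEndpointEmbeddings instead of the deprecated
    HuggingFaceHubEmbeddings, as the deprecation warning suggests.
    """
    # Imported lazily so this module loads even when the
    # langchain-huggingface package is not installed.
    from langchain_huggingface import HuggingFaceEndpointEmbeddings

    return HuggingFaceEndpointEmbeddings(
        model=endpoint_url,              # URL of the embedding endpoint
        huggingfacehub_api_token=token,  # optional; None for local endpoints
    )
```

A caller would then assign `build_embeddings(resolve_endpoint())` where the deprecated class was instantiated before. Whether the merged change is structured this way is not confirmed by this page.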

Issues

n/a

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

None

Tests

Local test and CI test

@lvliang-intel lvliang-intel merged commit d5db882 into opea-project:main May 14, 2025
38 checks passed
alexsin368 pushed a commit to alexsin368/GenAIComps that referenced this pull request May 15, 2025
yinghu5 added a commit that referenced this pull request May 16, 2025
* add support for remote server

Signed-off-by: alexsin368 <[email protected]>

* add steps to enable remote server

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove use_remote_service

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add OpenAI models instructions, fix format of commands

Signed-off-by: alexsin368 <[email protected]>

* simplify ChatOpenAI instantiation

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "simplify ChatOpenAI instantiation"

This reverts commit b7c4acf.

* add back check and logic for llm_engine, set openai_key argument

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Provide ARCH option for lvm-video-llama image build (#1630)

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Add sglang microservice for supporting llama4 model (#1640)

Signed-off-by: Ye, Xinyu <[email protected]>
Co-authored-by: Lv,Liang1 <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Remove invalid codeowner. (#1642)

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* add support for remote server

Signed-off-by: alexsin368 <[email protected]>

* add steps to enable remote server

Signed-off-by: alexsin368 <[email protected]>

* remove use_remote_service

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: alexsin368 <[email protected]>

* bug fix for chunk_size and overlap cause error in dataprep ingestion (#1643)

* bug fix for dataingest url

Signed-off-by: Mustafa <[email protected]>

* add validation function

Signed-off-by: Mustafa <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* validation update

Signed-off-by: Mustafa <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update validation function

Signed-off-by: Mustafa <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mustafa <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: alexsin368 <[email protected]>

* MariaDB Vector integrations for retriever & dataprep services (#1645)

* Add MariaDB Vector third-party service

MariaDB Vector was introduced since MariaDB Server 11.7

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add retriever MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add dataprep MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CI failures

- md5 is used for the primary key not as a security hash
- fixed mariadb readme headers

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

---------

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: alexsin368 <[email protected]>

* update PR reviewers (#1651)

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Expand test matrix, find all tests use 3rd party Dockerfiles (#1676)

* Expand test matrix, find all tests use 3rd party Dockerfiles

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* fix the typo of README.md Comp (#1679)

Update README.md for first entry of OPEA

Signed-off-by: alexsin368 <[email protected]>

* Fix request handle timeout issue (#1687)

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* FEAT: Enable OPEA microservices to start as MCP servers (#1635)

Signed-off-by: alexsin368 <[email protected]>

* Fix huggingface_hub API upgrade issue (#1691)

* Fix huggingfacehub API upgrade issue

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* add OpenAI models instructions, fix format of commands

Signed-off-by: alexsin368 <[email protected]>

* Fix dataprep opensearch ingest issue (#1697)

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Fix embedding issue with ArangoDB due to deprecated HuggingFace API (#1694)

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* simplify ChatOpenAI instantiation

Signed-off-by: alexsin368 <[email protected]>

* Revert "simplify ChatOpenAI instantiation"

This reverts commit b7c4acf.

Signed-off-by: alexsin368 <[email protected]>

* add back check and logic for llm_engine, set openai_key argument

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: alexsin368 <[email protected]>
Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ye, Xinyu <[email protected]>
Signed-off-by: Mustafa <[email protected]>
Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ying Hu <[email protected]>
Co-authored-by: ZePan110 <[email protected]>
Co-authored-by: Liang Lv <[email protected]>
Co-authored-by: Mustafa <[email protected]>
Co-authored-by: Razvan Liviu Varzaru <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Spycsh <[email protected]>

3 participants