No success building custom multilang Presidio Docker image #1661

@bfisseler

Describe the bug
I tried to modify the presidio-analyzer Docker image so that Presidio supports additional languages, e.g. German.

To Reproduce
Steps to reproduce the behavior:

  1. Clone the repo.
  2. Add additional language codes to presidio_analyzer/conf/default.yaml, e.g.

       -
         lang_code: de
         model_name: de_core_news_md

  3. Add the supported languages to default_analyzer.yaml as well as default_recognizers.yaml, e.g.

       - en
       - de

  4. Build the presidio-analyzer Docker image: docker build -t multilang_presidio .
  5. The build process breaks after ~20 minutes with the following error:
...
27.60   - Installing shellingham (1.5.4)
1492.1 
1493.4 PEP517 build of a dependency failed
1493.4 
1493.4 Backend subprocess exited when trying to invoke build_wheel
...[followed by many more lines of Python gibberish]
1496.9 Note: This error originates from the build backend, and is likely not a problem with poetry but one of the following issues with numpy (2.0.2)
1496.9 
1496.9   - not supporting PEP 517 builds
1496.9   - not specifying PEP 517 build requirements correctly
1496.9   - the build requirements are incompatible with your operating system or Python version
1496.9   - the build requirements are missing system dependencies (eg: compilers, libraries, headers).
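
Since each failed build costs about 20 minutes, the configuration edits from steps 2 and 3 could be sanity-checked before building. The following is only a minimal sketch, not part of Presidio: it assumes the NLP configuration lists its models under a `models` key and the analyzer configuration lists languages under a `supported_languages` key, and it works on already-parsed dictionaries. The function name `missing_languages`, the sample dictionaries, and the English model name `en_core_web_lg` are hypothetical placeholders for illustration.

```python
def missing_languages(nlp_conf: dict, analyzer_conf: dict) -> list[str]:
    """Return supported languages that have no NLP model configured.

    Assumes (not confirmed by the issue) that models live under a
    "models" key and languages under a "supported_languages" key.
    """
    configured = {model["lang_code"] for model in nlp_conf.get("models", [])}
    return [
        lang
        for lang in analyzer_conf.get("supported_languages", [])
        if lang not in configured
    ]


# Hypothetical parsed contents of the NLP configuration (step 2).
nlp_conf = {
    "models": [
        {"lang_code": "en", "model_name": "en_core_web_lg"},  # assumed default
        {"lang_code": "de", "model_name": "de_core_news_md"},
    ],
}

# Hypothetical parsed contents of the analyzer configuration (step 3).
analyzer_conf = {"supported_languages": ["en", "de"]}

# An empty list means every supported language has a model entry.
print(missing_languages(nlp_conf, analyzer_conf))
```

A check like this only covers consistency between the edited files; it says nothing about the numpy build failure itself.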

Do I need a specific version of numpy? And if so, how do I specify it in the Dockerfile?

Expected behavior
I expected the build to produce a new, customized Docker image.

Additional context
I tried to build presidio-analyzer on macOS Sequoia.
