spaCy 1.7 model downloads consume large amounts of memory #918

@schmod

Description

When downloading large models such as en_core_web_md, spaCy's built-in downloader appears to consume very large quantities of memory (often greatly exceeding the size of the model itself). On less-powerful systems, this can result in a failure if the process consumes all available memory.

This issue can be replicated by running python -m spacy download en_core_web_md and watching the process's memory usage while the download runs.
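One way to put a number on the memory usage is to run the download as a child process and read its peak resident set size afterwards. The following is only a rough sketch, assuming a Unix-like system and Python 3; the helper name `peak_child_rss` is hypothetical and not part of spaCy. Note that `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS.

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd):
    """Run cmd to completion and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for RUSAGE_CHILDREN, i.e. the largest peak
    resident set size among child processes that have been waited for
    so far (kilobytes on Linux, bytes on macOS).
    """
    exit_code = subprocess.call(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return exit_code, usage.ru_maxrss


# Example call (not executed here, since it would start a real download):
# peak_child_rss([sys.executable, "-m", "spacy", "download", "en_core_web_md"])
```

Comparing the reported peak against the size of the model archive gives a quick indication of whether the whole download is being held in memory.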

This problem does not occur when installing models via pip or a direct download (presumably because those processes stream the downloaded file directly to disk).
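For illustration, streaming a download to disk in fixed-size chunks keeps memory usage bounded by the chunk size rather than by the size of the file. This is a minimal sketch using only the Python 3 standard library; it is not spaCy's downloader, and the names `copy_stream`, `download_to_disk`, and `CHUNK_SIZE` are hypothetical.

```python
import urllib.request

CHUNK_SIZE = 64 * 1024  # bytes read per iteration (an arbitrary choice)


def copy_stream(src, dest_path, chunk_size=CHUNK_SIZE):
    """Copy a readable binary stream to dest_path one chunk at a time.

    Only one chunk is held in memory at any moment. Returns the number
    of bytes written.
    """
    written = 0
    with open(dest_path, "wb") as dest:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty bytes object signals end of stream
                break
            dest.write(chunk)
            written += len(chunk)
    return written


def download_to_disk(url, dest_path):
    """Fetch url and stream the response body straight to dest_path."""
    # The response object is file-like, so the body can be read in chunks.
    with urllib.request.urlopen(url) as response:
        return copy_stream(response, dest_path)
```

With this approach a model archive of several hundred megabytes would need roughly one chunk of buffer space, whereas reading the full response into memory before writing requires at least the size of the archive.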

Your Environment

  • Operating System: Debian Jessie
  • Python Version Used: 2.7
  • spaCy Version Used: 1.7.2
  • Environment Information: Docker

Labels: install (Installation issues), models (Issues related to the statistical models)
