@sd3ntato sd3ntato commented May 14, 2024

Together with some fixes for the dependency hell, I added a Docker Compose config.

sd3ntato added 3 commits May 14, 2024 12:26
After the previous commit, I ran:
```
python3.8 -m venv .venv38
source .venv38/bin/activate
pip install .
excalibur initdb
excalibur webserver
```

and got the app running, but then when I uploaded a PDF I got this error:
```
ERROR:root:PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.
Traceback (most recent call last):
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/excalibur/tasks.py", line 22, in split
    extract_pages, total_pages = get_pages(file.filepath, file.pages)
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/excalibur/utils/task.py", line 29, in get_pages
    infile = PdfFileReader(inputstream, strict=False)
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/PyPDF2/_reader.py", line 1974, in __init__
    deprecation_with_replacement("PdfFileReader", "PdfReader", "3.0.0")
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.
```
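
The traceback shows `excalibur/utils/task.py` still calling `PdfFileReader`, which PyPDF2 3.0.0 rejects. A minimal sketch of a compatibility shim that prefers the new `PdfReader` name and falls back to the legacy one is below. `pick_reader_class` and `load_reader_class` are hypothetical helpers, not part of excalibur or PyPDF2, and the stand-in modules exist only to show the selection logic.

```python
import importlib
from types import SimpleNamespace

# New class name first, legacy name second.
READER_NAMES = ("PdfReader", "PdfFileReader")


def pick_reader_class(pdf_module):
    """Return the reader class that pdf_module exposes, preferring the new name."""
    for name in READER_NAMES:
        reader = getattr(pdf_module, name, None)
        if reader is not None:
            return reader
    raise ImportError(f"{pdf_module!r} exposes none of {READER_NAMES}")


def load_reader_class(module_names=("pypdf", "PyPDF2")):
    """Import the first installed PDF module and return its reader class."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # not installed, try the next candidate
        return pick_reader_class(module)
    raise ImportError(f"none of {module_names} is installed")


# Stand-ins for a newer and an older PDF module (assumptions for illustration).
class NewReader: ...
class LegacyReader: ...

modern = SimpleNamespace(PdfReader=NewReader, PdfFileReader=LegacyReader)
legacy = SimpleNamespace(PdfFileReader=LegacyReader)

print(pick_reader_class(modern).__name__)
print(pick_reader_class(legacy).__name__)
```

With something like this in place, the call site could use `load_reader_class()(inputstream, strict=False)` instead of naming `PdfFileReader` directly, although that change is not part of this PR.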

So I pinned PyPDF2==2.0, and the app now manages to read PDFs and detect tables, but when I try to "view and download data" I get this:

```
ERROR:root:to_excel() got an unexpected keyword argument 'encoding'
Traceback (most recent call last):
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/excalibur/tasks.py", line 126, in extract
    tables.export(f_datapath, f=f, compress=True)
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/camelot/core.py", line 736, in export
    table.df.to_excel(writer, sheet_name=sheet_name, encoding="utf-8")
TypeError: to_excel() got an unexpected keyword argument 'encoding'
```
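
The failing call lives in `camelot/core.py`, which passes `encoding="utf-8"` to a `to_excel()` that no longer accepts it (as far as I know, pandas removed that argument in 2.0). Pinning an older pandas is one way out. Another, sketched below under stated assumptions, is to drop keyword arguments the target function does not accept. `drop_unsupported_kwargs` is a hypothetical helper, and `to_excel_old` / `to_excel_new` are stand-ins, not the real pandas methods.

```python
import inspect


def drop_unsupported_kwargs(func, kwargs):
    """Return only the keyword arguments that func's signature accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means func accepts any keyword, so keep everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return {
        key: value
        for key, value in kwargs.items()
        if key in params and params[key].kind in by_keyword
    }


# Stand-ins mimicking the old and new signatures (assumptions for illustration).
def to_excel_old(writer, sheet_name="Sheet1", encoding=None):
    return (writer, sheet_name, encoding)


def to_excel_new(writer, sheet_name="Sheet1"):
    return (writer, sheet_name)


wanted = {"sheet_name": "page-1-table-1", "encoding": "utf-8"}
print(to_excel_old("w", **drop_unsupported_kwargs(to_excel_old, wanted)))
print(to_excel_new("w", **drop_unsupported_kwargs(to_excel_new, wanted)))
```

Since the call is inside camelot rather than excalibur, a filter like this would have to be applied upstream or through a patch, so a version pin may be the more practical fix here.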
You can now run `docker compose up --build` and access http://localhost:5001

kuirolo commented Jun 25, 2024

Combining this with #169 gets the local server working for me (mostly). I didn't test the changes to Docker, though.

I noticed that camelot now uses pypdf instead of PyPDF2. Seems like the next dependency hell is getting both modules on the same PDF stack.
