Commit e6bf8e0

Allow latest pyarrow version
1 parent: 5ba1497

2 files changed: 4 additions, 3 deletions
.circleci/config.yml

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@ jobs:
  - run: source venv/bin/activate
  - run: pip install .[tests]
  - run: pip install -r additional-tests-requirements.txt --no-deps
- - run: pip install pyarrow==3.0.0
+ - run: pip install pyarrow --upgrade
  - run: HF_SCRIPTS_VERSION=master python -m pytest -sv ./tests/

  run_dataset_script_tests_pyarrow_1:
@@ -46,7 +46,7 @@ jobs:
  - run: "& venv/Scripts/activate.ps1"
  - run: pip install .[tests]
  - run: pip install -r additional-tests-requirements.txt --no-deps
- - run: pip install pyarrow==3.0.0
+ - run: pip install pyarrow --upgrade
  - run: $env:HF_SCRIPTS_VERSION="master"
  - run: python -m pytest -sv ./tests/

setup.py

Lines changed: 2 additions & 1 deletion
@@ -74,7 +74,8 @@
  "numpy>=1.17",
  # Backend and serialization.
  # Minimum 1.0.0 to avoid permission errors on windows when using the compute layer on memory mapped data
- "pyarrow>=1.0.0,<4.0.0",
+ # pyarrow 4.0.0 introduced segfault bug, see: https://github.com/huggingface/datasets/pull/2268
+ "pyarrow>=1.0.0,!=4.0.0",
  # For smart caching dataset processing
  "dill",
  # For performance gains with apache arrow
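
The effect of swapping the `<4.0.0` upper bound for a `!=4.0.0` exclusion in setup.py can be sketched in plain Python. This is only an illustration: the helper names (`parse_release`, `allowed_by_old_pin`, `allowed_by_new_pin`) are hypothetical and not part of this commit, and the parsing handles only plain numeric release strings. Real resolution by pip follows the full PEP 440 rules (pre-releases, post-releases, and so on).

```python
def parse_release(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' release string into a zero-padded tuple of ints.

    Simplification (assumption): only numeric, dot-separated releases are
    supported; suffixes such as 'rc1' or '.post1' are not handled.
    """
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))  # '4.0' is treated the same as '4.0.0'
    return tuple(parts)


def allowed_by_old_pin(version: str) -> bool:
    """Approximates the old requirement: pyarrow>=1.0.0,<4.0.0"""
    return (1, 0, 0) <= parse_release(version) < (4, 0, 0)


def allowed_by_new_pin(version: str) -> bool:
    """Approximates the new requirement: pyarrow>=1.0.0,!=4.0.0"""
    release = parse_release(version)
    return release >= (1, 0, 0) and release != (4, 0, 0)


if __name__ == "__main__":
    # Compare both pins across a few candidate versions.
    for candidate in ["0.17.1", "1.0.0", "3.0.0", "4.0.0", "4.0.1", "5.0.0"]:
        print(candidate, allowed_by_old_pin(candidate), allowed_by_new_pin(candidate))
```

Under these simplified rules only the exact 4.0.0 release stays excluded, while later releases such as 4.0.1 become installable. That is consistent with the CI change in this commit from `pip install pyarrow==3.0.0` to `pip install pyarrow --upgrade`.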
