Conversation
|
Sine works -- let's finalize/merge/release. Ideally: could you add a test which would operate in multiple threads over some limited set of files , so we would double test that all works good? (or it would need also fsspec/filesystem_spec#1111 to work?) |
Codecov ReportBase: 70.69% // Head: 71.06% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #83 +/- ##
==========================================
+ Coverage 70.69% 71.06% +0.36%
==========================================
Files 11 11
Lines 761 788 +27
==========================================
+ Hits 538 560 +22
- Misses 223 228 +5
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
|
@yarikoptic Test added and passes in at least one run. |
and seems to fail on another :-/ but seems reminiscent of that issue addressed by atomic writing at fsspec level=================================== FAILURES ===================================
251
_____________________________ test_parallel_access _____________________________
252
253
tmp_path = PosixPath('/tmp/pytest-of-runner/pytest-0/test_parallel_access0')
254
big_url_dataset = (Dataset('/tmp/pytest-of-runner/pytest-0/big_url_dataset0/ds'), {'APL.pdf': 'c65ccc2a97cdb6042641847112dc6e4d4d6e75fde...ff9197', 'libpython3.10-stdlib_3.10.4-3_i386.deb': 'e79c1416ec792b61ad9770f855bf6889e57be5f6511ea814d81ef5f9b1a3eec9'})
255
256
def test_parallel_access(tmp_path, big_url_dataset):
257
ds, data_files = big_url_dataset
258
with fusing(ds.path, tmp_path) as mount:
259
with ThreadPoolExecutor() as pool:
260
futures = {
261
pool.submit(sha256_file, mount / path): dgst
262
for path, dgst in data_files.items()
263
}
264
for fut in as_completed(futures.keys()):
265
> assert fut.result() == futures[fut]
266
267
datalad_fuse/tests/test_fuse.py:247:
268
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
269
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/concurrent/futures/_base.py:425: in result
270
return self.__get_result()
271
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/concurrent/futures/_base.py:384: in __get_result
272
raise self._exception
273
/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/concurrent/futures/thread.py:56: in run
274
result = self.fn(*self.args, **self.kwargs)
275
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
276
277
path = PosixPath('/tmp/pytest-of-runner/pytest-0/test_parallel_access0/gameboy.pdf')
278
279
def sha256_file(path):
280
dgst = hashlib.sha256()
281
> with open(path, "rb") as fp:
282
E OSError: [Errno 34] Numerical result out of range: '/tmp/pytest-of-runner/pytest-0/test_parallel_access0/gameboy.pdf'
283
284
datalad_fuse/tests/test_fuse.py:252: OSError
285
---------------------------- Captured stderr setup -----------------------------
286
[INFO] Creating a new annex repo at /tmp/pytest-of-runner/pytest-0/big_url_dataset0/ds
287
------------------------------ Captured log setup ------------------------------
288
INFO datalad.core.local.create:create.py:494 Creating a new annex repo at /tmp/pytest-of-runner/pytest-0/big_url_dataset0/ds
289
----------------------------- Captured stdout call -----------------------------
290
fusefs(ok): /tmp/pytest-of-runner/pytest-0/test_parallel_access0
291
----------------------------- Captured stderr call -----------------------------
292
fuse: bad error value: 1
293
/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/requests/__init__.py:104: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (5.0.0)/charset_normalizer (2.0.12) doesn't match a supported version!
294
RequestsDependencyWarning)
295
Traceback (most recent call last):
296
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/fuse.py", line 734, in _wrapper
297
return func(*args, **kwargs) or 0
298
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/fuse.py", line 835, in open
299
fi.flags)
300
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/datalad_fuse/fuse_.py", line 75, in __call__
301
return super(DataLadFUSE, self).__call__(op, self.root + path, *args)
302
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/fuse.py", line 1076, in __call__
303
return getattr(self, op)(*args)
304
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/datalad_fuse/fuse_.py", line 206, in open
305
fsspec_file = self._adapter.open(path)
306
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/datalad_fuse/fsspec.py", line 243, in open
307
return dsap.open(relpath, mode=mode, encoding=encoding, errors=errors)
308
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/datalad_fuse/fsspec.py", line 183, in open
309
raise IOError(f"Could not find a usable URL for {relpath}")
310
OSError: Could not find a usable URL for gameboy.pdf
311
312
During handling of the above exception, another exception occurred:
313
314
Traceback (most recent call last):
315
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/fuse.py", line 737, in _wrapper
316
if e.errno > 0:
317
TypeError: '>' not supported between instances of 'NoneType' and 'int'
318
319
During handling of the above exception, another exception occurred:
320
321
Traceback (most recent call last):
322
File "_ctypes/callbacks.c", line 234, in 'calling callback function'
323
File "/home/runner/work/datalad-fuse/datalad-fuse/.tox/py3/lib/python3.6/site-packages/fuse.py", line 756, in _wrapper
324
self.__critical_exception = e
325
NameError: name 'self' is not defined
326
Warning: Destroying fsspecs and collection of 0 fhs |
|
@yarikoptic The root error seems to be that one of the file URLs temporarily failed, and that triggered a bug in fusepy. It has nothing to do with fsspec's atomicity. |
| # bother???! | ||
| lgr.debug("File not already open") | ||
| fsspec_file = self._adapter.open(path) | ||
| with self.rwlock: |
There was a problem hiding this comment.
can we move locking down into _adapter and make it repository specific since we have caches per repository?! then outside code would be able to parallelize across repositories.
There was a problem hiding this comment.
@yarikoptic No, as we also need to lock whenever using a pseudo-filehandle returned from _adapter.
There was a problem hiding this comment.
then we might need a helper to return the appropriate (repository specific) lock for a given path to be used.
I've reran the failing test ... it passed, let's proceed. Filed dedicated #87 for a possible improvement |
No description provided.