Skip to content

Commit e21340d

Browse files
stas00sgugger
andauthored
[testing utils] get_auto_remove_tmp_dir more intuitive behavior (#8401)
* [testing utils] get_auto_remove_tmp_dir default change Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal. 99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug. * simplify things * update docs * fix doc layout * style * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * better 3-state doc * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * s/tmp/temporary/ + style * correct the statement Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
1 parent e7e1549 commit e21340d

2 files changed

Lines changed: 83 additions & 48 deletions

File tree

docs/source/testing.rst

Lines changed: 20 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -716,11 +716,11 @@ Temporary files and directories
716716
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
717717

718718
Using unique temporary files and directories are essential for parallel test running, so that the tests won't overwrite
719-
each other's data. Also we want to get the temp files and directories removed at the end of each test that created
719+
each other's data. Also we want to get the temporary files and directories removed at the end of each test that created
720720
them. Therefore, using packages like ``tempfile``, which address these needs is essential.
721721

722-
However, when debugging tests, you need to be able to see what goes into the temp file or directory and you want to
723-
know it's exact path and not having it randomized on every test re-run.
722+
However, when debugging tests, you need to be able to see what goes into the temporary file or directory and you want
723+
to know it's exact path and not having it randomized on every test re-run.
724724

725725
A helper class :obj:`transformers.test_utils.TestCasePlus` is best used for such purposes. It's a sub-class of
726726
:obj:`unittest.TestCase`, so we can easily inherit from it in the test modules.
@@ -736,32 +736,33 @@ Here is an example of its usage:
736736
737737
This code creates a unique temporary directory, and sets :obj:`tmp_dir` to its location.
738738

739-
In this and all the following scenarios the temporary directory will be auto-removed at the end of test, unless
740-
``after=False`` is passed to the helper function.
741-
742-
* Create a temporary directory of my choice and delete it at the end - useful for debugging when you want to monitor a
743-
specific directory:
739+
* Create a unique temporary dir:
744740

745741
.. code-block:: python
746742
747743
def test_whatever(self):
748-
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")
744+
tmp_dir = self.get_auto_remove_tmp_dir()
745+
746+
``tmp_dir`` will contain the path to the created temporary dir. It will be automatically removed at the end of the
747+
test.
749748

750-
* Create a temporary directory of my choice and do not delete it at the end---useful for when you want to look at the
751-
temp results:
749+
* Create a temporary dir of my choice, ensure it's empty before the test starts and don't empty it after the test.
752750

753751
.. code-block:: python
754752
755753
def test_whatever(self):
756-
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)
754+
tmp_dir = self.get_auto_remove_tmp_dir("./xxx")
757755
758-
* Create a temporary directory of my choice and ensure to delete it right away---useful for when you disabled deletion
759-
in the previous test run and want to make sure the that temporary directory is empty before the new test is run:
756+
This is useful for debug when you want to monitor a specific directory and want to make sure the previous tests didn't
757+
leave any data in there.
760758

761-
.. code-block:: python
759+
* You can override the default behavior by directly overriding the ``before`` and ``after`` args, leading to one of the
760+
following behaviors:
762761

763-
def test_whatever(self):
764-
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)
762+
- ``before=True``: the temporary dir will always be cleared at the beginning of the test.
763+
- ``before=False``: if the temporary dir already existed, any existing files will remain there.
764+
- ``after=True``: the temporary dir will always be deleted at the end of the test.
765+
- ``after=False``: the temporary dir will always be left intact at the end of the test.
765766

766767
.. note::
767768
In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are allowed if
@@ -815,7 +816,7 @@ or the ``xfail`` way:
815816
@pytest.mark.xfail
816817
def test_feature_x():
817818
818-
Here is how to skip a test based on some internal check inside the test:
819+
- Here is how to skip a test based on some internal check inside the test:
819820

820821
.. code-block:: python
821822
@@ -838,7 +839,7 @@ or the ``xfail`` way:
838839
def test_feature_x():
839840
pytest.xfail("expected to fail until bug XYZ is fixed")
840841
841-
Here is how to skip all tests in a module if some import is missing:
842+
- Here is how to skip all tests in a module if some import is missing:
842843

843844
.. code-block:: python
844845

src/transformers/testing_utils.py

Lines changed: 63 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -522,45 +522,47 @@ class solves this problem by sorting out all the basic paths and provides easy a
522522
- ``repo_root_dir_str``
523523
- ``src_dir_str``
524524
525-
Feature 2: Flexible auto-removable temp dirs which are guaranteed to get removed at the end of test.
525+
Feature 2: Flexible auto-removable temporary dirs which are guaranteed to get removed at the end of test.
526526
527-
In all the following scenarios the temp dir will be auto-removed at the end of test, unless `after=False`.
528-
529-
# 1. create a unique temp dir, `tmp_dir` will contain the path to the created temp dir
527+
1. Create a unique temporary dir:
530528
531529
::
532530
533531
def test_whatever(self):
534532
tmp_dir = self.get_auto_remove_tmp_dir()
535533
536-
# 2. create a temp dir of my choice and delete it at the end - useful for debug when you want to # monitor a
537-
specific directory
534+
``tmp_dir`` will contain the path to the created temporary dir. It will be automatically removed at the end of the
535+
test.
536+
537+
538+
2. Create a temporary dir of my choice, ensure it's empty before the test starts and don't
539+
empty it after the test.
538540
539541
::
540542
541543
def test_whatever(self):
542-
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")
544+
tmp_dir = self.get_auto_remove_tmp_dir("./xxx")
543545
544-
# 3. create a temp dir of my choice and do not delete it at the end - useful for when you want # to look at the
545-
temp results
546+
This is useful for debug when you want to monitor a specific directory and want to make sure the previous tests
547+
didn't leave any data in there.
546548
547-
::
548-
def test_whatever(self):
549-
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)
549+
3. You can override the first two options by directly overriding the ``before`` and ``after`` args, leading to the
550+
following behavior:
550551
551-
# 4. create a temp dir of my choice and ensure to delete it right away - useful for when you # disabled deletion in
552-
the previous test run and want to make sure the that tmp dir is empty # before the new test is run
552+
``before=True``: the temporary dir will always be cleared at the beginning of the test.
553553
554-
::
554+
``before=False``: if the temporary dir already existed, any existing files will remain there.
555555
556-
def test_whatever(self):
557-
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)
556+
``after=True``: the temporary dir will always be deleted at the end of the test.
557+
558+
``after=False``: the temporary dir will always be left intact at the end of the test.
558559
559-
Note 1: In order to run the equivalent of `rm -r` safely, only subdirs of the project repository checkout are
560-
allowed if an explicit `tmp_dir` is used, so that by mistake no `/tmp` or similar important part of the filesystem
561-
will get nuked. i.e. please always pass paths that start with `./`
560+
Note 1: In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are
561+
allowed if an explicit ``tmp_dir`` is used, so that by mistake no ``/tmp`` or similar important part of the
562+
filesystem will get nuked. i.e. please always pass paths that start with ``./``
562563
563-
Note 2: Each test can register multiple temp dirs and they all will get auto-removed, unless requested otherwise.
564+
Note 2: Each test can register multiple temporary dirs and they all will get auto-removed, unless requested
565+
otherwise.
564566
565567
Feature 3: Get a copy of the ``os.environ`` object that sets up ``PYTHONPATH`` specific to the current test suite.
566568
This is useful for invoking external programs from the test suite - e.g. distributed training.
@@ -573,6 +575,7 @@ def test_whatever(self):
573575
"""
574576

575577
def setUp(self):
578+
# get_auto_remove_tmp_dir feature:
576579
self.teardown_tmp_dirs = []
577580

578581
# figure out the resolved paths for repo_root, tests, examples, etc.
@@ -660,21 +663,42 @@ def get_env(self):
660663
env["PYTHONPATH"] = ":".join(paths)
661664
return env
662665

663-
def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
666+
def get_auto_remove_tmp_dir(self, tmp_dir=None, before=None, after=None):
664667
"""
665668
Args:
666669
tmp_dir (:obj:`string`, `optional`):
667-
use this path, if None a unique path will be assigned
668-
before (:obj:`bool`, `optional`, defaults to :obj:`False`):
669-
if `True` and tmp dir already exists make sure to empty it right away
670-
after (:obj:`bool`, `optional`, defaults to :obj:`True`):
671-
delete the tmp dir at the end of the test
670+
if :obj:`None`:
671+
672+
- a unique temporary path will be created
673+
- sets ``before=True`` if ``before`` is :obj:`None`
674+
- sets ``after=True`` if ``after`` is :obj:`None`
675+
else:
676+
677+
- :obj:`tmp_dir` will be created
678+
- sets ``before=True`` if ``before`` is :obj:`None`
679+
- sets ``after=False`` if ``after`` is :obj:`None`
680+
before (:obj:`bool`, `optional`):
681+
If :obj:`True` and the :obj:`tmp_dir` already exists, make sure to empty it right away if :obj:`False`
682+
and the :obj:`tmp_dir` already exists, any existing files will remain there.
683+
after (:obj:`bool`, `optional`):
684+
If :obj:`True`, delete the :obj:`tmp_dir` at the end of the test if :obj:`False`, leave the
685+
:obj:`tmp_dir` and its contents intact at the end of the test.
672686
673687
Returns:
674-
tmp_dir(:obj:`string`): either the same value as passed via `tmp_dir` or the path to the auto-created tmp
688+
tmp_dir(:obj:`string`): either the same value as passed via `tmp_dir` or the path to the auto-selected tmp
675689
dir
676690
"""
677691
if tmp_dir is not None:
692+
693+
# defining the most likely desired behavior for when a custom path is provided.
694+
# this most likely indicates the debug mode where we want an easily locatable dir that:
695+
# 1. gets cleared out before the test (if it already exists)
696+
# 2. is left intact after the test
697+
if before is None:
698+
before = True
699+
if after is None:
700+
after = False
701+
678702
# using provided path
679703
path = Path(tmp_dir).resolve()
680704

@@ -691,6 +715,15 @@ def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
691715
path.mkdir(parents=True, exist_ok=True)
692716

693717
else:
718+
# defining the most likely desired behavior for when a unique tmp path is auto generated
719+
# (not a debug mode), here we require a unique tmp dir that:
720+
# 1. is empty before the test (it will be empty in this situation anyway)
721+
# 2. gets fully removed after the test
722+
if before is None:
723+
before = True
724+
if after is None:
725+
after = True
726+
694727
# using unique tmp dir (always empty, regardless of `before`)
695728
tmp_dir = tempfile.mkdtemp()
696729

@@ -701,7 +734,8 @@ def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
701734
return tmp_dir
702735

703736
def tearDown(self):
704-
# remove registered temp dirs
737+
738+
# get_auto_remove_tmp_dir feature: remove registered temp dirs
705739
for path in self.teardown_tmp_dirs:
706740
shutil.rmtree(path, ignore_errors=True)
707741
self.teardown_tmp_dirs = []

0 commit comments

Comments
 (0)