Implement Feature/tiger-wsibulk #884

jklubienski · 2025-09-10T10:54:04Z

Adds the TIGER tumour detection task to eva using the WSIBULK dataset from the TIGER grand challenge (tiger_tumour.py), along with a suggested config to run it (tiger_tumour.yaml).

Docs for all the TIGER tasks are in tiger.md.

Tests to follow shortly.

nkaenzig

Looks very nice, thanks for this contribution. 🚀 I left a couple of comments.

docs/datasets/tiger.md

configs/vision/pathology/offline/classification/tiger_wsibulk.yaml

src/eva/vision/data/datasets/classification/tiger_tumour.py

configs/vision/pathology/offline/classification/tiger_tumour.yaml

src/eva/vision/data/datasets/classification/tiger_tumour.py

nkaenzig · 2025-09-16T07:59:10Z

src/eva/vision/data/datasets/classification/tiger_tumour.py

Related comment from other PR: #885 (comment)

nkaenzig

Great work 🚀 Left a few last small comments. Last thing missing would be the unit tests as discussed offline.

nkaenzig · 2025-09-24T13:24:02Z

src/eva/vision/data/datasets/tiger.py

+        if not all_paths:
+            raise FileNotFoundError(f"No .tif files found in {image_dir}")
+
+        rng = random.Random(self._seed)  # nosec B311


Note that we have a splitting module which provides functions to do random splits:

eva/tests/eva/core/data/splitting/test_random.py

Line 22 in 9ee124d

train_indices, val_indices, test_indices = splitting.random_split(
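
For context, a minimal standard-library sketch of the kind of seeded 70/15/15 split discussed here. It does not use eva's `splitting.random_split`, whose signature is not shown in this thread; the function name `seeded_split` and its parameters are assumptions made for illustration only.

```python
import random
from typing import List, Sequence, Tuple


def seeded_split(
    paths: Sequence[str],
    train_ratio: float = 0.7,
    val_ratio: float = 0.15,
    seed: int = 42,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffles a sorted copy of `paths` with a fixed seed and cuts it in three.

    Hypothetical helper, not part of eva.
    """
    if not paths:
        raise FileNotFoundError("No files to split")

    # Sort first so the result does not depend on directory listing order.
    ordered = sorted(paths)
    rng = random.Random(seed)  # reproducible, not for cryptographic use
    rng.shuffle(ordered)

    n_train = int(len(ordered) * train_ratio)
    n_val = int(len(ordered) * val_ratio)

    train = ordered[:n_train]
    val = ordered[n_train : n_train + n_val]
    test = ordered[n_train + n_val :]  # whatever remains goes to test
    return train, val, test
```

Calling it twice with the same seed returns identical splits, which is the property the manual `random.Random(self._seed)` code relies on; a shared splitting helper would give the same guarantee in one place.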

nkaenzig · 2025-09-24T13:38:28Z

docs/datasets/tiger.md

+    |   │   ├── 104S.tiff
+    │   |   └── ...									
+	|	|__tissue-masks/                            * Not used in eva
+	|	|__tiger-tils-scores-wsitils.csv            * Target variable file


nit: is it tiger-til-scores-wsitils.csv or tiger-tils-scores-wsitils.csv?

nkaenzig · 2025-09-24T13:39:28Z

docs/datasets/tiger.md

+
+
+
+


nit: we can remove those blank lines

nkaenzig · 2025-09-24T13:39:45Z

src/eva/vision/data/datasets/classification/tiger_wsibulk.py

+    def prepare_data(self) -> None:
+        _validators.check_dataset_exists(self._root, False)
+
+    # @override


Should we uncomment this? (I assume it was commented out for testing.)

nkaenzig · 2025-09-24T13:43:22Z

src/eva/vision/data/wsi/patching/coordinates.py

        return coord_dict

+    @classmethod
+    def from_dict(cls, row: Dict[str, Any]) -> "PatchCoordinates":


This classmethod might not be required; I think we can probably also use the main constructor instead:

Either by unpacking, PatchCoordinates(**row.as_dict()), or with explicit assignments: PatchCoordinates(x_y=row["x_y"], ...)
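
A small sketch of the two constructor-based options, using a stand-in dataclass. The fields below are assumptions for illustration; the real `PatchCoordinates` in eva may define different ones, and `row` is taken to be a plain dict as in the `from_dict` signature above.

```python
import dataclasses
from typing import Any, Dict, List, Tuple


@dataclasses.dataclass
class PatchCoordinates:
    """Hypothetical stand-in; field names are assumed, not taken from eva."""

    x_y: List[Tuple[int, int]]
    width: int
    height: int
    level_idx: int


row: Dict[str, Any] = {
    "x_y": [(0, 0), (224, 0)],
    "width": 224,
    "height": 224,
    "level_idx": 0,
}

# Option 1: unpack the mapping straight into the constructor.
coords = PatchCoordinates(**row)

# Option 2: assign each field explicitly.
coords_explicit = PatchCoordinates(
    x_y=row["x_y"],
    width=row["width"],
    height=row["height"],
    level_idx=row["level_idx"],
)
```

Unpacking is shorter but raises `TypeError` if the mapping carries keys that are not constructor arguments, so the explicit form is safer when rows contain extra columns.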

nkaenzig · 2025-09-24T13:44:34Z

src/eva/vision/data/datasets/tiger.py

+    def _load_file_paths(self, split: Literal["train", "val", "test"] | None = None) -> List[str]:
+        """Loads the file paths of WSIs from wsibulk/images.
+
+        Splits are assigned 70% train, 15% val, 15% test by filename sorting.


"by filename sorting" can be removed, as that's no longer the case.

nkaenzig · 2025-09-24T13:51:54Z

src/eva/vision/data/datasets/classification/tiger_wsibulk.py

+            df = pd.read_csv(csv_path)
+            n_rows = len(df)
+
+            print(f"Annotating split '{split}' with {n_rows} images...")


Let's remove the print or use logger.info instead.
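
A minimal sketch of the logger variant, using the standard-library `logging` module as a stand-in. Which logger eva actually exposes is not stated in this thread, so the import and the helper name `log_split_size` are assumptions.

```python
import logging
from typing import Optional

# Stand-in logger; swap in the project's own logger if it provides one.
logger = logging.getLogger(__name__)


def log_split_size(split: Optional[str], n_rows: int) -> str:
    """Logs the annotation progress message at INFO level and returns it."""
    message = f"Annotating split '{split}' with {n_rows} images..."
    logger.info(message)
    return message
```

With a logger, the message can be silenced or redirected through the logging configuration instead of always being written to STDOUT.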

nkaenzig · 2025-09-24T13:52:22Z

src/eva/vision/data/datasets/classification/__init__.py

    "WsiClassificationDataset",
    "PANDA",
    "PANDASmall",
    "Camelyon16",


not related to this PR but just saw that Camelyon16 appears twice

nkaenzig · 2025-09-24T13:53:14Z

src/eva/vision/data/datasets/tiger.py

+    _val_split_ratio: float = 0.15
+
+    # target microns per pixel (mpp) for patches.
+    _target_mpp: float = 0.5


same as here #885 (comment)

nkaenzig · 2025-09-24T13:55:11Z

configs/vision/pathology/offline/classification/tiger_wsibulk.yaml

Please add this config to tests/eva/vision/test_vision_cli.py (at least to test_configuration_initialization, ideally also to test_predict_fit_from_configuration), so we can test for instantiation errors.

nkaenzig · 2025-09-24T15:25:20Z

configs/vision/pathology/offline/classification/tiger_wsibulk.yaml

Adding some feedback I got from Claude / Codex here:

• src/eva/vision/data/datasets/tiger.py:108: _load_file_paths only glob matches *.tif. The TIGER docs you just added (docs/datasets/tiger.md, see the sample tree) show the WSIs using the .tiff extension.

• src/eva/vision/data/datasets/classification/tiger_wsibulk.py:86: the dataset now emits progress via print every time annotations are built. In multi-worker loaders this will spam STDOUT and can even deadlock under some multiprocessing backends; please route this through the project logger (or make it optional).
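
One way to address the first point is to accept both extensions when listing slides. The sketch below uses only `pathlib`; the function name `find_wsi_paths` and the suffix set are assumptions, not the code in `_load_file_paths`.

```python
from pathlib import Path
from typing import List

# Assumed set of accepted slide extensions (compared case-insensitively).
WSI_SUFFIXES = {".tif", ".tiff"}


def find_wsi_paths(image_dir: str) -> List[str]:
    """Returns sorted paths of all .tif/.tiff files directly inside `image_dir`."""
    paths = sorted(
        str(path)
        for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() in WSI_SUFFIXES
    )
    if not paths:
        raise FileNotFoundError(f"No .tif or .tiff files found in {image_dir}")
    return paths
```

Filtering on the suffix rather than running two separate glob patterns keeps the result in one sorted list and avoids treating `.tif` and `.tiff` files differently.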

Jklubienski added 2 commits September 10, 2025 08:53

Add capacity to load remote YAML URLs

400dff8

Change _save_config function to use fspec

37d4950

jklubienski force-pushed the feature/tiger-tumour branch 7 times, most recently from 66f3b34 to b753856 Compare September 10, 2025 13:22

Implement TIGER Tumour classification task

41e96c7

jklubienski force-pushed the feature/tiger-tumour branch from b753856 to 41e96c7 Compare September 10, 2025 13:27

nkaenzig reviewed Sep 16, 2025


nkaenzig mentioned this pull request Sep 16, 2025

Implement TIGER TIL task #885

Open


Refactor codebase and allign with queries from code review

a23b673

jklubienski changed the title ~~Feature/tiger tumour~~ Implement Feature/tiger-wsibulk Sep 24, 2025

nkaenzig reviewed Sep 24, 2025


Jklubienski added 3 commits October 1, 2025 14:28

Updated wsibulk task based on secondary feedback

b447b0d

Small bugfixes and test support

247e629

Added unit tests

9137aad
