Skip to content

chore: enforce type hints across attacks, config, and safemodel modules#422

Open
shamykyzer wants to merge 20 commits intomainfrom
chore/enforce-type-hints
Open

chore: enforce type hints across attacks, config, and safemodel modules#422
shamykyzer wants to merge 20 commits intomainfrom
chore/enforce-type-hints

Conversation

@shamykyzer
Copy link
Copy Markdown
Contributor

@shamykyzer shamykyzer commented Mar 10, 2026

Linting & CI:

  • Enabled the Ruff ANN rule set to require type hints on all function signatures
  • Disabled ANN401 to permit Any in cases where interfaces are intentionally dynamic
  • Added a mypy (v1.19.1) pre-commit hook for static type analysis

Annotations:

  • Added type annotations to function signatures across 17 files in attacks/, config/, and safemodel/
  • Refined local variable annotations — kept where they add clarity, removed where redundant

Cleanup:

  • Removed unused imports (PyTorchDataHandler, SklearnDataHandler) from target.py

Closes #415

@shamykyzer shamykyzer linked an issue Mar 10, 2026 that may be closed by this pull request
@shamykyzer shamykyzer self-assigned this Mar 10, 2026
@shamykyzer shamykyzer changed the title Chore/enforce type hints chore: enforce type hints Mar 10, 2026
@shamykyzer shamykyzer force-pushed the chore/enforce-type-hints branch from 1cb8014 to ec98cae Compare March 10, 2026 19:57
@shamykyzer shamykyzer marked this pull request as ready for review March 10, 2026 20:09
@shamykyzer shamykyzer requested a review from rpreen March 10, 2026 20:48
@rpreen
Copy link
Copy Markdown
Contributor

rpreen commented Mar 10, 2026

Looks like a good start - anything and everything we can lock down the better - adding them to the variable definitions inside functions too would be good because it's such an easy way to catch any unexpected behaviour.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.51%. Comparing base (a9524af) to head (107401b).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #422      +/-   ##
==========================================
- Coverage   99.51%   99.51%   -0.01%     
==========================================
  Files          23       23              
  Lines        2687     2686       -1     
==========================================
- Hits         2674     2673       -1     
  Misses         13       13              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@shamykyzer shamykyzer changed the title chore: enforce type hints chore: enforce type hints across attacks, config, and safemodel modules Mar 15, 2026
@shamykyzer
Copy link
Copy Markdown
Contributor Author

hello @rpreen, can you have a look at this and let me know if there is anything you want me to add or change?

@rpreen
Copy link
Copy Markdown
Contributor

rpreen commented Mar 16, 2026

This also needs a slight change to report.py to deal with the conflict from the PR just merged.

@shamykyzer
Copy link
Copy Markdown
Contributor Author

shamykyzer commented Mar 19, 2026

Hi @rpreen , I was going through the type hints and ran into a couple of things worth noting:

  • Renamed indices_train to train_set in likelihood_attack.py:248 — mypy flagged it as a redefinition because indices_train is first returned as np.ndarray from get_shadow_model on line 239, then converted to set[int] on line 248. Renaming avoids reusing the same name with two different types.
  • Wrapped min_child_weight in structural_attack.py:206 to fit the 88-char line limit in ruff.

Also found what I think is a pre-existing bug: get_shadow_model in utils.py:148 loads indices_train.pkl twice, the second one should be indices_test.pkl.

It works correctly when saving but not during loading, was this intentional?

@shamykyzer shamykyzer requested a review from rpreen March 19, 2026 03:50
@rpreen
Copy link
Copy Markdown
Contributor

rpreen commented Mar 20, 2026

Also found what I think is a pre-existing bug: get_shadow_model in utils.py:148 loads indices_train.pkl twice, the second one should be indices_test.pkl.

Good find - we need a bug fix PR for this as high priority.

@rpreen
Copy link
Copy Markdown
Contributor

rpreen commented Mar 20, 2026

Also found what I think is a pre-existing bug: get_shadow_model in utils.py:148 loads indices_train.pkl twice, the second one should be indices_test.pkl.

Good find - we need a bug fix PR for this as high priority.

Thankfully it looks like this doesn't actually effect anything because the test indices aren't used when computing the LiRA scores (the check is simply if not in train indices) - but still important to fix in case they get used in future code.

@jim-smith
Copy link
Copy Markdown
Contributor

@shamykyzer nice find!. this is why collaborative s/w is best

@shamykyzer
Copy link
Copy Markdown
Contributor Author

Hello @rpreen, can you review this PR when you get a chance?

  • Type hints are at 99% coverage with 0 missing annotation violations from ruff and 16 remaining are ANN401 in 4 files for dynamic interfaces.

Should I expand the mypy files list beyond the current 8 files and are the 16 Any usages fine or should I narrow them?

Also, current ruff configuration only flags missing hints, wouldn't it be worth adding monkeytype or autotyping to auto-generate them going forward?

Or is it good as it is? Thanks.

@shamykyzer shamykyzer requested a review from jim-smith March 27, 2026 04:31
@rpreen
Copy link
Copy Markdown
Contributor

rpreen commented Mar 30, 2026

This looks good as it is I think - I don't think we need to have everything perfectly typed, just adding as much as we can sensibly helps. I don't know enough about auto-generating types to comment on whether that would be useful to add - but I would think that is not something that would be added to Ruff/pre-commit, but would be done in a special PR like this (or a future one) and carefully reviewed. I suspect that since you have 99% done here that an auto-generator would likely struggle with the remaining complex cases and possibly just put Any for the areas we currently do need dynamic types anyway.

My only last question here: is there a reason for removing the existing types from sacroml/config/target.py and sacroml/config/attack.py?

We can expand more files later.

@shamykyzer
Copy link
Copy Markdown
Contributor Author

This looks good as it is I think - I don't think we need to have everything perfectly typed, just adding as much as we can sensibly helps. I don't know enough about auto-generating types to comment on whether that would be useful to add - but I would think that is not something that would be added to Ruff/pre-commit, but would be done in a special PR like this (or a future one) and carefully reviewed. I suspect that since you have 99% done here that an auto-generator would likely struggle with the remaining complex cases and possibly just put Any for the areas we currently do need dynamic types anyway.

My only last question here: is there a reason for removing the existing types from sacroml/config/target.py and sacroml/config/attack.py?

We can expand more files later.

Well, I thought since every function now has type annotations on its parameters and return values, mypy will infer the local variable types from the function call so annotating the variable too would've been redundant.

  def _get_defaults(name: str) -> dict:                                                                                                                    
      ...                                                                                                                                                  
   
  params: dict = _get_defaults(name)  # redundant — mypy already knows it's a dict                                                                         
  params = _get_defaults(name)        # mypy infers dict from the return type

Here _get_defaults is annotated with -> dict, so when mypy sees params = _get_defaults(name) it already knows params is a dict from the return type and adding : dict on the variable just states the same thing twice.

Same with name: str = prompt(...)
the prompt() returns str, so mypy will already know name is a str without the explicit annotation.

This applies to all the removed annotations in both files, happy to restore them if you prefer?

Also I agree on expanding later, I think the safemodel/classifiers would be a good candidate for a future issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Chore] Enforce type hints

4 participants