-
Notifications
You must be signed in to change notification settings - Fork 7
Feat: support OOD dual label #373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feat: support OOD dual label #373
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the model instantiation workflow to support domain-specific model heads (molecules vs. materials) for models trained with OMol25 datasets. The key changes include delaying model instantiation until the specific task domain is known, enabling proper selection of model heads and DFT labels.
- Refactored model gathering to separate parameter collection from model instantiation
- Added support for domain-specific model heads (
model_domainattribute) for molecules and materials - Modified the
calcproperty fromcached_propertyto regular property with lazy initialization and setter support
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| lambench/workflow/entrypoint.py | Split gather_models() into gather_model_params() and gather_model() to enable delayed model instantiation based on task domain |
| lambench/models/basemodel.py | Added supports_omol and model_domain attributes to support domain-specific model configurations |
| lambench/models/ase_models.py | Changed calc from cached_property to property with setter, added domain-based head selection for UMA and DP calculators, and logic to select appropriate DFT labels |
| lambench/tasks/base_task.py | Modified test_data type to support both single Path and dict of Paths for multiple DFT labels |
| lambench/tasks/direct/direct_tasks.yml | Added AQM task with PBE and wB97 test data paths, moved ANI and MD22 to deprecated section |
| lambench/metrics/utils.py | Updated to use new gather_model_params() and gather_model() functions |
| tests/workflow/test_entrypoint.py | Updated test fixture to return dict instead of DPModel instance, aligned with new parameter-based approach |
| tests/tasks/calculator/test_nve_md.py | Simplified test fixtures by removing separate calculator fixture and accessing calculator through model |
Comments suppressed due to low confidence (1)
lambench/models/ase_models.py:37
- The documentation still describes
calcas a 'cached property', but it has been changed to a regular property with lazy initialization. Update the documentation to reflect this change.
calc (Calculator): A cached property that initializes and returns an ASE Calculator
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #373 +/- ##
==========================================
- Coverage 65.89% 65.75% -0.14%
==========================================
Files 35 35
Lines 1589 1609 +20
Branches 189 194 +5
==========================================
+ Hits 1047 1058 +11
- Misses 500 507 +7
- Partials 42 44 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This is a follow up PR of #372 . For OOD-FF datasets, we provide both PBE level and wB97MV level labels as separate datasets. We allow model to select the corresponding data path based on the model branch.