
Conversation

@mkalimeri
Contributor

@mkalimeri mkalimeri commented Jan 22, 2025

Description

Related to #596

Examples added for

  • model_selection.TimeGapSplit
  • model_selection.GroupTimeSeriesSplit
  • model_selection.ClusterFoldValidation (I see that KlusterFoldValidation has been renamed to ClusterFoldValidation)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the style guidelines (ruff)
  • I have commented my code, particularly in hard-to-understand areas
  • [NA] I have made corresponding changes to the documentation (also to the readme.md)
  • [NA] I have added tests that prove my fix is effective or that my feature works
  • [NA] I have added tests to check whether the new feature adheres to the sklearn convention
  • New and existing unit tests pass locally with my changes

If you feel your PR is ready for a review, ping @FBruzzesi or @koaning.

@FBruzzesi FBruzzesi changed the title from "Update model_selection.py" to "docs: model_selection module docstrings" Jan 24, 2025
Collaborator

@FBruzzesi FBruzzesi left a comment

Thanks for the PR @mkalimeri!

The only concern I have is with the code in the TimeGapSplit summary. @koaning WDYT?
We can keep it as a follow-up, but that's certainly not the right assumption to make in this context.

1. Updated code in TimeGapSplit: corrected the calculation of the total number of days in the fold and added a 'frequency' field to the summary to show the frequency of the data
2. Updated TimeGapSplit example: data frequency is now in hours and the parameters are in days
3. Updated pytest for the summary calculation
@mkalimeri
Contributor Author

I updated the code based on the feedback from the previous review. I kept the summary field 'num of days' and corrected its calculation so that it reflects the actual number of days in the fold. I also added a 'frequency' field that reports the frequency of the data, and updated the example and test to reflect these changes. Let me know if you can think of other informative fields!
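
To make the distinction between row count and elapsed days concrete, here is a minimal sketch using only the standard library. It is not the code merged in this PR: `fold_summary` and its field names are hypothetical, and taking the most common gap between consecutive rows as the frequency is an assumption made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def fold_summary(timestamps):
    """Summarise one fold from its sorted timestamps (hypothetical helper)."""
    if not timestamps:
        raise ValueError("fold is empty")
    start, end = timestamps[0], timestamps[-1]
    # Most common gap between consecutive rows, used here as the data frequency.
    gaps = Counter(b - a for a, b in zip(timestamps, timestamps[1:]))
    frequency = gaps.most_common(1)[0][0] if gaps else None
    # Elapsed time of the fold in days, independent of how many rows it holds.
    num_days = (end - start) / timedelta(days=1)
    return {
        "start": start,
        "end": end,
        "num_rows": len(timestamps),
        "num_days": num_days,
        "frequency": frequency,
    }


# 48 hourly rows: counting rows would report 48, but the span is just under 2 days.
hourly = [datetime(2025, 1, 1) + timedelta(hours=h) for h in range(48)]
summary = fold_summary(hourly)
```

With hourly data the row count and the day count differ by a factor of about 24, which is why the summary should not treat the number of rows as the number of days.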
