v2.1.22#676

Merged
joellidin merged 3 commits into main from dev
Jan 6, 2026

Conversation


joellidin (Collaborator) commented Jan 6, 2026

  • (neurons) Fix LR discontinuity in anneal scheduler
  • Bump run version

Description

Related Issue(s)

  • Closes #[issue number]

Type of Change

  • Feature (adding new functionality)
  • Fix (resolving a bug or issue)
  • Docs (documentation updates)
  • Refactor (code changes that don't affect functionality)
  • Maintenance (dependency updates or other maintenance)
  • Tests (adding or improving tests)
  • Breaking change (fix or feature with incompatible API changes)
  • Other: _____

Branch Naming

  • My branch follows the project's naming convention (e.g., feature/add-new-capability)

Commit Messages

  • My commits are small, atomic, and have proper commit messages
  • Commit messages are in imperative mood with a capitalized summary under 50 chars

Code Quality

  • I've performed a self-review of my code
  • I've added appropriate docstrings following the project's conventions
  • I've added proper logging where necessary (without trailing periods)
  • I've applied linting and formatting with Ruff
  • My code generates no new warnings

Testing

  • I've added tests for new functionality or bug fixes
  • All tests pass locally with my changes
  • Test coverage has not decreased

Documentation

  • I've updated documentation to reflect my changes
  • I've updated comments in hard-to-understand areas

If this is a breaking change

Screenshots/Examples

Additional Notes

Summary by CodeRabbit

  • Bug Fixes

    • Refined learning rate decay configuration to ensure smoother training dynamics without discontinuities.
  • Chores

    • Version bumped to 2.1.22.


Fix learning rate jump when transitioning from warmup to decay phase in
the anneal scheduler. Previously, decay started at 1.0x base_lr while
warmup ended at peak_lr_factor, causing a sudden LR increase.

Change decay scheduler's start_factor from 1.0 to peak_lr_factor,
ensuring continuity. Both start_factor and end_factor are now relative
to base_lr, eliminating the discontinuity.
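The fixed schedule can be sketched in plain Python. This is a hypothetical helper that mirrors the factors described above (warmup from 1e-6 to peak_lr_factor, then decay from peak_lr_factor to eta_min_factor, both relative to base_lr); it is not the actual get_inner_scheduler code, and the numeric values are illustrative:

```python
def anneal_lr(step, base_lr, peak_lr_factor, eta_min_factor,
              warmup_steps, decay_steps, warmup_start_factor=1e-6):
    """Linear warmup followed by linear decay; both factors scale base_lr."""
    if step < warmup_steps:
        t = step / warmup_steps
        factor = warmup_start_factor + (peak_lr_factor - warmup_start_factor) * t
    else:
        t = min(step - warmup_steps, decay_steps) / decay_steps
        # Before the fix this branch started at 1.0 instead of
        # peak_lr_factor, causing a jump at step == warmup_steps.
        factor = peak_lr_factor + (eta_min_factor - peak_lr_factor) * t
    return base_lr * factor

# Illustrative values: base_lr=1e-3, peak=0.5, eta_min=0.05
lrs = [anneal_lr(s, 1e-3, 0.5, 0.05, 10, 40) for s in range(50)]
# At the warmup→decay boundary (step 10) the LR is exactly
# peak_lr_factor * base_lr, so the ramp is continuous.
```

With the old start_factor of 1.0, the decay branch would begin at base_lr itself, an abrupt 2x jump from the 0.5 * base_lr the warmup ended at.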

coderabbitai bot commented Jan 6, 2026

Walkthrough

The pull request adjusts learning rate decay scheduler parameters in get_inner_scheduler by removing the intermediate ratio computation and directly using peak_lr_factor and eta_min_factor as start and end values. The package version is bumped from 2.1.21 to 2.1.22.

Changes

  • Learning Rate Decay Configuration — neurons/trainer.py: Modified the decay phase setup in get_inner_scheduler: removed the end_factor = eta_min_factor / peak_lr_factor ratio calculation and instead assigns start_factor = peak_lr_factor and end_factor = eta_min_factor directly, eliminating the LR discontinuity. Updated comments clarify that both factors are relative to the base learning rate.
  • Version Bump — src/tplr/__init__.py: Updated __version__ from "2.1.21" to "2.1.22".

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • shivam-MBZUAI

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)

  • Title check — ⚠️ Warning: The title 'v2.1.22' is a version number and does not describe the actual changes in the pull request (LR discontinuity fix and version bump). Resolution: use a descriptive title that explains the main changes, such as 'Fix LR discontinuity in anneal scheduler and bump version to v2.1.22'.
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
  • Description check — ❓ Inconclusive: The PR description includes bullet points summarizing the changes but lacks substantive detail in most template sections; most checkboxes remain unchecked and several sections are empty. Resolution: fill in the Description section with what changed and why, check the relevant Type of Change boxes, and clarify the testing and review status.

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e96eb01 and 0b4660a.

📒 Files selected for processing (2)
  • neurons/trainer.py
  • src/tplr/__init__.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
🔇 Additional comments (2)
src/tplr/__init__.py (1)

23-23: LGTM! Version bump is appropriate.

The patch version increment correctly reflects the bug fix for the LR discontinuity in the anneal scheduler.

neurons/trainer.py (1)

181-187: Excellent fix for the LR discontinuity issue in the anneal scheduler.

The change correctly addresses the discontinuity at the warmup→decay transition:

  • Warmup phase: Scales from 1e-6 to peak_lr_factor (e.g., 0.5) over warmup_inner_steps
  • Decay phase: Scales from peak_lr_factor to eta_min_factor over total_decay_steps
  • Result: The decay phase starts at exactly the same learning rate multiplier where warmup ended, ensuring continuous transitions

The comment accurately reflects that both factors are relative to base_lr, making the scheduler behavior clear and preventing the previously discontinuous jump.

Note: The codebase currently lacks specific test coverage for the anneal scheduler mode. Consider adding a test that validates LR continuity at the warmup→decay transition (e.g., by checking that get_last_lr() is identical before and after the transition point at step warmup_inner_steps).
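Such a continuity check could be sketched with torch.optim's LinearLR and SequentialLR, which match the start_factor/end_factor parameters the fix describes. The hyperparameter values below are assumptions for illustration; the real trainer builds its schedulers inside get_inner_scheduler:

```python
import torch
from torch.optim.lr_scheduler import LinearLR, SequentialLR

# Hypothetical values; the real trainer derives these from its config.
base_lr, peak, eta_min = 1e-3, 0.5, 0.05
warmup_inner_steps, total_decay_steps = 10, 40

opt = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=base_lr)
warmup = LinearLR(opt, start_factor=1e-6, end_factor=peak,
                  total_iters=warmup_inner_steps)
# The fix: decay starts at peak, not 1.0, so it picks up where warmup ended.
decay = LinearLR(opt, start_factor=peak, end_factor=eta_min,
                 total_iters=total_decay_steps)
sched = SequentialLR(opt, [warmup, decay], milestones=[warmup_inner_steps])

lrs = []
for _ in range(warmup_inner_steps + total_decay_steps):
    lrs.append(sched.get_last_lr()[0])
    opt.step()
    sched.step()

# The step-to-step change at the warmup→decay boundary should be no
# larger than an ordinary warmup increment, i.e. no discontinuity.
boundary_jump = abs(lrs[warmup_inner_steps] - lrs[warmup_inner_steps - 1])
warmup_increment = (peak - 1e-6) * base_lr / warmup_inner_steps
```

Reverting start_factor to 1.0 in the decay scheduler makes boundary_jump balloon to roughly (1.0 - peak) * base_lr, so a test asserting boundary_jump <= warmup_increment would catch a regression.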




codecov bot commented Jan 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (57.74%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #676   +/-   ##
=======================================
  Coverage   57.74%   57.74%           
=======================================
  Files          27       27           
  Lines        4977     4977           
=======================================
  Hits         2874     2874           
  Misses       2103     2103           
Files with missing lines — Coverage Δ:
  • src/tplr/__init__.py: 100.00% <100.00%> (ø)

@joellidin joellidin merged commit de576bb into main Jan 6, 2026
7 of 8 checks passed