Reduce peak_lr_factor from 0.3 to 0.25 for improved training stability.
Update all neurons to use shard 2 instead of shard 0 for anneal mode.
- Change anneal shard in miner and validator
- Clarify sharded_dataset.py comment
- Update docs example to use shard 2
Walkthrough

The PR modifies anneal mode to initialize with shard 2 instead of shard 0 across miner and validator components, updates the documentation example to reflect this change, adjusts the anneal-mode hyperparameter peak_lr_factor from 0.3 to 0.25, and increments the package version.
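The peak_lr_factor change is easy to sanity-check numerically. A minimal sketch, assuming the effective anneal peak is simply outer_learning_rate * peak_lr_factor (with outer_learning_rate = 0.4, as noted in the review comments):

```python
# Effective anneal peak LR, assuming peak = outer_learning_rate * peak_lr_factor.
outer_learning_rate = 0.4

old_peak = outer_learning_rate * 0.30  # peak_lr_factor before this PR
new_peak = outer_learning_rate * 0.25  # peak_lr_factor after this PR

print(f"old peak: {old_peak:.3f}")  # 0.120
print(f"new peak: {new_peak:.3f}")  # 0.100
```

So the change lowers the effective anneal peak from 0.12 to 0.10, a roughly 17% reduction.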
🚥 Pre-merge checks: ❌ 3 failed (1 warning, 2 inconclusive)
Codecov Report

✅ All modified and coverable lines are covered by tests.
❌ Your project status has failed because the head coverage (57.74%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

```text
@@ Coverage Diff @@
##             main     #680   +/-   ##
=======================================
  Coverage   57.74%   57.74%
=======================================
  Files          27       27
  Lines        4977     4977
=======================================
  Hits         2874     2874
  Misses       2103     2103
```
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
neurons/validator.py (1)
1261-1273: Verify shard-2 artifacts are deployed; hardcoding is acceptable if configuration isn't needed.

Shard-2 artifacts (anneal_000002.npy / sample_ids_anneal_000002.npy) are properly documented in the dataset setup guide, and no stale references to shard 0 in anneal mode exist in the codebase. If these files are guaranteed to be available in all deployment environments, the hardcoding is safe for this PR.

Optional: make the anneal shard configurable, consistent with the existing anneal_config pattern. The codebase already uses anneal_config.get(key, default) extensively; consider adding shard_index as a configuration option:

```diff
--- a/neurons/validator.py
+++ b/neurons/validator.py
@@ -1266,7 +1266,7 @@ class Validator:
         # In anneal mode, always use shard 2
         if self.dataset_manager.anneal_mode:
-            current_shard = 2
+            current_shard = anneal_config.get("shard_index", 2)
             shard_epoch = 0
```
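A standalone sketch of that fallback pattern (the shard_index key is hypothetical and not yet in hparams.json; the anneal_config dict here stands in for the one the neurons already load):

```python
# Hypothetical anneal config as loaded from hparams.json;
# "shard_index" is a suggested key, not one that exists today.
anneal_config = {"peak_lr_factor": 0.25}

# Falls back to the currently hardcoded shard 2 when the key is absent.
current_shard = int(anneal_config.get("shard_index", 2))
print(current_shard)  # 2

# Once "shard_index" is added to the config, it takes precedence.
anneal_config["shard_index"] = 3
print(int(anneal_config.get("shard_index", 2)))  # 3
```

The int() cast guards against a string value sneaking in from hand-edited JSON.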
🧹 Nitpick comments (1)
neurons/miner.py (1)
432-447: Miner/validator shard selection is now consistent (anneal shard 2).

The main thing to double-check is that shard 2 is universally present/accessible for anneal-mode datasets (otherwise miners will fail early on startup).

Optional: match the validator and read the shard index from hparams:

```diff
--- a/neurons/miner.py
+++ b/neurons/miner.py
@@
-        # In anneal mode, always use shard 2
+        # In anneal mode, lock to a single shard (default: 2)
         if self.dataset_manager.anneal_mode:
-            current_shard = 2
+            current_shard = int(anneal_config.get("shard_index", 2))
             current_shard_epoch = 0
```
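To fail fast on the availability concern above, a startup check along these lines could help. A sketch only: check_anneal_shard and data_dir are hypothetical names, and the file names follow the anneal_000002.npy pattern mentioned in the review.

```python
import os

def check_anneal_shard(data_dir: str, shard: int = 2) -> bool:
    """Return True if both anneal shard files for `shard` exist in data_dir.

    Hypothetical helper; file names follow the anneal_000002.npy /
    sample_ids_anneal_000002.npy pattern from the review.
    """
    names = (f"anneal_{shard:06d}.npy", f"sample_ids_anneal_{shard:06d}.npy")
    return all(os.path.exists(os.path.join(data_dir, n)) for n in names)

# Example: abort at startup instead of failing mid-run.
# if not check_anneal_shard("/data/shards"):
#     raise RuntimeError("anneal shard 2 files missing; cannot start anneal mode")
```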
📜 Review details: configuration from Organization UI, review profile CHILL, plan Pro.
📒 Files selected for processing (6)
- docs/shared_sharded_dataset.md
- hparams/hparams.json
- neurons/miner.py
- neurons/validator.py
- src/tplr/__init__.py
- src/tplr/sharded_dataset.py
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
🔇 Additional comments (4)
hparams/hparams.json (1)
3-11: Anneal LR peak reduction looks fine; verify effective LR peak + stability.

Given outer_learning_rate: 0.4, confirm the new anneal peak (outer_lr * peak_lr_factor) is the intended magnitude and doesn't under-train during anneal runs.

src/tplr/sharded_dataset.py (1)
395-402: Comment update matches new anneal behavior.

src/tplr/__init__.py (1)

23-23: Version bump to 2.1.24 is consistent with the PR title.

docs/shared_sharded_dataset.md (1)
149-154: Documentation correctly reflects shard 2 migration.The updated section accurately documents the testing workflow with shard 2, aligning with the code changes that now initialize anneal mode using shard 2 instead of shard 0. The rclone commands and file references (anneal_000002.npy) are correct.