Fix migration might skip some records on big table after job restarting by sandynz · Pull Request #36878 · apache/shardingsphere

sandynz · 2025-10-14T08:11:20Z

Related to #33996.

Changes proposed in this pull request:

Remove InventoryDumperContext.firstDump

Before committing this PR, I'm sure that I have checked the following options:

My code follows the code of conduct of this project.
I have self-reviewed the commit code.
I have (or in comment I request) added corresponding labels for the pull request.
I have passed maven check locally : ./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e.
I have made corresponding changes to the documentation.
I have added corresponding unit tests for my changes.
I have updated the Release Notes of the current development version. For more details, see Update Release Note

sandynz · 2025-11-12T03:33:29Z

InventoryDumperContext.firstDump might cause a bug.

Test case

Table and configuration:

tableA has 300 million records and an integer AUTO_INCREMENT primary key id. With the default SHARDING_SIZE of 10 million, the table is split into about 30 shards.
WORKER_THREAD is left at its default of 20.

Steps to reproduce

Start the migration job. (20 threads migrate concurrently, and the remaining 10 shard tasks are queued.)
Shortly after the job starts, run show migration status {jobId}; and confirm that processed_records_count is greater than 0 and inventory_finished_percentage is less than 10%.
Restart the migration job.
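The shard and queue counts in this scenario follow from simple arithmetic. The sketch below only reproduces that arithmetic; the class and method names are hypothetical and are not ShardingSphere APIs, and it assumes the primary key values are roughly contiguous so that each shard covers about SHARDING_SIZE records.

```java
// Illustrative only: names are hypothetical, not ShardingSphere APIs.
final class ShardPlanSketch {

    // Number of shard tasks needed to cover totalRecords, rounding up.
    static long shardCount(long totalRecords, long shardingSize) {
        return (totalRecords + shardingSize - 1) / shardingSize;
    }

    // Shard tasks that have to wait because every worker thread is busy.
    static long queuedTasks(long shardCount, int workerThreads) {
        return Math.max(0, shardCount - workerThreads);
    }

    public static void main(String[] args) {
        long shards = shardCount(300_000_000L, 10_000_000L);
        long queued = queuedTasks(shards, 20);
        // Prints: shards=30, queued=10
        System.out.println("shards=" + shards + ", queued=" + queued);
    }
}
```

The queued tasks are the ones that matter for reproducing the problem, since they have not started when the job is stopped.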

Root cause analysis

When the job is stopped while some shard tasks are still queued and is then started again, InventoryDumperContext.firstDump returns false, because it is checked at the job item level rather than for each shard task. When the queued shard tasks then run, their query condition uses >, so each of them skips one record. >= should be used for the first dump of every shard task.
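The effect can be illustrated with a small simulation. This is a minimal sketch, not the ShardingSphere implementation: ShardTask, dumpWithJobLevelFlag, dumpWithPerShardDecision and select are hypothetical names, the table is assumed to have contiguous ids, and the skipped record is assumed to be the one at the shard's lower bound. The per-shard variant shows one possible way to make the decision for each shard; the PR itself only states that InventoryDumperContext.firstDump was removed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: none of these names are ShardingSphere APIs.
final class FirstDumpSketch {

    // Hypothetical shard task: an inclusive id range, plus the last id already
    // migrated by this shard (null when the shard never started).
    static final class ShardTask {
        final long lower;
        final long upper;
        final Long lastMigratedId;

        ShardTask(long lower, long upper, Long lastMigratedId) {
            this.lower = lower;
            this.upper = upper;
            this.lastMigratedId = lastMigratedId;
        }
    }

    // Job-item-level decision: a single flag picks the operator for every shard.
    static List<Long> dumpWithJobLevelFlag(ShardTask task, boolean jobItemFirstDump) {
        long begin = task.lastMigratedId != null ? task.lastMigratedId : task.lower;
        return select(begin, jobItemFirstDump, task.upper);
    }

    // Per-shard decision: a shard without a saved position includes its lower bound.
    static List<Long> dumpWithPerShardDecision(ShardTask task) {
        boolean neverStarted = task.lastMigratedId == null;
        long begin = neverStarted ? task.lower : task.lastMigratedId;
        return select(begin, neverStarted, task.upper);
    }

    // Stands in for a range query on a table with contiguous ids:
    // "id >= begin" when inclusive, "id > begin" otherwise, bounded by "id <= upper".
    static List<Long> select(long begin, boolean inclusive, long upper) {
        List<Long> ids = new ArrayList<>();
        for (long id = inclusive ? begin : begin + 1; id <= upper; id++) {
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // A shard that was still queued when the job stopped, so it has no saved position.
        ShardTask queued = new ShardTask(21, 30, null);
        // After the restart the job-level flag is false, so id 21 is never read.
        System.out.println("job-level flag starts at id " + dumpWithJobLevelFlag(queued, false).get(0));
        System.out.println("per-shard check starts at id " + dumpWithPerShardDecision(queued).get(0));
    }
}
```

A shard that had already made progress before the restart behaves the same in both variants, because > is the right operator once a position has been saved. Only shards that never started are affected.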

Remove InventoryDumperContext.firstDump

819ac52

sandynz added this to the 5.5.3 milestone Oct 14, 2025

sandynz added type: enhancement feature: pipeline labels Oct 14, 2025

taojintianxia approved these changes Oct 14, 2025

sandynz merged commit a65eb5f into apache:master Oct 14, 2025
24 checks passed

sandynz deleted the pipeline-1 branch October 14, 2025 08:34

sandynz added the type: bug label Nov 12, 2025

sandynz changed the title ~~Improve InventoryDumper query range after job restarting~~ Fix migration might skip some records on big table after job restarting Nov 12, 2025

RaigorJiang mentioned this pull request Jan 10, 2026

[DISCUSS] Release Plan of ShardingSphere 5.5.3 #37700

Closed
