Skip to content

Commit 9c8e092

Browse files
committed
kvserver: handle context cancellation when receiving snapshots
Following a3fd4fb, context errors are now propagated back up the stack when receiving snapshots. However, this triggered a `maybeFatalOnRaftReadyErr` assertion which crashed the node. `handleRaftReadyRaftMuLocked` (which is called directly when applying snapshots) does not appear prepared to safely handle arbitrary context cancellation, as it is typically called with the Raft scheduler's background context. Furthermore, by the time we call it we have already received the entire snapshot, so it does not seem useful to abort snapshot application just because the caller goes away. This patch therefore uses a new background context for applying snapshots, disconnected from the client's context, once the entire snapshot has been received. Release note: None
1 parent 506d129 commit 9c8e092

2 files changed

Lines changed: 8 additions & 2 deletions

File tree

pkg/kv/kvserver/store_raft.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -338,14 +338,14 @@ func (s *Store) processRaftSnapshotRequest(
338338
var expl string
339339
var err error
340340
stats, expl, err = r.handleRaftReadyRaftMuLocked(ctx, inSnap)
341+
maybeFatalOnRaftReadyErr(ctx, expl, err)
341342
if !stats.snap.applied {
342343
// This line would be hit if a snapshot was sent when it isn't necessary
343344
// (i.e. follower was able to catch up via the log in the interim) or when
344345
// multiple snapshots raced (as is possible when raft leadership changes
345346
// and both the old and new leaders send snapshots).
346347
log.Infof(ctx, "ignored stale snapshot at index %d", snapHeader.RaftMessageRequest.Message.Snapshot.Metadata.Index)
347348
}
348-
maybeFatalOnRaftReadyErr(ctx, expl, err)
349349
return nil
350350
})
351351
}

pkg/kv/kvserver/store_snapshot.go

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -660,7 +660,13 @@ func (s *Store) receiveSnapshot(
660660
return err
661661
}
662662
inSnap.placeholder = placeholder
663-
if err := s.processRaftSnapshotRequest(ctx, header, inSnap); err != nil {
663+
664+
// Use a background context for applying the snapshot, as handleRaftReady is
665+
// not prepared to deal with for arbitrary context cancellation. We've already
666+
// received the entire snapshot at this point too, so there's no point in
667+
// abandoning application half-way through if the caller goes away.
668+
applyCtx := s.AnnotateCtx(context.Background())
669+
if err := s.processRaftSnapshotRequest(applyCtx, header, inSnap); err != nil {
664670
return sendSnapshotError(stream, errors.Wrap(err.GoError(), "failed to apply snapshot"))
665671
}
666672
return stream.Send(&SnapshotResponse{Status: SnapshotResponse_APPLIED})

0 commit comments

Comments
 (0)