Offload replication writes to IO threads #1485

uriyage · 2024-12-24T07:17:24Z

This PR offloads the write to replica clients to IO threads.

Main Changes

Replica writes will be offloaded but only after the replica is in online mode..
Replica reads will still be done in the main thread to reduce complexity and because read traffic from replicas is negligible.

Implementation Details

In order to offload the writes, writeToReplica has been split into 2 parts:

The write itself made by the IO thread or by the main thread
The post write where we update the replication buffers refcount will be done in the main-thread after the write-job is done in the IO thread (similar to what we do with a regular client)

Additional Changes

In writeToReplica we now use writev in case more than 1 buffer exists.
Changed client nwritten field to ssize_t since with a replica the nwritten can theoretically exceed int size (not subject to NET_MAX_WRITES_PER_EVENT limit).
Changed parsing code to use memchr instead of strchr:
- During parsing command, ASAN got stuck for unknown reason when called to strchr to look for the next \r
- Adding assert for null-terminated querybuf didn't resolve the issue.
- Switched to memchr as it's more secure and resolves the issue

Testing

Added integration tests
Added unit tests

Related issue: #761

codecov · 2024-12-24T07:39:00Z

Codecov Report

Attention: Patch coverage is 80.45977% with 17 lines in your changes missing coverage. Please review.

Project coverage is 71.06%. Comparing base (0579103) to head (b445576).
Report is 6 commits behind head on unstable.

Files with missing lines	Patch %	Lines
src/io_threads.c	0.00%	12 Missing ⚠️
src/networking.c	93.33%	5 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #1485      +/-   ##
============================================
- Coverage     71.06%   71.06%   -0.01%     
============================================
  Files           121      121              
  Lines         65237    65311      +74     
============================================
+ Hits          46360    46411      +51     
- Misses        18877    18900      +23

Files with missing lines	Coverage Δ
src/replication.c	`87.39% <ø> (+0.11%)`	⬆️
src/server.h	`100.00% <ø> (ø)`
src/networking.c	`88.88% <93.33%> (-0.08%)`	⬇️
src/io_threads.c	`6.82% <0.00%> (-0.12%)`	⬇️

... and 14 files with indirect coverage changes

src/networking.c

src/unit/test_networking.c

xbasel

fyi - I tested this code and I got this error. I did not see it with HEAD~1

2576783:S 06 Jan 2025 16:06:12.381 # Protocol error (Master using the inline protocol. Desync?) from client: id=6 addr=127.0.0.1:6379 laddr=127.0.0.1:56284 fd=11 name=*redacted* age=43 idle=0 flags=M db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=22870 qbuf-free=18084 argv-mem=0 multi-mem=0 rbs=1024 rbp=42 obl=0 oll=0 omem=0 tot-mem=42880 events=r cmd=set user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=218001645 tot-net-out=1809 tot-cmds=1267450. Query buffer during protocol error: 'SET..$16..key:__rand_int__..$128..VXKeHogKgJ=[5V9_X^b?48OKF2jGA<' (... more 896 bytes ...) 'mcS2^N1J?ELSX@CfKQ7cM5aea\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\Rm\..'

I did run: src/valkey-benchmark -t set -d 128 -n 5000000 --threads 10
valkey with 4 io threads

Looks like data is corrupted, look at the second message:
*3\r $3\r SET\r $16\r key:000000630274\r $128\r VXKeHogKgJ=[5V9_X^b?48OKF2jGA<f:iR@50o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r *3\r $3\r SET\r $16\r key:000000420097\r $128\r VXKeHogKgJ=[5V9_X^b?48OKF2jGA<f:iR@50o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r

The payload o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r is duplicated.

uriyage · 2025-01-08T19:11:20Z

fyi - I tested this code and I got this error. I did not see it with HEAD~1

Many thanks @xbasel for finding it.
I fixed the issue, it was a stop condition that was mistakenly removed.

tests/unit/networking.tcl

src/replication.c

src/networking.c

src/io_threads.c

src/unit/test_networking.c

src/networking.c

ranshid

@uriyage Please just resolve the conlicts. We will merge it right after the tests are passing

Signed-off-by: Uri Yagelnik <[email protected]>

uriyage · 2025-02-06T16:27:25Z

@uriyage Please just resolve the conlicts. We will merge it right after the tests are passing

@ranshid, Done.

Signed-off-by: Uri Yagelnik <[email protected]>

zuiderkwast · 2025-02-10T08:31:39Z

Use-after-free in unit test (ASAN & Valgrind), in freeReplicaReferencedReplBuffer.

https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927928217#step:10:147

uriyage · 2025-02-10T16:32:01Z

Use-after-free in unit test (ASAN & Valgrind), in freeReplicaReferencedReplBuffer.

https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927928217#step:10:147

Fixed in: https://github.com/valkey-io/valkey/pull/1697/files

This PR offloads the write to replica clients to IO threads. ## Main Changes * Replica writes will be offloaded but only after the replica is in online mode.. * Replica reads will still be done in the main thread to reduce complexity and because read traffic from replicas is negligible. ### Implementation Details In order to offload the writes, `writeToReplica` has been split into 2 parts: 1. The write itself made by the IO thread or by the main thread 2. The post write where we update the replication buffers refcount will be done in the main-thread after the write-job is done in the IO thread (similar to what we do with a regular client) ### Additional Changes * In `writeToReplica` we now use `writev` in case more than 1 buffer exists. * Changed client `nwritten` field to `ssize_t` since with a replica the `nwritten` can theoretically exceed `int` size (not subject to `NET_MAX_WRITES_PER_EVENT` limit). * Changed parsing code to use `memchr` instead of `strchr`: * During parsing command, ASAN got stuck for unknown reason when called to `strchr` to look for the next `\r` * Adding assert for null-terminated querybuf didn't resolve the issue. * Switched to `memchr` as it's more secure and resolves the issue ### Testing * Added integration tests * Added unit tests **Related issue:** valkey-io#761 --------- Signed-off-by: Uri Yagelnik <[email protected]>

uriyage force-pushed the replication-offload branch from 3aee49b to 846c816 Compare December 24, 2024 07:24

xbasel reviewed Jan 2, 2025

View reviewed changes

src/networking.c Show resolved Hide resolved

xbasel reviewed Jan 5, 2025

View reviewed changes

src/unit/test_networking.c Show resolved Hide resolved

xbasel reviewed Jan 6, 2025

View reviewed changes

arukiidou mentioned this pull request Jan 7, 2025

Followup items from #758 Async IO threads #761

Open

10 tasks

madolson requested review from madolson and ranshid February 3, 2025 15:56

madolson reviewed Feb 4, 2025

View reviewed changes

tests/unit/networking.tcl Outdated Show resolved Hide resolved

madolson reviewed Feb 4, 2025

View reviewed changes

src/replication.c Outdated Show resolved Hide resolved

madolson reviewed Feb 4, 2025

View reviewed changes

src/networking.c Outdated Show resolved Hide resolved

src/networking.c Show resolved Hide resolved

src/io_threads.c Show resolved Hide resolved

src/unit/test_networking.c Outdated Show resolved Hide resolved

src/networking.c Show resolved Hide resolved

madolson approved these changes Feb 5, 2025

View reviewed changes

ranshid approved these changes Feb 6, 2025

View reviewed changes

uriyage added 3 commits February 6, 2025 15:39

Support primary IO offload to IO threads

f1bbe3f

Signed-off-by: Uri Yagelnik <[email protected]>

Addressed PR comments

ded611a

Signed-off-by: Uri Yagelnik <[email protected]>

Offload replication writes to IO threads

0422552

Signed-off-by: Uri Yagelnik <[email protected]>

uriyage force-pushed the replication-offload branch from 76576ac to 2fb5d11 Compare February 6, 2025 16:26

uriyage added 4 commits February 9, 2025 09:36

fix incorrect stop condition

796626e

Signed-off-by: Uri Yagelnik <[email protected]>

Addressed PR comments

091399e

Signed-off-by: Uri Yagelnik <[email protected]>

Resolve merge conflicts

7399755

Signed-off-by: Uri Yagelnik <[email protected]>

Resolve unittests merge failures

46202aa

Signed-off-by: Uri Yagelnik <[email protected]>

uriyage force-pushed the replication-offload branch from 2fb5d11 to 46202aa Compare February 9, 2025 09:37

fix clang format

b445576

Signed-off-by: Uri Yagelnik <[email protected]>

ranshid merged commit de3672a into valkey-io:unstable Feb 9, 2025
50 checks passed

zuiderkwast mentioned this pull request Aug 15, 2025

Don't call SSL_write() with num=0 #2490

Merged

sundb mentioned this pull request Oct 5, 2025

Lookahead pre-fetching sundb/redis#413

Open

Offload replication writes to IO threads #1485

Offload replication writes to IO threads #1485

Uh oh!

Conversation

uriyage commented Dec 24, 2024 • edited by zuiderkwast Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Main Changes

Implementation Details

Additional Changes

Testing

Uh oh!

codecov bot commented Dec 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

xbasel left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

uriyage commented Jan 8, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ranshid left a comment

Choose a reason for hiding this comment

Uh oh!

uriyage commented Feb 6, 2025

Uh oh!

Uh oh!

zuiderkwast commented Feb 10, 2025

Uh oh!

uriyage commented Feb 10, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

uriyage commented Dec 24, 2024 •

edited by zuiderkwast

Loading

codecov bot commented Dec 24, 2024 •

edited

Loading

xbasel left a comment •

edited

Loading