@yesmey
Contributor

@yesmey yesmey commented Sep 14, 2022

Refactored StreamReader ReadLine/ReadLineAsync to use Span.IndexOfAny

dotnet/performance benchmarks
BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.2006/21H2/November2021Update)
AMD Ryzen 7 3700X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.100-preview.7.22377.5
  [Host]     : .NET 7.0.0 (7.0.22.37506), X64 RyuJIT AVX2
  Job-QTLAVP : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-OYFQVY : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

| Method | Branch | LineLengthRange | Mean | Ratio |
|---|---|---|---:|---:|
| ReadLine | main | [ 0, 0] | 46.832 μs | 1.00 |
| ReadLine | PR | [ 0, 0] | 93.316 μs | 1.99 |
| ReadLineAsync | main | [ 0, 0] | 257.142 μs | 1.00 |
| ReadLineAsync | PR | [ 0, 0] | 279.500 μs | 1.09 |
| ReadLine | main | [ 0, 1024] | 13.952 μs | 1.00 |
| ReadLine | PR | [ 0, 1024] | 6.018 μs | 0.43 |
| ReadLineAsync | main | [ 0, 1024] | 19.032 μs | 1.00 |
| ReadLineAsync | PR | [ 0, 1024] | 7.964 μs | 0.42 |
| ReadLine | main | [ 1, 1] | 64.128 μs | 1.00 |
| ReadLine | PR | [ 1, 1] | 91.517 μs | 1.43 |
| ReadLineAsync | main | [ 1, 1] | 176.924 μs | 1.00 |
| ReadLineAsync | PR | [ 1, 1] | 224.320 μs | 1.27 |
| ReadLine | main | [ 1, 8] | 54.430 μs | 1.00 |
| ReadLine | PR | [ 1, 8] | 60.738 μs | 1.12 |
| ReadLineAsync | main | [ 1, 8] | 123.299 μs | 1.00 |
| ReadLineAsync | PR | [ 1, 8] | 128.427 μs | 1.04 |
| ReadLine | main | [ 9, 32] | 26.354 μs | 1.00 |
| ReadLine | PR | [ 9, 32] | 19.343 μs | 0.73 |
| ReadLineAsync | main | [ 9, 32] | 47.567 μs | 1.00 |
| ReadLineAsync | PR | [ 9, 32] | 42.869 μs | 0.90 |
| ReadLine | main | [ 33, 128] | 19.228 μs | 1.00 |
| ReadLine | PR | [ 33, 128] | 8.441 μs | 0.43 |
| ReadLineAsync | main | [ 33, 128] | 25.486 μs | 1.00 |
| ReadLineAsync | PR | [ 33, 128] | 16.054 μs | 0.63 |
| ReadLine | main | [ 129, 1024] | 13.449 μs | 1.00 |
| ReadLine | PR | [ 129, 1024] | 5.831 μs | 0.43 |
| ReadLineAsync | main | [ 129, 1024] | 19.295 μs | 1.00 |
| ReadLineAsync | PR | [ 129, 1024] | 7.740 μs | 0.40 |
| ReadLine | main | [1025, 2048] | 15.227 μs | 1.00 |
| ReadLine | PR | [1025, 2048] | 6.938 μs | 0.46 |
| ReadLineAsync | main | [1025, 2048] | 19.946 μs | 1.00 |
| ReadLineAsync | PR | [1025, 2048] | 8.205 μs | 0.41 |

Source

There are regressions in the benchmarks where line endings sit very close together (e.g. [0,0], which is "\r\n\r\n" repeated), because IndexOfAny will almost always try to use SIMD, since the input buffer is typically large enough to fill a vector.
I don't know if it's a very common scenario, though.
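
For readers who want to see the shape of the technique, here is a minimal sketch of line splitting built on `IndexOfAny`. It is not the code from this PR: `LineSplitSketch` and `SplitLines` are hypothetical names, and the sketch works on an in-memory span rather than on StreamReader's internal buffer, so it ignores refilling the buffer and a "\r\n" pair split across two reads.

```csharp
using System;
using System.Collections.Generic;

public static class LineSplitSketch
{
    // Hypothetical helper (an assumption for illustration, not the PR's code).
    // Splits a buffer into lines, treating "\r", "\n", and "\r\n" as terminators.
    public static List<string> SplitLines(ReadOnlySpan<char> buffer)
    {
        var lines = new List<string>();
        while (!buffer.IsEmpty)
        {
            // Search for the next terminator character in one call.
            int idx = buffer.IndexOfAny('\r', '\n');
            if (idx < 0)
            {
                // No terminator left: the remainder is the final line.
                lines.Add(buffer.ToString());
                break;
            }

            lines.Add(buffer.Slice(0, idx).ToString());

            // Consume the terminator; "\r\n" counts as a single line ending.
            bool isCrLf = buffer[idx] == '\r'
                && idx + 1 < buffer.Length
                && buffer[idx + 1] == '\n';
            buffer = buffer.Slice(idx + (isCrLf ? 2 : 1));
        }
        return lines;
    }
}
```

With input such as "\r\n\r\n", every `IndexOfAny` call in this loop finds its match at index 0, so any fixed per-call cost is paid once per line with almost no scanning to amortize it. That would be consistent with the [0,0] and [1,1] rows above, although the sketch itself does not measure anything.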

@ghost added the area-System.IO and community-contribution labels Sep 14, 2022
@ghost

ghost commented Sep 14, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

@stephentoub
Member

Thank you. There's already a PR doing this, though:
#69888

@danmoseley
Member

cc @GrabYourPitchforks

@yesmey
Contributor Author

yesmey commented Sep 15, 2022

Can't believe I missed it. I clearly need to get better at going over open pull requests!
No worries though. Feel free to close this in favor of #69888; @GrabYourPitchforks can check whether there's anything useful to bring over when he has time to get back to it.

@stephentoub
Member

> Can't believe I missed it. I clearly need to get better at going over open pull requests!

No worries at all. We're striving to do better at reducing our open PR count.

> Feel free to close this in favor of #69888

Ok. Thanks, again.

@danmoseley
Member

@yesmey I know @GrabYourPitchforks is busy with another project currently. If you're interested in looking at the other PR to help nudge it along, that would most probably be welcome.

@ghost locked as resolved and limited conversation to collaborators Oct 15, 2022
@yesmey deleted the streamreadline branch May 9, 2023 18:11