-
Notifications
You must be signed in to change notification settings - Fork 712
Introduce a SplitSequence in BufferedReader to split file content
#3405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial notes, @glbrntt will have more.
| extension BufferedReader.SplitSequence: Sendable where BufferedReader: Sendable {} | ||
|
|
||
| @available(macOS 10.15, iOS 13.0, watchOS 6.0, tvOS 13.0, *) | ||
| extension BufferedReader.SplitSequence.AsyncIterator: Sendable where BufferedReader: Sendable {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look wrong: I think they need to be when Handle: Sendable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emmm both are fine. BufferedReader being Sendable implies Handle being Sendable.
This type stores a BufferedReader, so my preference was to not concern it with Handle.
I can use Handle if you prefer that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely a problem that's worth tackling (you are by no means the first to run into this!) but I don't think this is the right approach.
In my mind the more general problem is having an AsyncSequence<ByteBuffer> from which you want decode a sequence of some other type, e.g. you want a AsyncSequence<Decoded>.
I think the appropriate way to do this would not be to add an extension to NIOFileSystem but instead to extend AsyncSequence where the element type is ByteBuffer in NIOCore. The base API could use a NIOSingleStepByteToMessageDecoder and then we could add helpers on top which setup an appropriate decoder.
This would mean having something like:
extension AsyncSequence where Element == ByteBuffer {
public func decode<Decoder: NIOSingleStepByteToMessageDecoder, Decoded>(
using decoder: Decoder
) -> SomeNewAsyncSequenceType<Decoder, Decoded> where Decoder.InboundOut == Decoded {
// ...
}
}For reading lines from a file you'd then use the readChunks API rather than the BufferedReader.
…ces (#3407) Previous PR: #3405 Add an API on top of `AsyncSequence<ByteBuffer>` which can dynamically decode values. ~~Add an API on top of the new `AsyncSequence<ByteBuffer>` APIs which splits the file based on its content.~~ ### Motivation: Provides a nice API to decode files, instead of users having to go though manually handling `BufferedReader.read(while:)`. I struggled with this, as documented in https://swift-open-source.slack.com/archives/C9MMT6VGB/p1760115481607159 ### Modifications: Add `NIODecodedAsyncSequence` + functions on `AsyncSequence<ByteBuffer>` to create such a sequence. ~~Add `NIOSplitMessageDecoder` + stdlib-like functions on `AsyncSequence<ByteBuffer>` to create such a sequence.~~ ### Result: Users can decode an async sequence of `ByteBuffer`s easier. ~~Users can easily split files based on their content.~~ ### Checklist See this comment for a checklist of the remaining things to do: #3407 (comment)
Add an API on top of
BufferedReaderwhich splits the file based on its content.Motivation:
Provides a nice API, instead of users having to go though manually handling
BufferedReader.read(while:).I struggled with this, as documented in https://swift-open-source.slack.com/archives/C9MMT6VGB/p1760115481607159
Modifications:
Add
BufferedReader.SplitSequence+ stdlib-like functions onBufferedReaderto create such a sequence.Result:
Users can easily split files based on their content.