-
Notifications
You must be signed in to change notification settings - Fork 197
Return a single element while iterating over segments #523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
36adba5 to
a1732ea
Compare
d0170d0 to
4cd5320
Compare
This makes the usage of the segment cursor more natural:
```
cur = conn.cursor('segment')
cur.execute(sql)
spooled_result = cur.fetchall()
for spooled_data in spooled_result:
row_mapper = RowMapperFactory().create(columns=cur._query.columns, legacy_primitive_types=False)
rows = list(SegmentIterator(spooled_data, row_mapper))
```
4cd5320 to
c41a3d0
Compare
| legacy_primitive_types=self._legacy_primitive_types, | ||
| fetch_mode="segments") | ||
| self._iterator = iter(self._query.execute()) | ||
| self._iterator = map(lambda tuple: tuple[0], iter(self._query.execute())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this is a breaking change, but i agree it is a nicer API.
However, how can the client retrieve the encoding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the usage in the integration test
| for spooled_data, spooled_segment in rows: | ||
| segments = cur.fetchall() | ||
| assert len(segments) > 0 | ||
| row_mapper = trino.mapper.RowMapperFactory().create(columns=cur._query.columns, legacy_primitive_types=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change import to from trino.mapper import RowMapperFactory
| def __next__(self) -> Tuple["SpooledData", "Segment"]: | ||
| return self, next(self._segments_iterator) | ||
| segment = next(self._segments_iterator) | ||
| return SpooledData(self._encoding, [segment]), segment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not also change SpooledData to contain a single segment, rather than a list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because SpooledData can contain multiple segments. I think we should not reuse this object as the iterable object but return another class that has all the required fields. Let's get rid of Tuple["SpooledData", "Segment"].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, each response can contain multiple segment URLs, so we need the transfer object to reflect that, but the abstraction we expose to the client doesn't need to mirror that.
This makes the usage of the segment cursor more natural:
This allows for the usage that is intuitive to the user:
This will produce: