-
Notifications
You must be signed in to change notification settings - Fork 142
Query paging API: hardening, robustness, explicitness #1061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
See the following report for details: cargo semver-checks output |
0966edf to
76b1fae
Compare
009283e to
ee73b22
Compare
muzarski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, otherwise LGTM
ee73b22 to
a7aa56f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the first commit. Using invalid values for page size was never shown to be a problem / footgun for our users, so advantages from strong typing are quite limited. Otoh it introduces unnecessary complications in each call to set_page_size.
Imo in case of such obscure error, which also implies a bug in user code, a panic wuld be better - we still catch the error, but don't introduce complications for users.
So let's drop PageSize type and perform panicking check in set_page_size methods.
PR description is out of date - it mentions PagingContinuation.
In checklist you marked that All commits compile, pass static checks and pass test. - is that really the case? I would expect commit session: make {query,execute} methods unpaged to break something if it really prevents usage of unpaged queries. If it doesn't break any test then it sounds like our tests are lacking.
I suggest a slightly different thing: let's make |
Good catch. Let's see if any test breaks. |
Makes sense |
Sad news... everything has passed. On a second thought, it's not strange that we haven't been testing improper behaviour. After all, |
a7aa56f to
6248a1c
Compare
|
v1.1: addressed all Karol's comments apart from PageSize panics, which is TODO. |
6248a1c to
0fa3118
Compare
f5b1839 to
a139ca5
Compare
|
v 1.2.1: mentioned that setters panic with non-positive numbers given. |
Before, a bare i32 was passed. Some API methods would accept nonpositive integers, while others would panic on such input. Both those kinds of behaviour are bad, and the inconsistency between them is even worse. For robustness the popular Rust tactics is employed, namely strong typing. PageSize's public constructor returns error for nonpositive integers, ensuring that is an instance of that type exists, its validity is guaranteed. I originally wanted to put burden of creating a correct PageSize and handle possible errors on user, but is would introduce a boilerplate (`try_into().unwrap()`) that is deemed to be unacceptable. Therefore, PageSize is kept pub(crate) and it's constructed by driver's code, panicking on invalid user's input. As page size is always chosen by user explicitly (not by some convoluted logic), such panics should appear early and be clear to users.
The value of everything being empty does not seem to be a reasonable default value for both QueryResult and ResultMetadata. It's better to have a mock_empty() constructor for the specific case when such special value is needed.
As an effort of more robust paging API, paging state is now made typed. Two new types are introduced: PagingState and PagingStateResponse. PagingState is, underneath, just an Arc over the raw bytes from the DB. The idea is that PagingState is passed as an argument to request execution API to ask for resuming the query (_continue it_) from the given PagingState. PagingStateResponse is returned always by the DB from paged requests - with information about either more or no more pages available. PagingState can be easily retrieved from PagingStateResponse (if it contains some), so that user can use it in a loop. The select_paging example is modernised to showcase the idea.
As users' experience showed that it optional page size on statements is error-prone, it is made mandatory. This means that at the moment no unpaged queries are possible. However, in the next commits it is going to be brought back again.
Some methods on Connection return QueryResponse instead of QueryResult. To make those methods stand out, they get "_raw" particle put into their names.
Without a clear reason, Connection::execute_raw would take PreparedStatement by value, involving a clone. The signature was changed to accept a shared reference and its usages are adjusted.
These methods are analogous to `Session`'s `{query,execute}`. Similarly,
they don't accept a non-start PagingState.
The logic of {query,execute} is extracted to {query_inner,execute_inner}
pub(crate) methods, respectively. {query,execute}_paged are unchanged.
{query,execute} now are unpaged unconditionally, i.e. they ignore the
page size set on a statement and pass None to the connection layer.
In the next commit, both {query,execute} are appended the `_unpaged`
suffix to explicitly state that they perform unpaged queries.
In order to make it explicit that the requests using those methods are performed without paging, their names are adjusted.
As these methods are analogous to those on Session, they are made unpaged in a manner similar to those. Inthe next commit, both all of them are appended the `_unpaged` suffix to explicitly state that they perform unpaged queries.
In order to make it explicit that the requests using those methods are performed without paging, their names are adjusted.
In no case for internal queries should we fetch only one page and ignore possible further ones. As both queries that were using query_single_page would return only one row anyway, the change should not affect semantics after all.
The name %_paged has long confused users. The new name indicates explicitly that only a single page is fetched with a single call to those methods.
As an attempt to make paged queries' API more explicit, robust and
self-explanatory, PagingState is decoupled from QueryResult. Instead,
it's returned from {query,execute}_single_page as a second field of a
pair, helping users not to forget about it.
Moreover, internal and external APIs which expect that the query is not
paged now issue an error tracing message and return ProtocolError error
if a "more pages" page state is returned from the unpaged query.
It appears that query_single_page's previous semantics was to only fetch the first page and ignore others. It makes much more sense, though, to have its semantics consistent with Session::query_single_page: to take PagingState and query a single page pointed by that state. This allows using it in a loop, similarly to Session::query_single_page.
As {query,execute} methods got the _unpaged prefix, now the inner
methods can drop the _inner prefix. This gives us one subtle advantage:
users now see "method {query,execute} is private" instead of
"{query,execute} method not found", which makes them more likely to
think that the API was changed on purpose. Assuming that users see
docstrings of private methods when they hover a call to them, this also
lets them read the new docstrings of {query,execute} that explain the
reasons behind the change and point to proper public methods:
{query,execute}_{unpaged,single_page,iter}.
After the API changes regarding paging, docstrings are updated with those changes in mind.
a139ca5 to
a1286de
Compare
|
v1.2.2: fully removed |
Motivation
Query paging API has shown to be unclear for users:
{query,execute}_pagedhad misleading name (query_paged is a misleading name #940),paging_statewas a magicBytesinstance (Make paging state strongly typed #987),{query,execute}would silently return only the first page if page size was specified on a statement,{query,execute}as the methods with the simplest names, even though they should be used with SELECTs with caution:paging_state, as it was just a field onQueryResult,{query,execute}_paged.This PR seeks to address those problems before we attack the Giant Request Execution Refactor™ (GRER™) (#978).
What's done
The core idea incorporated:
Shift responsibility about whether to page or not from statements (by setting
page_sizetoSomeorNone) toSession'smethods.Separate paged and unpaged API promotes conscious choice by users and thus prevents API misuse.
Detailed changes
PageSizeis introduced, which restricts the inneri32's range to positive integers. This is important, because passing nonpositive integer as a page size silently makes the query unpaged. All APIs now acceptPageSize, and it's user's responsibility to create aPageSize, handling possible errors thrown upon its construction.PagingStateandPagingStateResponse.PagingStateis, underneath, just anOption<Arc>over the raw bytes from the DB. The idea is thatPagingStateis passed as an argument to request execution API to ask for resuming the query (continuing it) from the point specified by givenPagingState.PagingStateResponseis returned always by the DB from paged requests - with information about either more or no more pages being available.PagingStateResponseis easily convertible intoPagingState, so that user can use it in in a loop.PageSizeis made mandatory on statements. Henceforth, paged API will always use it and unpaged API will always ignore it.{query,execute}methods have_unpagedprefix appended and they always perform an unpaged query (ignoring thePageSizespecified on a statement).{query,execute}_pagedare renamed to{query,execute}_single_page. The name%_pagedhas long confused users. The new name indicates explicitly that only a single page is fetched with a single call to those methods.{query,execute}_single_pagepass page state explicitly. As an attempt to make paged queries' API more explicit, robust and self-explanatory,PagingStateResponseis decoupled fromQueryResult. Instead, it's returned from{query,execute}_single_pageas a second field of a pair, helping users not to forget about it.Moreover, internal and external APIs which expect that the query is not paged now warn if a "more pages" page state is returned from a query.
select-pagingexample is modified to showcase the suggested way to use{query,execute}_single_page.Fixes: #940
Fixes: #987
Pre-review checklist
[ ] I added relevant tests for new features and bug fixes../docs/source/.Fixes:annotations to PR description.