Merged
goodboy (Contributor, Author) commented Oct 27, 2022:
```python
    datetime,  # start
    datetime,  # end
]:
    if timeframe != 60:
```
So this is how we indicate that a brokerd can't deliver 1s OHLC.
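A minimal sketch of that convention: the backend's history client raises `DataUnavailable` for any sample period it can't serve. `DataUnavailable` is piker's real exception name per this thread; everything else here (`get_hist`, the return value) is an illustrative stand-in, not piker's actual API.

```python
class DataUnavailable(Exception):
    '''Signal to the history mgmt layer that a sample rate
    (or date range) can't be served by this backend.
    '''

def get_hist(timeframe: int) -> str:
    # this hypothetical backend only has 1m (60s) OHLC history,
    # so any other requested period is refused up front.
    if timeframe != 60:
        raise DataUnavailable(f'no {timeframe}s OHLC on this backend')
    return 'frame-of-1m-bars'
```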
goodboy (Contributor, Author) commented Oct 27, 2022:
```python
to_prepend = ohlcv[ohlcv['time'] < ts['Epoch'][0]]

profiler('Finished db arrays diffs')
# for secs in (1, 60):
```
Yeah just ignore most of this stuff since it's part of provisional tools we likely want for interactive tsdb mucking.
It relies on goodboy/tractor#306 which is still far from ready 😂
goodboy (Contributor, Author) commented Oct 27, 2022:
```python
godwidget.resize_all()

await link_views_with_region(
```
this factoring was mostly for sanity and because i don't even really get what layer this whole subsystem fits in 😂
goodboy (Contributor, Author):
Just pushed a bad timeframe wiper hack @guilledk
Allow data feed sub-system to specify the timeframe (aka OHLC sample period) to the `open_history_client()` delivered history fetching API. Factor the data keycombo hack into a new routine to be used also from the history backfiller code when request latency increases; there is a first draft at trying to use the feed reset to speed up 1m frame throttling by timing out on the history frame response, but it needs a lot of fine tuning.
`Store.load()`, `.read_ohlcv()`, `.write_ohlcv()` and `.delete_ts()` can now take a `timeframe: Optional[float]` param which is used to look up the appropriate sampling period table-key from `marketstore`.
Adjust all history query machinery to pass a `timeframe: int` in seconds and set a default of 60 (aka 1m) such that history views from here forward will be 1m sampled OHLCV. Further, when the tsdb is detected as up, load a full 10 years of data if possible on the 1m - backends will eventually get a config section (`brokers.toml`) that allows users to tune this.
Manual tinker-testing demonstrated that triggering data resets completely independently of the frame request gets more throughput, and further that repeated requests (for the same frame after cancelling on the `trio`-side) can yield duplicate frame responses. Re-work the dual-task structure to instead have one task wait indefinitely on the frame response (and thus not trigger duplicate frames) and a 2nd data reset task poll for the first task to complete in a poll loop which terminates when the frame arrives via an event.

Dirty deatz:
- make `get_bars()` take an optional timeout (which will eventually be dynamically passed from the history mgmt machinery) and move request logic inside a new `query()` closure meant to be spawned in a task which sets an event on frame arrival; add a data reset poll loop in the main/parent task and deliver the result on nursery completion.
- handle the frame-request-cancelled event case without crash.
- on a no-frame result (due to a real history gap) hack in a 1 day decrement case which we need to eventually allow the caller to control, likely based on measured frame rx latency.
- make `wait_on_data_reset()` a predicate whose output indicates reset success as well as `trio.Nursery.start()` compat so that it can be started in a new task with the started values yielded being a cancel scope and completion event.
- drop the legacy `backfill_bars()`, no longer used.
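The re-worked dual-task shape can be sketched as below - asyncio stands in for `trio` to keep the example stdlib-only, and all names and latencies (`fetch_frame`, `reset_poller`, the sleeps) are illustrative, not piker's actual API:

```python
import asyncio

async def fetch_frame(frame_ready: asyncio.Event, out: dict) -> None:
    # lone waiter on the (possibly slow) frame response: it never
    # re-issues the request, so no duplicate frames can come back.
    await asyncio.sleep(0.05)  # stand-in for the actual history query
    out['frame'] = [1, 2, 3]
    frame_ready.set()

async def reset_poller(
    frame_ready: asyncio.Event,
    timeout: float,
    resets: list,
) -> None:
    # poll for frame arrival; on each timeout trigger a data-feed
    # reset (just recorded here) instead of re-requesting the frame.
    while not frame_ready.is_set():
        try:
            await asyncio.wait_for(frame_ready.wait(), timeout)
        except asyncio.TimeoutError:
            resets.append('reset!')

async def get_bars() -> tuple[list, list]:
    out: dict = {}
    resets: list = []
    frame_ready = asyncio.Event()
    await asyncio.gather(
        fetch_frame(frame_ready, out),
        reset_poller(frame_ready, timeout=0.01, resets=resets),
    )
    return out['frame'], resets
```

Because only one task ever awaits the frame, repeated resets can't produce duplicate frame responses; the poller simply exits once the event fires.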
It doesn't seem to be any slower on our least throttled backend (binance) and it removes a bunch of hard-to-get-correct frame re-ordering logic that i'm not sure ever really fully worked XD Commented some issues we still need to resolve as well.
When we get a timeout or a `NoData` condition still return a tuple of empty sequences instead of `None` from `Client.bars()`. Move the sampling period-duration table to module level.
This allows the history manager to know the decrement size for `end_dt: datetime` on the next query if a no-data / gap case was encountered; subtract this in `get_bars()` in such cases. Define the expected `pendulum.Duration`s in the `.api._samplings` table. Also add a bit of query latency profiling that we may use later to more dynamically determine timeout driven data feed resets. Factor the `162` error cases into a common exception handler block.
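The table-plus-decrement idea can be sketched with stdlib `timedelta` (the real `.api._samplings` table maps to `pendulum.Duration`s, and the frame durations below are illustrative, not ib's actual per-query frame sizes):

```python
from datetime import datetime, timedelta

# stand-in for the ib backend's `.api._samplings` table: maps an
# OHLC sample period (in seconds) to the expected frame duration so
# the history manager knows how far to step `end_dt` back on a
# no-data / gap result.
_samplings: dict[int, timedelta] = {
    1: timedelta(days=1),   # 1s bars
    60: timedelta(days=6),  # 1m bars
}

def next_end_dt(end_dt: datetime, timeframe: int) -> datetime:
    '''Decrement the query window after a no-data/gap result so the
    next request asks for the preceding frame.
    '''
    return end_dt - _samplings[timeframe]
```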
Must have gotten left in during refactor from the `trimeter` version? Drop down to 6 years for 1m sampling.
Allows for easier restarts of certain `trio` side tasks without killing the `asyncio`-side clients; support via flag. Also fix a bug in `Client.bars()`: we need to return the duration on the empty bars case..
When a network outage or data feed connection is reset, often the `ib_insync` task will hang until some kind of (internal?) timeout takes place or, in some (worst) cases, it never re-establishes (the event stream) and thus the backend needs to restart or the live feed will never resume.. In order to avoid this issue once and for all, this patch implements an additional (extremely simple) task that is started with the real-time feed and simply waits for any market data reset events; when detected, it restarts the `open_aio_quote_stream()` call in a loop using a surrounding cancel scope. Been meaning to implement this for ages and it's finally working!
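The watchdog-restart pattern can be sketched as follows - asyncio task cancellation stands in for `trio`'s cancel scope, and every name here (`open_quote_stream`, `watchdog`, `run_feed`) is an illustrative stand-in rather than piker's real code:

```python
import asyncio

async def open_quote_stream(quotes: list) -> None:
    # stand-in for `open_aio_quote_stream()`: push quotes forever.
    n = 0
    while True:
        quotes.append(f'quote-{n}')
        n += 1
        await asyncio.sleep(0.001)

async def watchdog(reset: asyncio.Event, stream: asyncio.Task) -> None:
    # wait for any market-data reset event, then cancel the current
    # stream so the surrounding loop re-establishes it.
    await reset.wait()
    stream.cancel()

async def run_feed(restarts_wanted: int) -> int:
    quotes: list = []
    restarts = 0
    while restarts < restarts_wanted:
        reset = asyncio.Event()
        stream = asyncio.create_task(open_quote_stream(quotes))
        dog = asyncio.create_task(watchdog(reset, stream))
        await asyncio.sleep(0.003)  # simulate an upstream reset arriving
        reset.set()
        try:
            await stream
        except asyncio.CancelledError:
            restarts += 1  # loop around and reconnect the stream
        await dog
    return restarts
```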
Allows keeping mutex state around data reset requests which (if more than one is sent) can cause a throttling condition where ib's servers will get slower and slower to conduct a reconnect. With this you can have multiple ongoing contract requests without hitting that issue and we can go back to having a nice 3s timeout on the history queries before activating the hack.
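A minimal sketch of that mutex state, assuming the goal is to drop (not queue) overlapping reset requests; the class and method names are hypothetical:

```python
import threading

class DataResetGuard:
    '''Serialize data-feed reset requests: if a reset is already in
    flight, a second request is dropped instead of queued, since
    overlapping resets make ib's servers progressively slower to
    reconnect.
    '''
    def __init__(self) -> None:
        self._lock = threading.Lock()

    def begin_reset(self) -> bool:
        # non-blocking acquire: a caller that gets `False` should
        # simply skip issuing its (duplicate) reset request.
        return self._lock.acquire(blocking=False)

    def end_reset(self) -> None:
        # called once the reset (e.g. the keycombo hack) completes.
        self._lock.release()
```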
Our default sample periods are 60s (1m) for the history chart and 1s for
the fast chart. This patch adds concurrent loading of both (or more)
different sample period data sets using the existing loading code but
with new support for looping through a passed "timeframe" table which
points to each shm instance.
More detailed adjustments include:
- breaking the "basic" and tsdb loading into 2 new funcs:
`basic_backfill()` and `tsdb_backfill()` the latter of which is run
when the tsdb daemon is discovered.
- adjust the fast shm buffer to offset with one day's worth of 1s so
that only up to a day is backfilled as history in the fast chart.
- adjust bus task starting in `manage_history()` to deliver back the
offset indices for both fast and slow shms and set them on the
`Feed` object as `.izero_hist/rt: int` values:
- allows the chart-UI linked view region handlers to use the offsets
in the view-linking-transform math to index-align the history and
fast chart.
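The "timeframe table" idea above can be sketched as one shm buffer per sample period, backfilled by looping over the table; `FakeShm` and `backfill_all` are toy stand-ins for piker's shared-mem wrapper and loader:

```python
class FakeShm:
    '''Toy stand-in for piker's shared-mem OHLC array wrapper.'''
    def __init__(self) -> None:
        self.array: list[tuple[float, float]] = []  # (epoch, close)

    def push(self, frame: list[tuple[float, float]]) -> None:
        self.array.extend(frame)

def backfill_all(
    timeframes: dict[int, FakeShm],
    fetch,  # callable(timeframe: int) -> frame
) -> None:
    # the real loader runs these per-timeframe backfills concurrently
    # under a task nursery; a plain loop keeps the sketch simple.
    for tf, shm in timeframes.items():
        shm.push(fetch(tf))
```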
Turns out querying for a high freq timeframe (like 1sec) will still return a lower freq timeframe (like 1Min) SMH, and no idea if it's the server's or the client's fault, so we have to explicitly check the sample step size and discard lower freq series-results. Do this inside `Storage.read_ohlcv()` and return an empty `dict` when the wrong time step is detected from the query result.

Further enforcements:
- both `.load()` and `read_ohlcv()` now require an explicit `timeframe: int` input to guarantee the time step of the output array.
- drop all calls to `.load()` with non-timeframe specific input.
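The step-size check can be sketched on a plain list of epoch stamps (the real check runs inside `Storage.read_ohlcv()` against the query result's time column and returns an empty `dict`; the list-based helper here is a simplification):

```python
def enforce_timeframe(
    epochs: list[float],
    timeframe: int,
) -> list[float]:
    '''Discard a series whose actual sample step doesn't match the
    requested `timeframe` (in seconds), e.g. 1Min rows handed back
    for a 1Sec query.
    '''
    if len(epochs) > 1 and epochs[1] - epochs[0] != timeframe:
        return []  # wrong time step: treat as an empty result
    return epochs
```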
If a history manager raises a `DataUnavailable`, just assume the sample rate isn't supported and that no shm prepends will be done. Further, seed the shm array in such cases as before from the 1m history's last datum. Also fix tsdb -> shm back-loading, cancelling tsdb queries when either no array-data is returned or a frame is delivered which has a start time no lesser than the least last retrieved. Use strict timeframes for every `Storage` API call.
Factor the multi-sample-rate region UI connecting into a new helper `link_views_with_region()` which reads in the shm buffer offsets from the `Feed` and appropriately connects the fast and slow chart handlers for the linear region graphics. Add detailed comments writeup for the inter-sampling transform algebra.
Not only improves startup latency but also avoids a bug where the rt buffer was being tsdb-history prepended *before* the backfilling of recent data from the backend was complete, resulting in out of order frames in shm.
There never was any underlying db bug, it was a hardcoded timeframe in the column series write key.. Now we always assert a matching timeframe in results.
To make it easier to manually read/decipher long ledger files this adds `dict` sorting based on record-type-specific (api vs. flex report) datetime processing prior to ledger file write.
- break up parsers into separate routines for flex and api record processing.
- add `parse_flex_dt()` for special handling of the weird semicolon stamps in flex reports.
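A sketch of what `parse_flex_dt()` might look like, assuming (this format is an assumption, not confirmed by the thread) that the flex stamps look like `YYYYMMDD;HHMMSS`:

```python
from datetime import datetime

def parse_flex_dt(stamp: str) -> datetime:
    '''Parse the semicolon-separated datetime stamps found in ib flex
    reports, assumed here to be of the form ``YYYYMMDD;HHMMSS``.
    '''
    date, _, time = stamp.partition(';')
    return datetime.strptime(date + time, '%Y%m%d%H%M%S')
```

Records can then be `dict`-sorted by the parsed stamp before the ledger file write (the exact record field carrying the stamp varies by report type).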
guilledk approved these changes Oct 29, 2022
~~Super WIP~~ Ready for action, but just a start at doing a 1m long term (slow) chart alongside our classic 1s OHLC in the fast chart.

This adjusts our history loading in the data feed layer (`piker.data.feed`) to do multi-time-frame data loading concurrently and in a highly reliable manner such that both can be stored in the tsdb as well as explicitly queried, loaded and processed in shared mem arrays.

`ib` related

Since `ib` is the only currently supported backend with 1s OHLC history, this patch focuses around it but contains necessary adjustments to handle backends (like all the crypto$) which don't have this support (at least not without us writing our own sampler). When a backend doesn't have 1s OHLC history the fast chart simply starts empty and starts filling when the `brokerd` feed is first booted - during the `pikerd` parent's lifetime.

Further enhancements in this backend include:
- … `marketstore` in like, <= 2mins 🥳
- … `ib_insync` … `ib`'s api does..
- … `datetime` from `Client.get_head_time()` with an `fqsn` input `str`, and use this stamp as the earliest stamp allowed before raising `DataUnavailable` to the history mgmt layer
- … x-queries threshold where after 6 days worth of empty frame-results we presume the contract has no earlier history and we also raise a `DataUnavailable`

The summary of enhancements and bug fixes is more or less in the todo section below:
TODO: