
Update libp2p_kad::store::RecordStore trait to be amenable to a persistent implementation #4817

@nathanielc

Description


The current design of the libp2p_kad::store::RecordStore trait makes it difficult to implement a persistent backend, for three reasons:

  • An Instant is used in the record types, which prevents serialization/deserialization to a persistent store.
  • The API assumes that all data fits easily in memory, i.e. it has no way to batch or otherwise partition the set of records.
  • The API is not async: it follows neither a poll model nor an async/await model that would allow efficient system IO against a persistent store.

Instant Serialization

Specifically, the ProviderRecord and Record types contain an Instant, which by design cannot be serialized or deserialized.

I suggest we change the time type from Instant to SystemTime. The trade-off is that SystemTime is not guaranteed to be monotonic: the system clock can be modified, so a time that was expected to be in the future may no longer be. However, a SystemTime can be serialized/deserialized (e.g. as seconds since the Unix epoch). The time scales involved in record expiration are typically hours; at that scale it is uncommon to see a SystemTime move non-monotonically.
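As a sketch of why SystemTime is amenable to persistence, the round-trip through seconds since the Unix epoch is straightforward. The helper names below are hypothetical, not part of libp2p_kad:

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical helper: encode an expiry time as seconds since the
/// Unix epoch so it can be written to a persistent store.
fn expiry_to_secs(expiry: SystemTime) -> u64 {
    expiry
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        // An expiry before the epoch is clamped to 0; records that old
        // are expired in any case.
        .unwrap_or(0)
}

/// Hypothetical helper: reconstruct the expiry time read back from the store.
fn expiry_from_secs(secs: u64) -> SystemTime {
    UNIX_EPOCH + Duration::from_secs(secs)
}

fn main() {
    let expiry = UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    let stored = expiry_to_secs(expiry);
    assert_eq!(expiry_from_secs(stored), expiry);
}
```

No equivalent round-trip exists for Instant, whose epoch is process-local by design.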

Memory Pressure

The provided method produces an iterator over all entries in the store. Without a mechanism to paginate or resume from a cursor, the iterator may block other concurrent requests to the underlying store (e.g. SQLite).
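A cursor-based variant could hold the underlying store only long enough to copy out one page. The shape below is purely illustrative; neither `provided_page` nor `Page` exists in libp2p_kad, and a String stands in for ProviderRecord:

```rust
/// One page of records plus an opaque cursor to resume from.
struct Page {
    records: Vec<String>, // stand-in for Vec<ProviderRecord>
    next: Option<usize>,  // None once the store is exhausted
}

struct Store {
    records: Vec<String>,
}

impl Store {
    /// Hypothetical paginated alternative to `provided`: each call
    /// touches the backing store briefly instead of pinning it for
    /// the lifetime of a full-scan iterator.
    fn provided_page(&self, cursor: usize, limit: usize) -> Page {
        let end = (cursor + limit).min(self.records.len());
        Page {
            records: self.records[cursor..end].to_vec(),
            next: (end < self.records.len()).then_some(end),
        }
    }
}

fn main() {
    let store = Store {
        records: (0..5).map(|i| format!("r{i}")).collect(),
    };
    // A consumer drives the cursor until the store reports exhaustion.
    let mut cursor = Some(0);
    let mut seen = Vec::new();
    while let Some(c) = cursor {
        let page = store.provided_page(c, 2);
        seen.extend(page.records);
        cursor = page.next;
    }
    assert_eq!(seen.len(), 5);
}
```

With a real persistent backend the cursor would map onto something like an indexed key range, so each page is an independent short-lived query.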

Async API

It is not clear from the trait API or docs how, or whether, async system IO can be performed efficiently by an implementer. Are methods called concurrently? If system IO blocks the current thread, can that deadlock the calling code? A persistent implementation needs answers to these questions.

Motivation

We have a use case where we store on the order of 100K - 10M provider records. Holding all of this data in memory is inefficient. Additionally, we need the set of records to persist across restarts of the process.

Based on the design of the trait we have a few choices:

  • Persist the data in a separate store and populate the memory store on startup. This has the challenge of keeping the two stores in sync and does not address the memory pressure.
  • Implement the RecordStore trait on top of a persistent store, hacking around the Instant serialization problem and blocking the current thread on system IO. It is not clear what the performance impact of such a design would be.

Current Implementation

The current implementation has one other limitation. While the records and provided methods return an iterator over the data, the iterator is immediately cloned/collected into a heap-allocated vector. This means we would need to update not only the trait API but also the consuming code to make it memory efficient.

Are you planning to do it yourself in a pull request?

Maybe
