Skip to content

Http caching & Conditional requests#831

Merged
XAMPPRocky merged 3 commits into
XAMPPRocky:mainfrom
pnevyk:http-caching
Dec 21, 2025
Merged

Http caching & Conditional requests#831
XAMPPRocky merged 3 commits into
XAMPPRocky:mainfrom
pnevyk:http-caching

Conversation

@pnevyk
Copy link
Copy Markdown
Contributor

@pnevyk pnevyk commented Dec 3, 2025

Context

This PR implements HTTP cache layer for the Octocrab client which uses conditional requests that do not count against the primary rate limit if a 304 Not Modified response is returned by GitHub API.

Example usage:

let octocrab = Octocrab::builder()
    .personal_token(token)
    // Any implementation of octocrab::service::middleware::cache::CacheStorage can be used
    .cache(octocrab::service::middleware::cache::mem::InMemoryCache::default())
    .build()?;

Implements #175.

Design

First and foremost, I am looking for feedback on the design. The implementation originates from my code which I hastily put together because I needed this feature for my use case and in-memory cache for good enough for me.

In #175 (comment) it is specified that this feature needs to be done in an IO-agnostic way and use a cache storage trait.

Main types are

pub enum CacheKey {
    ETag(String),
    LastModified(String),
}

pub trait CacheStorage: Send + Sync {
    fn try_hit(&self, uri: &Uri) -> Option<CacheKey>;
    fn load(&self, uri: &Uri) -> Option<CachedResponse>;
    fn writer(&self, uri: &Uri, key: CacheKey, headers: HeaderMap) -> Box<dyn CacheWriter>;
}

pub trait CacheWriter: Send + Sync {
    fn write_body(&mut self, data: &[u8]);
}
  • CacheStorage::try_hit – When a new request is made, the layer asks if there is an existing cache record (identified by either ETag or Last-Modified). If true, the corresponding cache header is added to the request.
  • CacheStorage::load – When a new response is received and is 304 Not Modified, then the cached response is loaded from the cache (assuming hit) and the response body is replaced with the cached body.
  • CacheStorage::writer – Returns a response Body wrapper that is supposed to write each received frame to the cache.

The CacheWriter::write_body is currently synchronous. I assume that this is not desired, but I don't have an experience with lower-level async code and I didn't know how to cleanly implement this:

impl<B> Body for WriteToCacheBody<B>
where
    B: Body<Data = Bytes, Error = crate::Error>,
{
    type Data = Bytes;
    type Error = crate::Error;

    fn poll_frame(
        self: Pin<&mut Self>,
        cx: &mut std::task::Context<'_>,
    ) -> Poll<Option<Result<Frame<Self::Data>, Self::Error>>> {
        let this = self.project();
        match this.inner.poll_frame(cx) {
            Poll::Ready(frame) => {
                if let Some(Ok(ref data)) = frame {
                    if let Some(data) = data.data_ref() {
                        // XXX: How to handle async state machine here???
                        this.writer.write_body(data);
                    }
                }

                Poll::Ready(frame)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

I am very much open to advice.

I also considered using the std::io::Write (or AsyncWrite) instead of custom CacheWriter, but I don't think that types already implementing this standard trait could be used because the writer likely needs to manage some cache-related state.

CacheStorage::load is also synchronous currently, but that one would be probably easy to make asynchronous to allow async loading from filesystem.

There is an ETag implementation in this library already. It's not used in my implementation, but maybe it should. My thought process was to always use the ETag value returned by response header as is in the request header.

Notes

I implemented only in-memory cache because it was the most straightforward. The original issue asks for filesystem storage. I am not sure if such implementation should be part of the library or should be implemented by the user. If it should be implemented in the library, it's not obvious to me what is the best implementation.

Only after my implementation I discovered the http-cache crate which might be used for the caching implementation but I am not sure how, because I am not really familiar with the ecosystem.

@dmgorsky
Copy link
Copy Markdown
Collaborator

could you please rebase on master @pnevyk

@pnevyk
Copy link
Copy Markdown
Contributor Author

pnevyk commented Dec 21, 2025

could you please rebase on master @pnevyk

@dmgorsky Done.

@XAMPPRocky
Copy link
Copy Markdown
Owner

Thank you for your PR!

@XAMPPRocky XAMPPRocky merged commit c08ad2e into XAMPPRocky:main Dec 21, 2025
10 checks passed
@github-actions github-actions Bot mentioned this pull request Dec 21, 2025
@pnevyk
Copy link
Copy Markdown
Contributor Author

pnevyk commented Dec 21, 2025

Thank you for your PR!

Do you think that the issues I mentioned in the PR description, in particular synchronous/blocking API of CacheWriter and only in-memory implementation, are fine? I also didn't include any test, just an example, because I wasn't sure if my implementation is good enough.

@XAMPPRocky
Copy link
Copy Markdown
Owner

I think they are fine enough to start with and if someone wants an async implementation, that can be made as a follow up.

I think in memory makes sense for what we can offer as we don't want to bless certain crates other people might want such as redis. A filesystem implementation would also be acceptable behind a feature flag but again that can be done in a follow up.

@pnevyk pnevyk deleted the http-caching branch December 21, 2025 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants