2 changes: 2 additions & 0 deletions Cargo.toml
@@ -7,6 +7,7 @@ members = [
     "data-pdf",
     "data-resource",
     "fs-atomic-versions",
+    "fs-cache",
     "fs-atomic-light",
     "fs-metadata",
     "fs-properties",
@@ -23,6 +24,7 @@ default-members = [
     "data-pdf",
     "data-resource",
     "fs-atomic-versions",
+    "fs-cache",
     "fs-atomic-light",
     "fs-metadata",
     "fs-properties",
7 changes: 3 additions & 4 deletions ark-cli/src/util.rs
@@ -179,10 +179,9 @@ pub fn translate_storage(
     root: &Option<PathBuf>,
     storage: &str,
 ) -> Option<(PathBuf, Option<StorageType>)> {
-    if let Ok(path) = PathBuf::from_str(storage) {
-        if path.exists() && path.is_dir() {
-            return Some((path, None));
-        }
+    let Ok(path) = PathBuf::from_str(storage);
+    if path.exists() && path.is_dir() {
+        return Some((path, None));
     }
 
     match storage.to_lowercase().as_str() {
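A note on the added lines in `translate_storage`: the irrefutable `let Ok(path) = ...` only compiles because `PathBuf::from_str` has `Infallible` as its error type, and, as far as I know, accepting such a pattern in a plain `let` requires a fairly recent compiler (Rust 1.82 or later). A minimal sketch of an equivalent check that avoids the pattern by using `PathBuf::from`; the helper name `existing_dir` is hypothetical and not part of this PR:

```rust
use std::path::PathBuf;

/// Hypothetical helper (not in the PR): returns the path if `storage`
/// names an existing directory, and `None` otherwise.
fn existing_dir(storage: &str) -> Option<PathBuf> {
    // `PathBuf::from` cannot fail, so there is no `Result` to unwrap.
    let path = PathBuf::from(storage);

    // `is_dir` already returns false for paths that do not exist,
    // so a separate `exists` check is not needed.
    if path.is_dir() {
        Some(path)
    } else {
        None
    }
}
```

Calling `existing_dir(".")` yields `Some(".")`, while a path that does not exist yields `None`, which matches the early-return behaviour of the hunk above.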
24 changes: 24 additions & 0 deletions fs-cache/Cargo.toml
@@ -0,0 +1,24 @@
[package]
name = "fs-cache"
version = "0.1.0"
edition = "2021"

[lib]
name = "fs_cache"
crate-type = ["rlib", "cdylib"]
bench = false

[dependencies]
log = { version = "0.4.17", features = ["release_max_level_off"] }
serde_json = "1.0.82"
serde = { version = "1.0.138", features = ["derive"] }
data-error = { path = "../data-error" }
data-resource = { path = "../data-resource" }
fs-storage = { path = "../fs-storage"}
linked-hash-map = "0.5.6"

[dev-dependencies]
anyhow = "1.0.81"
quickcheck = { version = "1.0.3", features = ["use_logging"] }
quickcheck_macros = "1.0.0"
tempdir = "0.3.7"
69 changes: 69 additions & 0 deletions fs-cache/src/cache.rs
@@ -0,0 +1,69 @@
use data_error::Result;
use fs_storage::{base_storage::SyncStatus, monoid::Monoid};
use std::path::Path;

use crate::memory_limited_storage::MemoryLimitedStorage;

/// A generic cache implementation that stores values with LRU eviction in memory
/// and persistence to disk.
pub struct Cache<K, V> {
    storage: MemoryLimitedStorage<K, V>,
}

impl<K, V> Cache<K, V>
where
    K: Ord
        + Clone
        + serde::Serialize
        + serde::de::DeserializeOwned
        + std::fmt::Display
        + std::hash::Hash
        + std::str::FromStr,
    V: Clone + serde::Serialize + serde::de::DeserializeOwned + Monoid<V>,
{
    /// Create a new cache with given capacity
    /// - `label`: Used for logging and error messages
    /// - `path`: Directory where cache files will be stored
    /// - `max_memory_items`: Maximum number of items to keep in memory
    pub fn new(
        label: String,
        path: &Path,
        max_memory_items: usize,
    ) -> Result<Self> {
        let storage = MemoryLimitedStorage::new(label, path, max_memory_items)?;

        Ok(Self { storage })
    }

    /// Get a value from the cache if it exists
    /// Returns None if not found
    pub fn get(&mut self, key: &K) -> Result<Option<V>> {
        self.storage.get(key)
    }

    /// Store a value in the cache
    /// Will persist to disk and maybe keep in memory based on LRU policy
    pub fn set(&mut self, key: K, value: V) -> Result<()> {
        self.storage.set(key, value)
    }

    /// Load most recent cached items into memory based on timestamps
    pub fn load_recent(&mut self) -> Result<()> {
        self.storage.load_fs()
    }
@kirillt (Member), Nov 20, 2024

Actually, we don't need to expose this function.

Only the set/get API is needed.

The rest should happen under the hood:

  • any set should write to both memory and disk
  • a one-way sync from disk to memory is needed when users get values
  • if we hit our own limit on bytes stored in the in-memory mapping, we evict the oldest entries from it
  • entries are always stored on disk, so there is no need to sync from memory to disk explicitly
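The write path described in these bullets can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PR's `MemoryLimitedStorage`: the type `WriteThroughCache` is hypothetical, keys and values are plain strings, each key is assumed to be safe to use as a file name, and the memory limit is counted in items (like `max_memory_items` in the PR) rather than in bytes.

```rust
use std::collections::{HashMap, VecDeque};
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical write-through cache: every `set` goes to disk,
/// while memory holds only the most recently written entries.
struct WriteThroughCache {
    dir: PathBuf,
    max_memory_items: usize,
    memory: HashMap<String, String>,
    /// Keys in insertion order, oldest at the front.
    order: VecDeque<String>,
}

impl WriteThroughCache {
    fn new(dir: PathBuf, max_memory_items: usize) -> io::Result<Self> {
        fs::create_dir_all(&dir)?;
        Ok(Self {
            dir,
            max_memory_items,
            memory: HashMap::new(),
            order: VecDeque::new(),
        })
    }

    /// Writes to disk first, then to memory, so an entry is never
    /// memory-only and no explicit memory-to-disk sync is required.
    fn set(&mut self, key: &str, value: &str) -> io::Result<()> {
        // Assumption: one file per key, named after the key.
        fs::write(self.dir.join(key), value)?;
        self.remember(key, value);
        Ok(())
    }

    /// Inserts into memory and evicts the oldest entries over the limit.
    fn remember(&mut self, key: &str, value: &str) {
        let replaced = self
            .memory
            .insert(key.to_string(), value.to_string())
            .is_some();
        if replaced {
            // Drop the stale position so the key is tracked only once.
            self.order.retain(|k| k.as_str() != key);
        }
        self.order.push_back(key.to_string());

        while self.memory.len() > self.max_memory_items {
            match self.order.pop_front() {
                Some(oldest) => {
                    // Only the in-memory copy is dropped; the file stays.
                    self.memory.remove(&oldest);
                }
                None => break,
            }
        }
    }
}
```

With a limit of two items, setting `a`, `b`, and `c` in that order leaves `b` and `c` in memory, while all three values remain readable from disk.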

Primary usage scenario: keys are of type ResourceId

  1. The app indexes a folder.
  2. The app may populate the cache before using it, but this is not required.
  3. The app queries caches by key:
    • if the entry is already in memory, we just return the value
    • otherwise, we check the disk for an entry with the requested key
    • if it is on disk, we add it to the in-memory storage and return the value
    • otherwise, we return None
  4. The index can notify the app about recently discovered resources. The corresponding values may already be in the cache, but this is not required. The app can initialize values for new resources.
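The lookup order in step 3 could look roughly like this. It is a sketch under assumptions rather than the PR's implementation: `ReadThroughCache` is a hypothetical name, string keys stand in for `ResourceId`, values are stored one file per key, and eviction is left out to keep the lookup flow visible.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical read-through cache: memory is checked first, then disk.
struct ReadThroughCache {
    dir: PathBuf,
    memory: HashMap<String, String>,
}

impl ReadThroughCache {
    fn new(dir: PathBuf) -> Self {
        Self {
            dir,
            memory: HashMap::new(),
        }
    }

    /// Returns the cached value, loading it from disk into memory on a miss.
    fn get(&mut self, key: &str) -> io::Result<Option<String>> {
        // Entry already in memory: return it directly.
        if let Some(value) = self.memory.get(key) {
            return Ok(Some(value.clone()));
        }

        // Otherwise look for a file named after the key (assumed layout).
        match fs::read_to_string(self.dir.join(key)) {
            Ok(value) => {
                // Found on disk: promote to memory, then return.
                self.memory.insert(key.to_string(), value.clone());
                Ok(Some(value))
            }
            // Neither in memory nor on disk.
            Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
            // Any other I/O failure is reported to the caller.
            Err(err) => Err(err),
        }
    }
}
```

Because a disk hit is promoted into memory, a second `get` for the same key is served without touching the file system; a key that exists nowhere yields `Ok(None)` rather than an error.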

Secondary usage scenario: keys are of arbitrary type

The mapping from key to value can be any deterministic computation.


    // Get number of items currently in memory
    // pub fn memory_items(&self) -> usize {
    //     self.storage.memory_items()
    // }

    /// Get sync status between memory and disk
    pub fn sync_status(&self) -> Result<SyncStatus> {
        self.storage.sync_status()
    }

    /// Sync changes to disk
    pub fn sync(&mut self) -> Result<()> {
        self.storage.sync()
    }
}
2 changes: 2 additions & 0 deletions fs-cache/src/lib.rs
@@ -0,0 +1,2 @@
pub mod cache;
pub mod memory_limited_storage;