Skip to content

feat(fdb): add FoundationDB forkless metadata backend#844

Draft
ralcolea wants to merge 12 commits into
devfrom
fdb-backend-for-single-master
Draft

feat(fdb): add FoundationDB forkless metadata backend#844
ralcolea wants to merge 12 commits into
devfrom
fdb-backend-for-single-master

Conversation

@ralcolea
Copy link
Copy Markdown
Contributor

Add a new FoundationDB-based metadata backend (MetadataBackendForkless) that stores filesystem metadata in FDB instead of the metadata.sfs file, removing the need to fork the master process during metadata dumps.

Motivation

The current file-based backend forks the master process to snapshot metadata, which blocks multi-threading improvements and adds latency on large instances. This PR introduces an alternative backend that persists metadata directly to FoundationDB, enabling synchronous dumps without forking.

What's included

This PR implements the core infrastructure and five of the eight metadata sections. The remaining three sections (ACLS, QUOT, FLCK) will follow in a separate PR to keep this one reviewable.

Core infrastructure

  • MetadataBackendForkless: new IMetadataBackend implementation that reads and writes metadata via FDB transactions.
  • MetadataWriterFDB: batched event queue that accumulates metadata mutations in memory and flushes them periodically (1 s) or on demand, reducing FDB transaction overhead.
  • MetadataSectionBootstrapFDB: bootstrap mechanism that initializes FDB from metadata.sfs when sections are missing (e.g. upgrades from legacy versions).
  • Metadata header validation: META_HEADER key in FDB for signature checks, matching the file backend's header handling.
  • METADATA_BACKEND_TYPE config option: new sfsmaster.cfg setting (FILE | FORKLESS) under the EXPERIMENTAL section.

Metadata sections

Section Key format Signal integration
CHNK CHNK_<chunkId> gChunkChangedSignal
NODE NODEL_<inodeId> nodeChangedSignal / nodeRemovedSignal
EDGE EDGE_<parentId><name> edgeChangedSignal / edgeRemovedSignal
FREE FREE_<inodeId> inodePool free/acquire signals
XATR XATR_<inodeId><name> gXAttrChangedSignal / gXAttrRemovedSignal / gXAttrInodeRemovedSignal

Persistence flow

  1. Filesystem operations emit signals on metadata mutations.
  2. Signal handlers enqueue typed IMetadataUpdateEvent objects into the writer.
  3. The writer flushes batches to FDB via read-write transactions.
  4. fs_storeall() flushes all pending updates, then persists restore-critical keys (META_NEXT_CHUNK_ID, META_VERSION, etc.) in a single FDB transaction.

What's NOT included (next PR)

  • ACLS 1.2 — ACL metadata section
  • QUOT 1.1 — Quota metadata section
  • FLCK 1.0 — File lock metadata section

Testing

Tested with the existing shadow synchronization and short system tests using METADATA_BACKEND_TYPE=FORKLESS. The FDB test infrastructure starts a local FoundationDB cluster for each test run.

Not all tests pass yet, for two reasons:

  1. Partial metadata coverage: ACLs, quotas, and file locks are not yet persisted to FDB. Tests exercising these sections will fail.
  2. FILE backend assumptions: Some tests depend on metadata.sfs file presence or fork-based dump behavior (e.g. reading dump files directly). These are structurally incompatible with the FORKLESS backend and will require separate test adaptations.

Notes

  • The FORKLESS backend is experimental and not suitable for production use.

@ralcolea ralcolea self-assigned this Apr 22, 2026
@ralcolea ralcolea requested a review from Copilot April 22, 2026 07:46
@ralcolea
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an experimental 'FORKLESS' metadata backend using FoundationDB, enabling metadata persistence without forking the master process. The changes include a batched metadata writer, a bootstrapper for migrating existing data, and signal hooks for tracking node and xattr updates. Feedback identifies a critical configuration key mismatch that prevents backend activation, significant code duplication in the initialization logic, and potential performance issues due to heap allocations in xattr signals. Additionally, placeholder methods should use consistent logging levels to avoid misleading error reports.

Comment thread src/master/init.h
Comment thread src/master/init.h
Comment thread src/master/filesystem_xattr.cc Outdated
Comment thread src/master/metadata_backend_forkless.cc
Comment thread src/master/metadata_backend_forkless.cc
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an experimental FoundationDB-backed, forkless metadata backend to persist master metadata in a KV store (instead of metadata.sfs), enabling synchronous dumps without forking and introducing bootstrapping from the legacy file when needed.

Changes:

  • Introduce MetadataBackendForkless with FDB transactions + periodic/batched persistence via MetadataWriterFDB.
  • Add an FDB bootstrap loader for several metadata sections (NODE/EDGE/FREE/XATR/CHNK) from metadata.sfs.
  • Update master init/config and test tooling to support selecting and running the forkless/FDB-backed mode.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/tools/saunafs.sh Extends test harness to handle FORKLESS backend setup and FDB cluster configuration.
tests/tools/foundationdb.sh Improves FDB test cluster startup variable scoping and disowns background fdbmonitor.
src/master/metadata_writer_fdb.h Defines update-event types and the batched FDB metadata writer API.
src/master/metadata_writer_fdb.cc Implements event serialization and batched flush/commit logic to the KV store.
src/master/metadata_section_bootstrap_fdb.h Declares bootstrapper that loads missing FDB sections from metadata.sfs.
src/master/metadata_section_bootstrap_fdb.cc Implements bootstrapping for NODE/EDGE/FREE/XATR/CHNK sections into FDB.
src/master/metadata_backend_interface.h Extends backend interface with getHeaderSignature() for signature validation.
src/master/metadata_backend_forkless.h Declares new forkless backend, loader/saver, and signal→event wiring.
src/master/metadata_backend_forkless.cc Implements forkless backend lifecycle, loading, saving, flushing, and signal hookups.
src/master/metadata_backend_file.h Adds stub getHeaderSignature() for file backend implementation.
src/master/matoclserv.cc Adjusts admin “save metadata” reply behavior to handle synchronous forkless dumps.
src/master/masterconn.cc Skips downloading metadata.sfs when using the forkless backend and continues with changelogs.
src/master/kv_common_keys.h Adds META_HEADER key constant and documents META_NEXT_CHUNK_ID.
src/master/init.h Adds config-driven selection between FILE and FORKLESS metadata backends.
src/master/filesystem_xattr.h Introduces xattr change/removal signals for KV-backed persistence.
src/master/filesystem_xattr.cc Emits xattr change/removal signals on updates/removals.
src/master/filesystem_checksum.cc Emits nodeChangedSignal from checksum updates to drive KV persistence.
src/master/chunks.h Exposes chunk_get_next_id() for persisting restore-critical next-chunk-id key.
src/master/chunks.cc Implements chunk_get_next_id() wrapper.
src/master/CMakeLists.txt Links fdb into master library when ENABLE_FOUNDATIONDB is enabled.
src/data/sfsmaster.cfg.in Documents FDB_CLUSTER_FILE and new metadata backend selection option.

Comment thread src/master/metadata_section_bootstrap_fdb.h
Comment on lines +374 to +382
while (true) {
inode_t inode{};
getINode(&ptr, inode);
uint8_t attributeNameLength = get8bit(&ptr);

uint32_t attributeValueLength{};
get32bit(&ptr, attributeValueLength);
marker->offset = metadataFile_->offset(ptr);

Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loadXAttrSection() reads inode/name/value lengths in a while(true) loop without bounding reads by the XATR section end. If metadata.sfs is truncated or corrupt, this can advance ptr past the section (or even past the mapped file) and crash. Please add a sectionEnd calculation (marker->offset + marker->length) and validate that ptr + attributeNameLength + attributeValueLength stays within bounds before constructing vectors/advancing ptr.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice suggestion 👍
I will consider to apply it for the rest of the sections.

Comment thread src/master/metadata_writer_fdb.cc Outdated
Comment thread src/master/metadata_backend_forkless.cc Outdated
Comment thread src/master/metadata_writer_fdb.h Outdated
Comment thread src/master/init.h
Comment thread src/master/CMakeLists.txt
Comment thread src/master/filesystem_checksum.cc Outdated
Comment on lines +112 to +115
// All nodes changes call fsnodes_update_checksum(), so triggering nodeChangedSignal here
// ensures that any node change is tracked, which is necessary for keeping FDB nodes's state up
// to date and consistent with the in-memory state of the nodes.
gMetadata->nodeChangedSignal.emit(node);
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emitting nodeChangedSignal from fsnodes_update_checksum() changes semantics globally: this function is called very frequently (including inside transactions and checksum recalculation paths), and existing code deliberately suppresses nodeChangedSignal in some transactional contexts. This will likely cause excessive/duplicated persistence events for the forkless backend and may break ordering/transaction assumptions. Consider keeping signal emission at the existing mutation points (or adding explicit hooks where needed) rather than unconditionally emitting here.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this is a pending suggestion I need to address.

Comment thread src/master/metadata_backend_forkless.cc Outdated
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an experimental "FORKLESS" metadata backend that stores filesystem metadata in FoundationDB, aiming to eliminate the need for forking the master process during metadata dumps. Key additions include the MetadataBackendForkless implementation, a bootstrapping tool for migrating data from metadata.sfs, and a batched MetadataWriterFDB for handling updates. Review feedback identifies critical issues regarding FoundationDB's 5-second transaction limit during full range scans, potential logic conflicts between backend initialization and bootstrapping, and several opportunities for performance optimization, such as using std::deque for the update queue and read-only transactions where appropriate.

Comment thread src/master/metadata_backend_forkless.cc Outdated
Comment thread src/master/metadata_backend_forkless.cc Outdated
Comment thread src/master/metadata_writer_fdb.h Outdated
Comment thread src/master/metadata_writer_fdb.cc Outdated
Comment thread src/master/metadata_backend_forkless.cc Outdated
Comment thread src/master/metadata_backend_forkless.cc Outdated
Comment thread src/master/metadata_backend_forkless.cc Outdated
ralcolea and others added 11 commits May 20, 2026 19:43
This commit introduces a new FDB-based backend to progressively migrate
metadata storage from file-based dumps to FoundationDB. This lays the
foundation for eliminating the fork-based dump process that currently
blocks multi-threading improvements in the master server.

Key changes:
- Add MetadataBackendForkless class implementing IMetadataBackend.
- Add chunk metadata persistence to FDB via gChunkChangedSignal.

Current limitations:
- Only CHNK 1.0 section is migrated to FDB.
- No batching mechanism for chunk updates yet.

Migration strategy:
Filesystem metadata sections will be migrated incrementally until all
metadata resides in FDB.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
This commit completes the migration of chunk metadata persistence
to FDB forkless backend.

Key changes:
- Move addFilesToChunks() to MetadataBackendForkless::loadall().
- Move checksum calculation to MetadataBackendForkless::loadall().
- Remove the fdbmonitor background process from the job control table
  to prevent the wait command from blocking during tests.

Test configuration:
Setup the testing framework to recognize ${metadata_backend}="FORKLESS"
to enable forkless backend in tests and start one FDB cluster.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
This commit implements MetadataWriterFDB class to batch and flush
metadata updates to FoundationDB, improving write performance and
reducing transaction overhead.

Key changes:
- Add MetadataWriterFDB class with event queue for batched writes.
- Implement IMetadataUpdateEvent interface for different metadata
  update types.
- Add ChunkUpdateEvent for handling chunk metadata updates.
- Register periodic flush callback (1s interval) using event loop.
- Replace direct FDB writes in chunk signal handler with enqueue
  of ChunkUpdateEvent operations.

The writer accumulates updates in memory and flushes them periodically
or on-demand, preserving changelog ordering while reducing the number
of FDB transactions performed to persist the filesystem metadata.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
This commit implements metadata header signature validation and
initialization for the FDB forkless backend to ensure compatibility with
different metadata versions.

Key changes:
- Add META_HEADER key constant for storing header signatures in FDB.
- Extend IMetadataBackend interface with getHeaderSignature() function
  and implement getHeaderSignature() in both file and forkless backends.
- Add checkMetadataSignature() function to validate metadata headers in
  forkless backend.
- Add initializeNewMetadataHeader() method to set proper metadata header
  signature when new empty metadata is detected.
- Use signal-based approach to trigger metadata header initialization.

This change enables the FDB backend to properly validate and initialize
metadata headers.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
Ensure consistent chunk ID allocation and restore behavior in the
FoundationDB backend by persisting the next chunk ID and improving
metadata flush guarantees.

Key changes:
- Persist META_NEXT_CHUNK_ID to track the next chunk ID across restarts
- Restore chunk ID generator state during metadata load
- Ensure full flush of pending updates before saving restore-critical keys
- Introduce flushAll() to guarantee complete persistence when required
- Batch metadata writer flushes to avoid long blocking operations
- Move chunk update signal registration to init() to cover changelog replay

Additional updates:
- Add configurable FDB_CLUSTER_FILE path to sfsmaster.cfg
- Minor cleanup and logging improvements
- Use safer key encoding helpers for chunk metadata
- Fix variable scoping in FoundationDB test helper script

Signed-off-by: Crash <crash@leil.io>
This commit introduces a metadata bootstrap mechanism for the forkless
backend that initializes required metadata sections directly from
metadata.sfs when they are missing.

This change was motivated by upgrade-related integration tests
(e.g. overwrite-file and undergoal-chunks scenarios from Longs suite)
that start from legacy SaunaFS versions without an FDB backend, create
files and chunks, and then upgrade to a version using an FDB backend.
In those cases, the legacy versions do not initialize any backend state,
causing chunk metadata to be missing after the upgrade and leading to
test failures.

A new MetadataSectionBootstrapFDB component scans the metadata file
during startup, detects which sections require bootstrapping, and
populates FDB accordingly. The initial implementation bootstraps the
CHNK 1.0 section, restoring chunk state into the backend.

Bootstrapping runs only on the master node and is skipped once the
required keys already exist in FDB. If bootstrapping succeeds, metadata
loading proceeds even when the on-disk metadata signature is not yet
valid, allowing seamless upgrades from legacy versions.

Signed-off-by: Crash <crash@leil.io>
This commit adds support for the Nodes metadata section to the FDB
metadata backend.

This change introduces persistence of filesystem nodes in FDB,
including:
- NODE_ keys storing the latest serialized FSNode state.
- Bootstrap logic to initialize the Nodes section when missing.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
This commit adds support for the Free metadata section in the
FoundationDB metadata backend.

This change introduces persistence of the inode free list into FDB
using FREE_<inode> keys, where the value stores the associated
timestamp.

Key changes:
- Introduce loadFree() function to load FREE entries from FDB into
  inodePool during startup.
- Signal integration with inodePool to persist free list changes in
  real time via FreeNodeUpdateEvent.
- FREE section bootstrap support to import entries from metadata.sfs
  file when initializing a new FDB backend.
- FreeNodeUpdateEvent implementation for adding and removing FREE_<inode>
  keys.

FREE entries are written when an inode is freed and removed when it is
allocated, keeping the FDB state consistent with the in-memory inode
free list.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
This commit adds support for the EDGE metadata section in the
FoundationDB metadata backend.

Edges represent directory relationships and are stored as:

  EDGE_<parentId><name>: <childId>

This change introduces full EDGE section support, including:
- loadEdges() to reconstruct the in-memory directory tree from
  EDGE_ keys at startup.
- loadEdge() to attach children to their parent directories, including
  trash/reserved handling and stats propagation.
- EdgeUpdateEvent and EdgeRemoveEvent to persist directory entry
  creations, renames, and removals in real time.
- Signal wiring (edgeChangedSignal / edgeRemovedSignal) to propagate
  metadata changes to FDB.
- EDGE section bootstrap support to import entries from metadata.sfs
  file when initializing a new FDB backend.
- Add missing documentation for existing functions.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
This commit adds support for persisting node deletions and metadata
global properties in the FDB backend.

Node removal support:
- Introduce nodeRemovedSignal in gMetadata.
- Emit nodeChangedSignal on checksum updates to ensure modifications
  are persisted.
- Implement NodeRemoveEvent to remove NODE_<inode> keys
- Wire nodeRemovedSignal to MetadataBackendForkless.

Metadata global properties:
- Persist maxInodeId, metadataVersion, and nextSessionId as dedicated
  META_* keys in FDB.
- Load these values during startup before section loading.
- Integrate persistence into fs_storeall().

fs_storeall() improvements:
- Flush pending updates before sealing restore keys.
- Persist metadata keys atomically.
- Rotate changelog and broadcast metadata_saved status.

Additional improvements:
- Handle metadata download flow correctly for forkless backend.
- Adjust synchronous dump behavior for forkless backend.
- Improve test setup for master/shadow configurations.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
Add support for the XATR metadata section in the FoundationDB backend,
including persistence, loading, and bootstrap from metadata files.

This introduces XAttr update and removal events in the metadata writer,
mapping in-memory xattr operations to KV entries using the
`XATR_<inode><name>` key format.

Hook xattr lifecycle events into the metadata backend via signals:
- emit change and removal events on xattr updates in the filesystem layer
- enqueue corresponding XAttrUpdateEvent, XAttrRemoveEvent, and
  XAttrInodeRemoveEvent operations in the FDB writer

Implement full loading support from FoundationDB by scanning the XATR_
keyspace and reconstructing in-memory xattr structures, with validation
against protocol limits.

Add bootstrap support to import XATR metadata from filesystem dumps
into FoundationDB during initialization.

Overall, this enables full round-trip support for xattr metadata in the
FoundationDB backend.

Co-authored-by: guillex <guillex@leil.io>
Signed-off-by: Crash <crash@leil.io>
@ralcolea ralcolea force-pushed the fdb-backend-for-single-master branch from bfa9236 to e1ba985 Compare May 20, 2026 17:43
Key changes:
- Gate forkless backend sources and initialization behind
  ENABLE_FOUNDATIONDB, and correct the metadata backend config key.

- Emit node, chunk, and xattr change notifications across filesystem
  operations so FDB-backed metadata stays in sync with in-memory state.

- Harden FDB bootstrap and load paths by persisting metadata header
  fields, detecting partially initialized stores, and using safer
  read-only transactions during loads.

- Rework metadata writer flushing to support ordered batched drains,
  restore failed batches, and handle bootstrap writes more safely.

Signed-off-by: Crash <crash@leil.io>
@ralcolea ralcolea force-pushed the fdb-backend-for-single-master branch from e1ba985 to 34a9b46 Compare May 22, 2026 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants