Implementation of the practical project for Distributed Algorithms (EPFL, 2025). The project builds foundational communication abstractions for distributed systems under crash faults and an asynchronous, unreliable network.
- Transport: UDP point-to-point only. No TCP, no multicast/broadcast, no external reliability features.
- Socket usage: Each process uses a single UDP socket / single listening port for receiving.
- Batching: At most 8 application messages per UDP packet.
- Resource limits: 2 CPU cores, 4 GiB RAM, max 8 threads per process. Output file ≤ 64 MiB; console output may be truncated.
- Language & libs: C11/C++17, no third-party libraries.
- Grading environment: Ubuntu 22.04 with gcc/g++ 11.4.0 and cmake 3.22.1.
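The 8-messages-per-packet cap above can be enforced at serialization time. A minimal sketch, assuming a length-prefixed layout; the names (`pack_bundle`, `kMaxMsgsPerPacket`) and the wire format are illustrative, not the project's required encoding:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Illustrative only: length-prefixed concatenation of up to 8 messages.
constexpr std::size_t kMaxMsgsPerPacket = 8;

// Packs at most 8 pending messages into one UDP payload buffer;
// returns how many messages were consumed from `pending`.
std::size_t pack_bundle(const std::vector<std::string>& pending,
                        std::vector<uint8_t>& out) {
    out.clear();
    std::size_t taken = 0;
    for (const auto& m : pending) {
        if (taken == kMaxMsgsPerPacket) break;  // hard cap from the project rules
        uint32_t len = static_cast<uint32_t>(m.size());
        const uint8_t* p = reinterpret_cast<const uint8_t*>(&len);
        out.insert(out.end(), p, p + sizeof(len));  // 4-byte length prefix
        out.insert(out.end(), m.begin(), m.end());  // message bytes
        ++taken;
    }
    return taken;
}
```

A real sender would also respect the UDP payload size; this sketch only shows the message-count cap.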
This repository implements the following building blocks: Perfect Links (pl), FIFO Uniform Reliable Broadcast (frb), and Lattice Agreement (la).
Perfect Links (pl)
Interface
- Request: ⟨pl, Send | q, m⟩: Sends message m to process q.
- Indication: ⟨pl, Deliver | p, m⟩: Delivers message m sent by process p.
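The request/indication pair above can be rendered in C++ as a class with a delivery callback. A hypothetical sketch (class name, callback style, and the loopback stub are assumptions, not the project's actual API):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

using ProcessId = uint8_t;
using Message = std::string;

class PerfectLink {
public:
    // Indication ⟨pl, Deliver | p, m⟩ surfaces as a callback.
    using DeliverCb = std::function<void(ProcessId p, const Message& m)>;

    PerfectLink(ProcessId self, DeliverCb cb)
        : self_(self), deliver_(std::move(cb)) {}

    // Request ⟨pl, Send | q, m⟩.
    // Stub: delivers locally instead of going through UDP.
    void send(ProcessId /*q*/, const Message& m) { deliver_(self_, m); }

private:
    ProcessId self_;
    DeliverCb deliver_;
};
```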
Properties
- PL1: Reliable delivery: If a correct process p sends a message m to a correct process q, then q eventually delivers m.
- PL2: No duplication: No message is delivered by a process more than once.
- PL3: No creation: If some process q delivers a message m with sender p, then m was previously sent to q by process p.
Design Notes
- Windowing (send-side back-pressure): Bounds per-destination in-flight bundles to control memory, enable pipelining, and avoid overwhelming the network.
- Windowing (receive-side reordering buffer + in-order release + duplicate suppression): Buffers a bounded out-of-order range of bundles per sender, releases only from the window head (in order), and drops duplicates deterministically.
- Retransmission scheduling (timers + backoff): Retransmits unacked bundles using per-destination deadlines (min-heap) with backoff and jitter to stay reliable under loss without causing resend bursts.
- Batching (≤ 8 msgs / UDP packet): Improves throughput by amortizing UDP/syscall overhead while respecting the project's batching constraint.
- Shared payload storage (reduced copies): Provides a shared-bytes enqueue path to reduce payload copies on the sender side (especially beneficial for fanout-style usage).
- Zero-copy receive path (views): Parses from received packet buffers and forwards slices to upper layers, minimizing allocations/copies on the hot path.
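The receive-side window described above can be sketched as follows; a toy single-sender version under assumed names (`RecvWindow`, `on_bundle`), not the repository's actual types:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Per-sender receive window: buffers out-of-order bundles, releases only
// from the head (in order), and drops duplicates / out-of-window seqs.
class RecvWindow {
public:
    explicit RecvWindow(uint32_t capacity) : cap_(capacity) {}

    // Returns the bundles that become deliverable, in order (possibly empty).
    std::vector<std::string> on_bundle(uint32_t seq, std::string payload) {
        std::vector<std::string> released;
        if (seq < next_ || seq >= next_ + cap_)
            return released;  // duplicate (already released) or too far ahead
        buf_.emplace(seq, std::move(payload));  // emplace ignores in-window dups
        while (!buf_.empty() && buf_.begin()->first == next_) {  // drain head run
            released.push_back(std::move(buf_.begin()->second));
            buf_.erase(buf_.begin());
            ++next_;
        }
        return released;
    }

    uint32_t next_expected() const { return next_; }

private:
    uint32_t cap_;
    uint32_t next_ = 0;                    // window head: next in-order seq
    std::map<uint32_t, std::string> buf_;  // out-of-order stash, keyed by seq
};
```

Bounding the stash to `cap_` entries is what gives the receive side its memory guarantee under reordering.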
FIFO Uniform Reliable Broadcast (frb)
Interface
- Request: ⟨frb, Send | m⟩: Broadcasts a message m to all processes.
- Indication: ⟨frb, Deliver | p, m⟩: Delivers a message m broadcast by process p.
Properties
- FRB1: Validity: If a correct process p broadcasts a message m, then p eventually delivers m.
- FRB2: No duplication: No message is delivered more than once.
- FRB3: No creation: If a process delivers a message m with sender s, then m was previously broadcast by process s.
- FRB4: Uniform agreement: If a message m is delivered by some process (whether correct or faulty), then m is eventually delivered by every correct process.
- FRB5: FIFO delivery: If some process broadcasts message m₁ before it broadcasts message m₂, then no correct process delivers m₂ unless it has already delivered m₁.
Design Notes
- Windowing (per-origin reordering buffer + FIFO gating + duplicate suppression): Buffers out-of-order messages per origin and releases deliveries strictly in order to satisfy FIFO, while suppressing duplicates.
- Ack/threshold tracking (majority): Tracks distinct forwarders in a per-message ack bitset and delivers once ack_count >= (n/2 + 1).
- First-seen eager rebroadcast: On first observation of (origin, seq), stores the payload and rebroadcasts the same encoded bytes to all peers to accelerate uniform agreement.
- Batch pumping from PL: Drains PL deliveries in batches to amortize lock/call overhead; this path uses materialized bytes (copies), not views.
- Shared payload storage (reduced copies): PL supports shared-bytes fanout, but FIFO URB currently rebroadcasts with enqueue_data(dest, bytes) (a copy-based per-destination enqueue).
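The majority-threshold tracking above amounts to a per-message bitset of distinct forwarders. A minimal sketch under assumed names (`AckTracker`, `kMaxProcs`), not the repository's actual types:

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Upper bound on process count for the fixed-size bitset (illustrative).
constexpr std::size_t kMaxProcs = 128;

// Tracks distinct forwarders of one message; delivery fires exactly once,
// when ack_count first reaches the majority n/2 + 1.
class AckTracker {
public:
    explicit AckTracker(std::size_t n) : n_(n) {}

    // Records an ack from process `pid`; returns true only at the moment
    // the majority threshold is first crossed.
    bool on_ack(std::size_t pid) {
        if (delivered_ || acks_.test(pid)) return false;  // dup forwarder or done
        acks_.set(pid);
        if (acks_.count() >= n_ / 2 + 1) {
            delivered_ = true;
            return true;
        }
        return false;
    }

private:
    std::size_t n_;
    std::bitset<kMaxProcs> acks_;  // one bit per distinct forwarder seen
    bool delivered_ = false;
};
```

Using a bitset keeps per-message state to a few words regardless of how many duplicate forwards arrive.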
Lattice Agreement (la)
Interface
- Request: ⟨la, Propose | Iᵢ⟩: A process proposes a set Iᵢ ⊆ V.
- Indication: ⟨la, Decide | Oᵢ⟩: A process decides a set Oᵢ ⊆ V.
Properties
- LA1: Validity: If a process Pᵢ decides a set Oᵢ, then Iᵢ ⊆ Oᵢ and Oᵢ ⊆ ⋃ⱼ Iⱼ.
- LA2: Consistency: If a process Pᵢ decides a set Oᵢ and a process Pⱼ decides a set Oⱼ, then Oᵢ ⊆ Oⱼ or Oᵢ ⊃ Oⱼ.
- LA3: Termination: Every correct process eventually decides.
Design Notes
- Windowing (proposer-side instance-level back-pressure): Bounds the number of in-flight instances to control memory, enable bounded parallelism, and avoid overwhelming the network.
- ACK/NACK-driven refinement: Uses ACK/NACK feedback to decide or re-propose with expanded sets only when necessary, ensuring convergence.
- Windowing (acceptor bounded state + overflow): Maintains a bounded active acceptor window for “current” instances while still answering older instances via an overflow map.
- Garbage collection (progress piggyback): Frees old acceptor instance state once peers advance sufficiently, preventing unbounded accumulation.
- Shared payload storage (reduced copies): Encodes once into shared bytes and fans out to peers to reduce payload copies during propose.
- Batch pumping from PL (zero-copy views): Drains PL view deliveries in batches to amortize lock/call overhead, and parses LA messages directly from view bytes to avoid cross-layer copies.
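The ACK/NACK refinement loop can be sketched for a single instance as follows; a single-threaded toy with network and timeouts omitted, and all names (`Proposer`, `must_repropose`, …) assumed rather than taken from the repository:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Proposer side of one lattice-agreement instance: collect ACK/NACKs for the
// current proposal number, merge NACK sets, and either decide or re-propose.
class Proposer {
public:
    Proposer(std::set<int> initial, std::size_t n)
        : proposal_(std::move(initial)), n_(n) {}

    // ACK for the current proposal number.
    void on_ack() { ++acks_; }

    // NACK carries the acceptor's larger accepted set; merge it so the
    // next round proposes a superset (this is the "expanded set").
    void on_nack(const std::set<int>& accepted) {
        proposal_.insert(accepted.begin(), accepted.end());
        ++nacks_;
    }

    // Decide once a majority ACKed the current proposal unchanged.
    bool maybe_decide() const { return acks_ >= n_ / 2 + 1; }

    // Re-propose once any NACK arrived and a majority has responded.
    bool must_repropose() const {
        return nacks_ > 0 && acks_ + nacks_ >= n_ / 2 + 1;
    }

    // Fresh proposal number; stale responses for old rounds are ignored.
    void start_new_round() { ++round_; acks_ = 0; nacks_ = 0; }

    const std::set<int>& proposal() const { return proposal_; }
    unsigned round() const { return round_; }

private:
    std::set<int> proposal_;
    std::size_t n_;
    std::size_t acks_ = 0, nacks_ = 0;
    unsigned round_ = 1;  // active proposal number
};
```

Because re-proposals only ever grow the set, the proposal stabilizes after finitely many rounds, which is what ensures convergence.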
These abstractions are evaluated on correctness under:
- Network issues: packet delay, packet loss, packet reordering (no packet corruption)
- Process issues: slow processes and crashes
They are also evaluated on performance (aggregate throughput) without network/process issues.
Refer to the project description for details on how to execute the system, expected output format, logging requirements, and provided testing tools.