# Background
## Code
```
.
├── Cargo.toml
├── lib/                     # Library (please don't modify anything here)
│   ├── src/
│   │   ├── rpc.rs
│   │   ├── client.rs
│   │   └── persister.rs
│   └── Cargo.toml
├── src/
│   ├── lib.rs
│   ├── lock.rs              # Part 2 & 4: Distributed lock
│   ├── kv_single/
│   │   ├── server.rs        # Part 1: KV server
│   │   └── client.rs        # Part 1: KV client
│   └── kv_raft/
│       ├── raft.rs          # Part 3: Raft consensus
│       ├── rsm.rs           # Part 4: Replicated state machine
│       └── client.rs        # Part 4: Cluster client
└── tests/
    ├── kvsrv_test.rs        # Parts 1-2 tests
    ├── raft_test.rs         # Part 3 tests
    ├── kvraft_test.rs       # Part 4 tests
    └── common/
```
### Files you will modify
| File | Part | What to implement |
|---|---|---|
| `src/kv_single/server.rs` | 1 | KV server with versioned puts |
| `src/kv_single/client.rs` | 1 | Client with retry logic |
| `src/lock.rs` | 2, 4 | Distributed lock using KV |
| `src/kv_raft/raft.rs` | 3 | Raft consensus algorithm |
| `src/kv_raft/rsm.rs` | 4 | Replicated state machine |
| `src/kv_raft/client.rs` | 4 | Cluster-aware client |
### Files you should read but not modify
| File | Description |
|---|---|
| `lib/src/rpc.rs` | RPC framework, `Server` trait, and `RpcClient` |
| `lib/src/client.rs` | KV client types (`Version`, `KVError`, `KvClient`) |
| `lib/src/persister.rs` | Crash-safe file persistence |
| `tests/common/` | Test infrastructure (network proxy, cluster management) |
## RPC framework

Your servers and clients communicate over HTTP using a simple RPC framework defined in `lib/src/rpc.rs`.
### Server trait

Every server implements the `Server` trait:
```rust
#[async_trait]
pub trait Server: Send + Sync {
    async fn dispatch(&self, name: &str, args: String) -> String;
}
```
The framework routes `POST /:method` HTTP requests to `dispatch(method, body)`. Your server deserializes the JSON body, processes the request, and returns a JSON response.
### RpcClient

Clients send RPCs using `RpcClient`:
```rust
pub struct RpcClient { /* ... */ }

impl RpcClient {
    pub fn new(proxy_url: String, client_id: usize) -> Self;

    /// Send an RPC. Returns None on any network failure.
    pub async fn call<A: Serialize, R: DeserializeOwned>(
        &self, method: &str, args: &A,
    ) -> Option<R>;
}
```
`call` returns `None` whenever the network drops the request or the reply. Your client must handle this.
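Because every network failure collapses into `None`, clients typically wrap `call` in a retry loop. Below is a minimal sketch, assuming a tokio runtime and using the `GetArgs`/`GetReply` types defined in the next subsection; the 100 ms backoff is an arbitrary choice. Retrying a get this way is safe because gets are idempotent:

```rust
// Sketch: retry an RPC until the network delivers a reply.
// `RpcClient` comes from lib/src/rpc.rs; `GetArgs`/`GetReply` are
// the example RPC types defined below.
async fn get_with_retry(rpc: &RpcClient, key: &str) -> GetReply {
    let args = GetArgs { key: key.to_string() };
    loop {
        if let Some(reply) = rpc.call::<GetArgs, GetReply>("get", &args).await {
            return reply;
        }
        // None: the request or the reply was dropped; back off and retry.
        tokio::time::sleep(std::time::Duration::from_millis(100)).await;
    }
}
```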
### Defining RPCs

There is no `.proto` file or code generation. You define your own RPC types as Rust structs that derive `Serialize` and `Deserialize`:
```rust
#[derive(Serialize, Deserialize)]
pub struct GetArgs {
    pub key: String,
}

#[derive(Serialize, Deserialize)]
pub struct GetReply {
    pub value: String,
    pub version: Version,
    pub err: Option<KVError>,
}
```
Your `dispatch` method pattern-matches on the RPC name and uses `serde_json` to deserialize and serialize:
```rust
async fn dispatch(&self, name: &str, args: String) -> String {
    match name {
        "get" => {
            let a: GetArgs = serde_json::from_str(&args).unwrap();
            serde_json::to_string(&self.get(&a)).unwrap()
        }
        // ...
    }
}
```
## KV types
The library provides shared types for the KV service:
### Version

```rust
pub type Version = u64;
```
Each key has a version number. Version 0 means the key does not exist. The first successful put sets it to 1, and it increments on each subsequent successful put.
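To make the rule concrete, here is one way a server might enforce it; a sketch, not the required implementation. The `table` map is a hypothetical in-memory store, and `KVError` is the enum described in the next subsection:

```rust
use std::collections::HashMap;

// Sketch of server-side version checking over a hypothetical
// HashMap<String, (value, version)> store.
fn try_put(
    table: &mut HashMap<String, (String, Version)>,
    key: &str,
    value: &str,
    version: Version,
) -> Result<(), KVError> {
    match table.get_mut(key) {
        Some((v, cur)) if *cur == version => {
            *v = value.to_string();
            *cur += 1; // a successful put increments the version
            Ok(())
        }
        Some(_) => Err(KVError::Version), // caller's version is stale
        None if version == 0 => {
            // Version 0 means the key does not exist yet: create it at version 1.
            table.insert(key.to_string(), (value.to_string(), 1));
            Ok(())
        }
        None => Err(KVError::NoKey), // nonzero version for a missing key
    }
}
```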
### KVError

```rust
pub enum KVError {
    NoKey,   // Key does not exist
    Version, // Put's version didn't match the server's version
    Maybe,   // Client retried; the put may or may not have applied
}
```
The successful case is represented by `Ok(...)`; there is no `KVError` variant for success. In Part 4, the RSM may also return an internal `ReplyErr::WrongLeader` over RPC so the cluster client can try another server, but that transient error should not be returned through the public `KvClient` trait.
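The `Maybe` case is produced on the client side: if a put's reply is lost and the retry comes back with a version mismatch, the original attempt may already have applied. A sketch of that logic follows; the `PutArgs`/`PutReply` types and the function shape are illustrative assumptions, not the lib's API:

```rust
use serde::{Deserialize, Serialize};

// Hypothetical put RPC types, mirroring GetArgs/GetReply above.
#[derive(Serialize, Deserialize)]
struct PutArgs { key: String, value: String, version: Version }
#[derive(Serialize, Deserialize)]
struct PutReply { err: Option<KVError> }

// Sketch: a put that retries on network failure and downgrades a
// post-retry version mismatch to Maybe, since the first attempt may
// have applied before its reply was lost.
async fn put_or_maybe(rpc: &RpcClient, args: &PutArgs) -> Result<(), KVError> {
    let mut retried = false;
    loop {
        match rpc.call::<PutArgs, PutReply>("put", args).await {
            None => retried = true, // lost request or reply; retry (back off in real code)
            Some(reply) => {
                return match reply.err {
                    None => Ok(()),
                    Some(KVError::Version) if retried => Err(KVError::Maybe),
                    Some(e) => Err(e),
                }
            }
        }
    }
}
```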
KvClient trait
#[async_trait]
pub trait KvClient: Send + Sync {
async fn get(&self, key: &str) -> Result<(String, Version), KVError>;
async fn put(&self, key: &str, value: &str, version: Version) -> Result<(), KVError>;
}
Both the single-server client (Part 1) and the Raft client (Part 4) implement this trait. The `Lock` (Parts 2 and 4) accepts any `Arc<dyn KvClient>`, so the same lock code works with either backend.
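To see why the trait object is all the lock needs, here is one possible shape for it. This is a sketch only: the convention that an empty value means "unlocked", the `id` field, and the busy-wait retry are illustrative choices, not the required design:

```rust
use std::sync::Arc;

// `KvClient`, `KVError`, and `Version` come from the lib crate.
// Hypothetical layout: one KV key per lock, whose value is the id of
// the current holder, or "" when the lock is free.
pub struct Lock {
    kv: Arc<dyn KvClient>,
    key: String, // KV key that represents this lock
    id: String,  // unique id of this lock client
}

impl Lock {
    pub async fn acquire(&self) {
        loop {
            match self.kv.get(&self.key).await {
                // We already hold it (e.g. an earlier Maybe'd put applied).
                Ok((holder, _)) if holder == self.id => return,
                // Free: claim it with a versioned put, an atomic test-and-set.
                // On Err (including Maybe) just loop: the next get shows who won.
                Ok((holder, version)) if holder.is_empty() => {
                    if self.kv.put(&self.key, &self.id, version).await.is_ok() {
                        return;
                    }
                }
                // Key missing: create it (version 0) with us as the holder.
                Err(KVError::NoKey) => {
                    if self.kv.put(&self.key, &self.id, 0).await.is_ok() {
                        return;
                    }
                }
                _ => {} // held by someone else or transient failure; retry
            }
        }
    }

    pub async fn release(&self) {
        // Write "" at the current version to mark the lock free.
        if let Ok((_, version)) = self.kv.get(&self.key).await {
            let _ = self.kv.put(&self.key, "", version).await;
        }
    }
}
```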
## Persister

`lib/src/persister.rs` provides crash-safe file persistence for Raft state:
```rust
pub struct Persister { /* ... */ }

impl Persister {
    pub fn new(path: PathBuf) -> Self;
    pub fn save(&self, data: &str); // Atomic write (write-tmp + rename)
    pub fn read(&self) -> String;   // Returns "" if the file does not exist
}
```
You will use this in Part 3 to persist Raft state (term, vote, log) so that servers can recover after crashes.
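For instance, you might gather the persistent fields into one serializable struct and round-trip it through `serde_json`. A sketch; the `PersistentState` struct and its field names are assumptions, not part of the lib:

```rust
use serde::{Deserialize, Serialize};

// Illustrative shape for Raft's persistent state; your own field and
// log-entry types will differ.
#[derive(Serialize, Deserialize, Default)]
struct PersistentState {
    current_term: u64,
    voted_for: Option<usize>,
    log: Vec<String>,
}

fn persist(p: &Persister, state: &PersistentState) {
    // save() writes atomically, so a crash never leaves a half-written file.
    p.save(&serde_json::to_string(state).unwrap());
}

fn restore(p: &Persister) -> PersistentState {
    let data = p.read();
    if data.is_empty() {
        PersistentState::default() // first boot: nothing persisted yet
    } else {
        serde_json::from_str(&data).unwrap()
    }
}
```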
## Network simulation
The test harness interposes a network proxy between clients and servers. The proxy can:
- Drop requests: simulates request loss
- Drop replies: simulates reply loss (server processed the request, but client doesn’t know)
- Add delays: simulates network latency
- Block specific clients: simulates network partitions
Tests run in two network modes:
- Reliable (`NetworkConfig::reliable()`): no drops or delays
- Unreliable (`NetworkConfig::unreliable()`): a 50% drop rate for both requests and replies, plus random delays
Your implementation must handle both correctly.