
Part 4: Fault-Tolerant KV Service


In this last part, you will build a fault-tolerant key-value service by combining your Raft implementation with a replicated state machine (RSM). The service maintains identical KV databases across multiple servers and continues operating as long as a majority of servers are alive.

Architecture

The system has three layers:

  1. Client: Sends get/put RPCs to servers; retries on failure
  2. RSM: Submits operations to Raft, applies committed ops to KV state
  3. Raft: Replicates operations across the cluster

Each server runs an RSM that owns a Raft peer internally. The RSM receives KV RPCs from clients, submits them as commands to Raft, and applies committed commands to its local KV state. Since all servers apply the same commands in the same order, their KV states remain identical.

Replicated State Machine

Implement the RSM in src/kv_raft/rsm.rs.

The starter dispatch method currently forwards every RPC to the inner Raft. You will need to define the KV RPC request/reply types you need (for example GetArgs, GetReply, PutArgs, and PutReply), route "get" and "put" to your RSM handlers, and keep forwarding unknown RPCs to the inner Raft for peer communication.

Tasks

  1. Submit operations to Raft. When the RSM receives a get or put, serialize the operation and call self.raft.submit(...). If submit returns None, return ReplyErr::WrongLeader in the RPC reply.

  2. Wait for commitment. After submit succeeds, poll self.raft.get_committed(index) until the command at the returned index is committed. Use a loop with a short sleep, such as tokio::time::sleep(Duration::from_millis(10)).await; keep polling sleeps in the 10-50ms range.

  3. Apply committed operations. Maintain your own KV state, and when a committed command matches the one you submitted, apply it and return the result.

  4. Handle leadership changes. If the Raft leader changes while you are waiting for a commit (e.g., a different command appears at your index, or too much time passes), return ReplyErr::WrongLeader in the RPC reply so the client retries on another server.

All operations must go through the Raft log, including gets. A get that bypasses Raft might read stale data from a server that has been partitioned from the majority. Submitting gets to Raft ensures the server is still the leader at the time of the read. Every RSM peer must apply all committed log entries to its KV state in order, not just the ones it submitted, so the replicas stay in sync. When you see a committed entry that another peer submitted, apply it to your KV state but don’t return a result for it (no one on this server is waiting for it).

Hints

  • You need to encode your KV operations into the command string passed to raft.submit(). One approach is to serialize the structs as JSON.
  • Be careful about the gap between submit and commitment. The leader can change, and a different command might be committed at the index you were given.

Client

Implement the cluster-aware client in src/kv_raft/client.rs. The client has an RpcClient for each RSM server.

Tasks

  1. Track the leader. Remember which server responded successfully last time and try it first.
  2. Retry on ReplyErr::WrongLeader. Try the next server.
  3. Retry on network failure. Try the next server.
  4. Handle Err(KVError::Maybe) correctly. Use the same logic as in Part 1.

Unlike Part 1, the client now cycles through multiple servers. When a server returns ReplyErr::WrongLeader or the RPC fails, move on to the next server (wrapping around).

Testing

cargo test --test kvraft_test test_basic_kvraft -- --test-threads=1
cargo test --test kvraft_test test_concurrent_kvraft -- --test-threads=1
cargo test --test kvraft_test test_leader_failure_kvraft -- --test-threads=1
cargo test --test kvraft_test test_persist_kvraft -- --test-threads=1
cargo test --test kvraft_test test_partition_kvraft -- --test-threads=1
cargo test --test kvraft_test test_unreliable_kvraft -- --test-threads=1
cargo test --test kvraft_test test_many_partitions -- --test-threads=1
cargo test --test kvraft_test test_persist_concurrent_kvraft -- --test-threads=1
cargo test --test kvraft_test test_persist_partition_kvraft -- --test-threads=1
cargo test --test kvraft_test test_lock -- --test-threads=1