Part 3: Raft
In this part, you will implement the Raft consensus algorithm. Raft maintains a replicated log across a cluster of servers, ensuring that all servers agree on the same sequence of commands even when some servers crash or become partitioned.
Read the Raft paper thoroughly before writing any code. Sections 5 and 6, and Figure 2 are most important for this homework.
Your implementation goes in src/kv_raft/raft.rs.
Raft interface
The starter code defines three public methods. Please do not change these signatures as they are required for testing.
/// Returns (current_term, is_leader).
pub fn get_state(&self) -> (u64, bool);
/// Submits a command to the leader's log. Returns `Some((index, term))`
/// if this peer is the leader, or `None` if not.
/// Index is 1-based.
pub fn submit(&self, command: &str) -> Option<(usize, u64)>;
/// Checks if the entry at 1-based index has been committed.
/// Returns the command string if committed, None otherwise.
pub fn get_committed(&self, index: usize) -> Option<String>;
Your Raft must also implement the Server trait to handle RequestVote and AppendEntries RPCs from peers.
Leader election
Implement Raft leader election and heartbeats.
Tasks
Raft state. Add fields to the
Raftstruct for the state described in Figure 2 of the paper. You will need to consider synchronisation here, e.g. withMutexorRwLock. The public Raft methods above are synchronous, so any state they lock must be accessible without.await.RequestVoteRPC. Define the needed structs. Implement the handler per Figure 2. Remember to deriveSerializeandDeserialize.AppendEntriesRPC. Define the needed structs structs. For now, the handler only needs to process heartbeats (no log entries yet).Election timeout. Spawn a background task that triggers an election if no heartbeat is received within a randomized timeout. Use
tokio::spawnandtokio::time::sleep. You should use randomised election timeouts, in a range of 200-400ms. If you poll for timeout expiration in a loop, keep each sleep in the 10-50ms range. The tests require a new leader to be elected within 5 seconds of the old leader failing.Heartbeats. When a server becomes leader, it should periodically send empty
AppendEntriesRPCs to all peers to maintain authority. A heartbeat interval around 50ms is reasonable.Implement
get_state. Before implementing this, otherwise your tests will fail.
Hints
- Raft peers should communicate only via RPC (through
self.peers). You should useself.peers[i].call::<Args, Reply>("request_vote", &args).awaitto send RPCs, and recall from previous parts thatcallreturns anOption. - You must send
RequestVoteRPCs in parallel when you start an election. Start elections by incrementing the term, voting for yourself, and sending RequestVote RPCs to all peers in parallel. You will findtokio::spawnuseful.
Testing
cargo test --test raft_test test_initial_election -- --test-threads=1
cargo test --test raft_test test_re_election -- --test-threads=1
cargo test --test raft_test test_many_elections -- --test-threads=1
Log replication
Implement log entry appending and commitment.
Tasks
submitandget_committed. If this server is the leader, append the command to the log and returnSome((index, term)), whereindexis 1-based. If not the leader, returnNone.Update
AppendEntriesto work with log entries. The leader maintainsnextIndexandmatchIndexfor each peer (Figure 2) and sends entries that the follower is missing.Follower log handling. Followers must check
prevLogIndexandprevLogTermas described in Figure 2. On conflict, reject the AppendEntries so the leader can back up.Commitment. The leader commits an entry when a majority of servers have replicated it.
Election restriction. See Section 5.4.1 of the paper.
Note that logs are 1-based. You may use index 0 as a dummy entry. submit should return Some((index, term)) with a 1-based index.
Hints
- You should never hold a lock across an RPC call.
- When the leader discovers a follower’s log is behind, it needs to back up
nextIndex. You may just decrement by one here, though a faster approach is outlined on pages 7-8 of the paper (with a grey vertical line).
Testing
cargo test --test raft_test test_basic_agree -- --test-threads=1
cargo test --test raft_test test_follower_failure -- --test-threads=1
cargo test --test raft_test test_leader_failure -- --test-threads=1
cargo test --test raft_test test_fail_no_agree -- --test-threads=1
cargo test --test raft_test test_fail_agree -- --test-threads=1
cargo test --test raft_test test_concurrent_starts -- --test-threads=1
cargo test --test raft_test test_rejoin -- --test-threads=1
cargo test --test raft_test test_backup -- --test-threads=1
cargo test --test raft_test test_partition -- --test-threads=1
Extra Credit: Persistence
Implement state persistence so that Raft servers can recover after crashes.
Tasks
Identify persistent state. Figure 2 specifies which state must be persisted:
currentTerm,votedFor, andlog.Persist on every change. Whenever any of these values change, serialize them and save to the persister.
Restore on startup. In
Raft::new, read fromself.persister.read(). If the string is non-empty, deserialize and restore the state.
Hints
- You may find that creating a helper method to be useful.
serde_json::to_stringandserde_json::from_strare the easiest way to serialize/deserialize.- The
Persisterwrites atomically (write to temp file, then rename), so you don’t need to worry about partial writes.
Testing
cargo test --test raft_test test_persist -- --test-threads=1
cargo test --test raft_test test_figure8 -- --test-threads=1
cargo test --test raft_test test_reliable_churn -- --test-threads=1
cargo test --test raft_test test_unreliable_churn -- --test-threads=1
To run all Raft tests:
cargo test --test raft_test -- --test-threads=1
The full Raft test suite should complete in under 120 seconds.