Part 3: Raft

In this part, you will implement the Raft consensus algorithm. Raft maintains a replicated log across a cluster of servers, ensuring that all servers agree on the same sequence of commands even when some servers crash or become partitioned.

Read the Raft paper thoroughly before writing any code. Sections 5 and 6, and Figure 2 are most important for this homework.

Your implementation goes in src/kv_raft/raft.rs.

Raft interface

The starter code defines three public methods. Please do not change these signatures as they are required for testing.

/// Returns (current_term, is_leader).
pub async fn get_state(&self) -> (u64, bool);

/// Submits a command to the leader's log. Returns `Some((index, term))`
/// if this peer is the leader, or `None` if not.
/// Index is 1-based.
pub async fn submit(&self, command: &str) -> Option<(usize, u64)>;

/// Checks if the entry at 1-based index has been committed.
/// Returns the command string if committed, None otherwise.
pub async fn get_committed(&self, index: usize) -> Option<String>;

Your Raft must also implement the Server trait to handle RequestVote and AppendEntries RPCs from peers.

Leader election

Implement Raft leader election and heartbeats.

Tasks

Raft state. Add fields to the Raft struct for the state described in Figure 2 of the paper. You will need to consider synchronisation here, e.g. with Mutex or RwLock.
RequestVote RPC. Define the needed structs. Implement the handler per Figure 2. Remember to derive Serialize and Deserialize.
AppendEntries RPC. Define the needed structs structs. For now, the handler only needs to process heartbeats (no log entries yet).
Election timeout. Spawn a background task that triggers an election if no heartbeat is received within a randomized timeout. Use tokio::spawn and tokio::time::sleep. You should use randomised election timeouts, in a range of 200-400ms. If you poll for timeout expiration in a loop, keep each sleep in the 10-50ms range. The tests require a new leader to be elected within 5 seconds of the old leader failing.
Heartbeats. When a server becomes leader, it should periodically send empty AppendEntries RPCs to all peers to maintain authority. A heartbeat interval around 50ms is reasonable.
Implement get_state. Before implementing this, otherwise your tests will fail.

Hints

Raft peers should communicate only via RPC (through self.peers). You should use self.peers[i].call::<Args, Reply>("request_vote", &args).await to send RPCs, and recall from previous parts that call returns an Option.
You must send RequestVote RPCs in parallel when you start an election. Start elections by incrementing the term, voting for yourself, and sending RequestVote RPCs to all peers in parallel. You will find tokio::spawn useful.

Testing

cargo test --test raft_test test_initial_election -- --test-threads=1
cargo test --test raft_test test_re_election -- --test-threads=1
cargo test --test raft_test test_many_elections -- --test-threads=1

Log replication

Implement log entry appending and commitment.

Tasks

submit and get_committed. If this server is the leader, append the command to the log and return Some((index, term)), where index is 1-based. If not the leader, return None.
Update AppendEntries to work with log entries. The leader maintains nextIndex and matchIndex for each peer (Figure 2) and sends entries that the follower is missing.
Follower log handling. Followers must check prevLogIndex and prevLogTerm as described in Figure 2. On conflict, reject the AppendEntries so the leader can back up.
Commitment. The leader commits an entry when a majority of servers have replicated it.
Election restriction. See Section 5.4.1 of the paper.

Note that logs are 1-based. You may use index 0 as a dummy entry. submit should return Some((index, term)) with a 1-based index.

Hints

You should never hold a lock across an RPC call.
When the leader discovers a follower’s log is behind, it needs to back up nextIndex. You may just decrement by one here, though a faster approach is outlined on pages 7-8 of the paper (with a grey vertical line).

Testing

cargo test --test raft_test test_basic_agree -- --test-threads=1
cargo test --test raft_test test_follower_failure -- --test-threads=1
cargo test --test raft_test test_leader_failure -- --test-threads=1
cargo test --test raft_test test_fail_no_agree -- --test-threads=1
cargo test --test raft_test test_fail_agree -- --test-threads=1
cargo test --test raft_test test_concurrent_starts -- --test-threads=1
cargo test --test raft_test test_rejoin -- --test-threads=1
cargo test --test raft_test test_backup -- --test-threads=1
cargo test --test raft_test test_partition -- --test-threads=1

Extra Credit: Persistence

Implement state persistence so that Raft servers can recover after crashes.

Tasks

Identify persistent state. Figure 2 specifies which state must be persisted: currentTerm, votedFor, and log.
Persist on every change. Whenever any of these values change, serialize them and save to the persister.
Restore on startup. In Raft::new, read from self.persister.read(). If the string is non-empty, deserialize and restore the state.

Hints

You may find that creating a helper method to be useful.
serde_json::to_string and serde_json::from_str are the easiest way to serialize/deserialize.
The Persister writes atomically (write to temp file, then rename), so you don’t need to worry about partial writes.

Testing

cargo test --test raft_test test_persist -- --test-threads=1
cargo test --test raft_test test_figure8 -- --test-threads=1
cargo test --test raft_test test_reliable_churn -- --test-threads=1
cargo test --test raft_test test_unreliable_churn -- --test-threads=1

To run all Raft tests:

cargo test --test raft_test -- --test-threads=1

The full Raft test suite should complete in under 120 seconds.