Understanding Raft in Depth

Raft is a consensus algorithm for managing a replicated log.

References:

Paper: https://pdos.csail.mit.edu/6.824/papers/raft-extended.pdf

Animation: https://thesecretlivesofdata.com/raft/

Go library: https://pkg.go.dev/go.etcd.io/etcd@v3.3.27+incompatible/raft

Go implementation: https://github.com/etcd-io/raft

Project site: https://raft.github.io/

Raft problem decomposition

  • leader election
  • log replication
  • safety
  • membership changes

Raft roles

  • leader

    Handles requests from clients.

  • candidate

    Used to elect a new leader.

  • follower

    Issues no requests of its own; it only responds to requests from the leader and from candidates.

RPCs in Raft

RequestVote RPC

AppendEntries RPC

InstallSnapshot RPC
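
The three RPCs can be sketched as Go message types. The field names follow Figures 2 and 13 of the Raft paper; the Go type names, the `LogEntry` layout, and the `IsHeartbeat` helper are assumptions made for illustration and are not the etcd raft API.

```go
package main

import "fmt"

// LogEntry is one slot in the replicated log (hypothetical layout).
type LogEntry struct {
	Term    uint64 // term in which the leader received the entry
	Command []byte // opaque state machine command
}

// RequestVoteArgs is sent by a candidate to gather votes.
type RequestVoteArgs struct {
	Term         uint64 // candidate's term
	CandidateID  int    // candidate requesting the vote
	LastLogIndex uint64 // index of the candidate's last log entry
	LastLogTerm  uint64 // term of the candidate's last log entry
}

// RequestVoteReply is the voter's answer.
type RequestVoteReply struct {
	Term        uint64 // voter's current term, so the candidate can update itself
	VoteGranted bool
}

// AppendEntriesArgs is sent by the leader to replicate entries.
type AppendEntriesArgs struct {
	Term         uint64
	LeaderID     int
	PrevLogIndex uint64     // index of the entry immediately preceding Entries
	PrevLogTerm  uint64     // term of that preceding entry
	Entries      []LogEntry // empty for a heartbeat
	LeaderCommit uint64     // leader's commit index
}

// AppendEntriesReply is the follower's answer.
type AppendEntriesReply struct {
	Term    uint64
	Success bool // true if the follower's log matched PrevLogIndex/PrevLogTerm
}

// InstallSnapshotArgs is sent by the leader to ship a snapshot chunk.
type InstallSnapshotArgs struct {
	Term              uint64
	LeaderID          int
	LastIncludedIndex uint64 // the snapshot replaces all entries up to this index
	LastIncludedTerm  uint64
	Offset            uint64 // byte offset of this chunk in the snapshot file
	Data              []byte
	Done              bool // true if this is the last chunk
}

// InstallSnapshotReply is the follower's answer.
type InstallSnapshotReply struct {
	Term uint64
}

// IsHeartbeat reports whether an AppendEntries call carries no entries.
func (a AppendEntriesArgs) IsHeartbeat() bool { return len(a.Entries) == 0 }

func main() {
	hb := AppendEntriesArgs{Term: 3, LeaderID: 1}
	fmt.Println(hb.IsHeartbeat())
}
```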

Leader election

When servers start up, they all begin as followers.
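
A minimal sketch of the role transitions, assuming the rules from the Raft paper: a server starts as a follower, becomes a candidate when its election timeout fires, becomes leader once it holds votes from a majority, and steps down when it sees a higher term. The `Server` type and its method names are hypothetical, and the sketch does not deduplicate votes or model timers and networking.

```go
package main

import "fmt"

// Role is the state a Raft server is currently in.
type Role int

const (
	Follower Role = iota
	Candidate
	Leader
)

func (r Role) String() string {
	switch r {
	case Follower:
		return "follower"
	case Candidate:
		return "candidate"
	case Leader:
		return "leader"
	}
	return "unknown"
}

// Server holds only the election-related state (hypothetical type).
type Server struct {
	ID          int
	Role        Role
	CurrentTerm uint64
	VotedFor    int // -1 means no vote cast in the current term
	votes       int // votes received in the current election
}

// NewServer returns a server in its initial state: a follower in term 0.
func NewServer(id int) *Server {
	return &Server{ID: id, Role: Follower, VotedFor: -1}
}

// OnElectionTimeout starts a new election: bump the term and vote for self.
func (s *Server) OnElectionTimeout() {
	if s.Role == Leader {
		return // leaders do not time out on themselves
	}
	s.Role = Candidate
	s.CurrentTerm++
	s.VotedFor = s.ID
	s.votes = 1
}

// OnVoteGranted records one granted vote and promotes the candidate
// once it holds a strict majority of clusterSize.
func (s *Server) OnVoteGranted(clusterSize int) {
	if s.Role != Candidate {
		return
	}
	s.votes++
	if s.votes > clusterSize/2 {
		s.Role = Leader
	}
}

// OnHigherTerm reverts to follower when a message carries a newer term.
func (s *Server) OnHigherTerm(term uint64) {
	if term > s.CurrentTerm {
		s.CurrentTerm = term
		s.Role = Follower
		s.VotedFor = -1
	}
}

func main() {
	s := NewServer(1)
	fmt.Println(s.Role)
	s.OnElectionTimeout()
	s.OnVoteGranted(3) // second vote in a 3-node cluster is a majority
	fmt.Println(s.Role, s.CurrentTerm)
}
```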

Log replication

Raft guarantees that committed entries are durable and will eventually be executed by all of the available state machines.

A log entry is committed once the leader that created the entry has replicated it on a majority of the servers.
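
The majority condition can be sketched as a small helper. Assuming the leader tracks, for every server including itself, the highest log index known to be stored there (the paper calls this `matchIndex`), the highest index present on a majority is the middle element of that list sorted in descending order. The function name `majorityMatchIndex` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// majorityMatchIndex returns the highest log index that is stored on a
// strict majority of servers. matchIndex holds one value per server,
// the leader's own last index included.
func majorityMatchIndex(matchIndex []uint64) uint64 {
	if len(matchIndex) == 0 {
		return 0
	}
	// Copy so the caller's slice is not reordered.
	sorted := append([]uint64(nil), matchIndex...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })
	// With n servers, positions 0..n/2 form a strict majority, and every
	// one of them holds at least sorted[n/2].
	return sorted[len(sorted)/2]
}

func main() {
	// Five servers: index 4 is on three of them, index 5 only on two.
	fmt.Println(majorityMatchIndex([]uint64{5, 5, 4, 2, 1}))
}
```

Being stored on a majority is necessary for commitment; the Safety section adds a further restriction for entries from earlier terms.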

Safety

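Raft uses a simpler approach: it guarantees that all committed entries from previous terms are present on each new leader from the moment of its election, without needing to transfer those entries to the leader. Log entries therefore flow in only one direction, from the leader to the followers, and a leader never overwrites existing entries in its own log.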

Raft uses the voting process to prevent a candidate from winning an election unless its log contains all committed entries.

The RequestVote RPC implements this restriction: the RPC includes information about the candidate’s log, and the voter denies its vote if its own log is more up-to-date than that of the candidate.
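
The comparison behind this restriction can be sketched as follows. Per the Raft paper, the log whose last entry has the later term is more up-to-date; if the last terms are equal, the longer log is more up-to-date. The function name `candidateLogIsUpToDate` is hypothetical, and a real voter would also check the candidate's term and whether it has already voted in that term.

```go
package main

import "fmt"

// candidateLogIsUpToDate reports whether the candidate's log is at least
// as up-to-date as the voter's. The voter denies its vote when this
// returns false.
func candidateLogIsUpToDate(candLastTerm, candLastIndex, voterLastTerm, voterLastIndex uint64) bool {
	if candLastTerm != voterLastTerm {
		// Different last terms: the later term wins regardless of length.
		return candLastTerm > voterLastTerm
	}
	// Same last term: the longer (or equally long) log wins.
	return candLastIndex >= voterLastIndex
}

func main() {
	// The candidate's log is longer but ends in an older term, so the vote is denied.
	fmt.Println(candidateLogIsUpToDate(2, 10, 3, 4))
}
```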

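Raft never commits log entries from previous terms by counting replicas. Only log entries from the leader's current term are committed by counting replicas; once an entry from the current term has been committed in this way, all prior entries are committed indirectly because of the Log Matching Property.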
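
A hedged sketch of how a leader might apply this rule when advancing its commit index: find the largest index N above the current commit index that is stored on a majority and whose entry carries the leader's current term. The `Entry` type and the function `advanceCommitIndex` are hypothetical; log indices are 1-based, so entry N lives at `log[N-1]`. The example values are modeled on the scenario in Figure 8 of the Raft paper.

```go
package main

import "fmt"

// Entry keeps only the term; the command is omitted in this sketch.
type Entry struct {
	Term uint64
}

// advanceCommitIndex returns the new commit index for a leader.
// matchIndex holds one value per server, the leader included.
func advanceCommitIndex(log []Entry, matchIndex []uint64, commitIndex, currentTerm uint64) uint64 {
	for n := uint64(len(log)); n > commitIndex; n-- {
		if log[n-1].Term != currentTerm {
			// Entries from earlier terms are never committed by counting.
			// Terms never decrease along the log, so everything below n
			// is from an earlier term as well.
			break
		}
		count := 0
		for _, m := range matchIndex {
			if m >= n {
				count++
			}
		}
		if count > len(matchIndex)/2 {
			return n // commits n and, indirectly, every entry before it
		}
	}
	return commitIndex
}

func main() {
	log := []Entry{{Term: 1}, {Term: 2}, {Term: 4}}

	// Index 2 (term 2) is on three of five servers, but the leader is in
	// term 4, so the commit index stays at 1.
	fmt.Println(advanceCommitIndex(log, []uint64{3, 2, 2, 1, 1}, 1, 4))

	// Once index 3 (term 4) reaches a majority, indices 2 and 3 commit together.
	fmt.Println(advanceCommitIndex(log, []uint64{3, 3, 3, 1, 1}, 1, 4))
}
```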