trckpd

SWIM - Scalable Weakly-consistent Infection-style Process Group Membership Protocol

distributed system

Consensus ?

In most simple term a Distributed System(DS) is a collection of computers that perform one task and puts up one interface or access point in front of the clients.

The automatic assumption from this is - all participating members in a DS will have same state at all time. This state is agreed upon by all members, or at least by majority/quorum. Thus the members form a consensus.

Why ?

The famous consensus protocols, Paxos and Raft implement strong consistencyof system state. Which implies that the correct state of the system is replicated across the members. Usually there is a leader, whose state is the agreed upon or correct state of the system. All modifications of system state happen via the leader. Replication of state machine is done via heart-beat/ping messages. This results in network load to grow at quadratic order of the group size [\(\begin{align*} \theta(n^2), n \end{align*}\) = group size] - this is a major scalability issue.

How ?

SWIM tries to resolve scalability issue by weakening the consistency requirement, it adopts eventual consistency. This implies that members can be at different state for a short (probabilistically bound) period of time, while tries to build consensus over a past state instead of current state. SWIM provides bound on the latency on convergence of state and group membership. SWIM makes crucial

and provides a deterministic bound on the time for

using an infection style message delivery instead of traditional heart-beat messaging. In traditional

Infection style consensus protocols address this scalability issue at expense of consistency. It weakens the consistency requirement by allowing group members to be at different state at any point of time and eventually build a consensus on a past state rather than on the current state.

Reference

Replicated State Machine

Strong consistency in parallel programming ensures all the participant entities see the same order of access. Therefore there is only one correct state of the system promising strong consistency.Wiki