Chapter 11
Cluster Membership

The membership of a cluster is the set of daemons managed by a common coordinator. Membership is dynamic: it changes as daemons join and leave the cluster. Members fall into two groups: online daemons, which actively service client requests, and offline daemons, which are known to the coordinator but are not configured to serve client requests. Daemons may be categorized as offline for a variety of reasons, including planned maintenance, process failure, and network partitions.

This chapter describes how HyperDex manages cluster membership. Section 11.1 introduces cluster membership in the form of a short tutorial. Section 11.2 details the state machine the coordinator uses internally to track each daemon. Finally, Section 11.3 provides a reference-style guide to the commands available for manually managing cluster membership.

11.1 Tutorial

11.2 State Transitions

Unknown
The daemon is not known to the coordinator. It either has not yet associated with the coordinator, or has been explicitly disassociated from the coordinator.

Category: offline

Assigned
The daemon has registered with the coordinator, but has taken no further action. Once initialized, the daemon will transition from this state to Available. Daemons should be in this state only briefly, the first time they are started. If a daemon remains in this state for an extended period of time, something has likely gone wrong.

Category: offline

Not Available
The daemon is in an error state where it cannot serve requests. Daemons that are Not Available will be temporarily unmapped from the hyperspace until they become Available again, at which point they will be reintegrated into the hyperspace. Not Available is usually a temporary condition that the daemon will rectify when it comes back online. Should the daemon be permanently Not Available, it may be desirable to kill and forget the daemon.

Category: offline

Available
The daemon is fully operational and accepting client requests. Daemons always work with the coordinator to converge on this state in accordance with Figure 11.1.

Category: online

Shutdown
The daemon shut down cleanly, usually in response to a SIGINT or SIGTERM, and the coordinator acknowledged the clean shutdown.

Category: offline

Killed
The daemon was explicitly killed by the administrator. Killed is permanent and cannot be undone. Daemons should self-terminate upon transitioning to Killed.

Category: offline
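
The states and categories listed above can be summarized in a short sketch. This is an illustration only, not HyperDex's implementation: the names DaemonState, category, and partition are hypothetical, and the daemon identifiers in the example are made up. The one rule it encodes comes from the list above, namely that Available is the only online state.

```python
from enum import Enum


class DaemonState(Enum):
    """Daemon states as named in this section (hypothetical Python mirror)."""
    UNKNOWN = "Unknown"
    ASSIGNED = "Assigned"
    NOT_AVAILABLE = "Not Available"
    AVAILABLE = "Available"
    SHUTDOWN = "Shutdown"
    KILLED = "Killed"


def category(state):
    """Return "online" or "offline"; only Available counts as online."""
    return "online" if state is DaemonState.AVAILABLE else "offline"


def partition(members):
    """Split a {daemon_id: DaemonState} mapping into (online, offline) id lists."""
    online = sorted(d for d, s in members.items() if category(s) == "online")
    offline = sorted(d for d, s in members.items() if category(s) == "offline")
    return online, offline


if __name__ == "__main__":
    # Hypothetical daemon identifiers, chosen only for illustration.
    members = {
        1: DaemonState.AVAILABLE,
        2: DaemonState.NOT_AVAILABLE,
        3: DaemonState.SHUTDOWN,
    }
    online, offline = partition(members)
    print("online:", online)
    print("offline:", offline)
```

Deriving the category from the state, rather than storing it separately, keeps the two from drifting apart when a daemon changes state.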

Figure 11.1 shows the state machine that the coordinator adheres to when changing members’ states.


Figure 11.1: XXX

1. Registration
2. Post Registration
3. Start
4. Stop
5. Alive
6. Dead
7. Kill
8. Forget

11.3 Command Reference