Observers
How ZooKeeper Observers enable scaling to large numbers of clients without hurting write performance, by using non-voting ensemble members that can be added freely without affecting quorum.
Although ZooKeeper performs very well by having clients connect directly to voting members of the ensemble, this architecture makes it hard to scale out to huge numbers of clients. The problem is that as we add more voting members, write performance drops. This is because a write operation requires the agreement of a majority (more than half) of the nodes in an ensemble, so the cost of a vote grows significantly as more voters are added.
We have introduced a new type of ZooKeeper node called an Observer which helps address this problem and further improves ZooKeeper's scalability. Observers are non-voting members of an ensemble which only hear the results of votes, not the agreement protocol that leads up to them. Other than this simple distinction, Observers function exactly the same as Followers — clients may connect to them and send read and write requests to them. Observers forward these requests to the Leader like Followers do, but they then simply wait to hear the result of the vote. Because of this, we can increase the number of Observers as much as we like without harming vote performance.
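To make the quorum arithmetic concrete, here is a small illustrative sketch (not ZooKeeper code; the function names are ours) showing that a write must be acknowledged by a majority of voters, and that Observers never enter that count:

```python
def quorum_size(voters: int) -> int:
    """A write commits only once a majority of voting members ack it."""
    return voters // 2 + 1

def quorum_with_observers(voters: int, observers: int) -> int:
    """Observers are excluded from the vote, so they never change the quorum."""
    return quorum_size(voters)

# Adding voters raises the number of acks every write must wait for...
assert quorum_size(3) == 2
assert quorum_size(5) == 3
assert quorum_size(7) == 4

# ...but adding Observers leaves the quorum, and hence the per-write
# coordination cost, unchanged.
assert quorum_with_observers(3, 100) == quorum_size(3)
```

This is why an ensemble can take on hundreds of Observers to absorb client load while keeping the voting set small.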
Observers have other advantages. Because they do not vote, they are not a critical part of the ZooKeeper ensemble — they can fail or be disconnected from the cluster without harming the availability of the ZooKeeper service. The benefit to the user is that Observers may connect over less reliable network links than Followers. In fact, Observers may be used to talk to a ZooKeeper server from another data center. Clients of an Observer will see fast reads, as all reads are served locally, and writes result in minimal network traffic since the number of messages required without the vote protocol is smaller.
How to Use Observers
Setting up a ZooKeeper ensemble that uses Observers requires just two changes to your config files.
First, in the config file of every node that is to be an Observer, add:
```
peerType=observer
```

This tells ZooKeeper that the server is to be an Observer.
Second, in every server config file, append :observer to the server
definition line of each Observer. For example:
```
server.1=localhost:2181:3181:observer
```

This tells every other server that server.1 is an Observer and that they
should not expect it to vote. This is all the configuration needed to add
an Observer to your ZooKeeper cluster. You can then connect to it as you
would an ordinary Follower:
```
$ bin/zkCli.sh -server localhost:2181
```

where localhost:2181 is the hostname and port of the Observer as
specified in every config file. You should see a command line prompt
through which you can issue commands like ls to query the ZooKeeper service.
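Putting both changes together, a complete configuration for an ensemble of three voting servers plus one Observer might look like the following sketch; the hostnames, ports, and dataDir are placeholders, not values prescribed by ZooKeeper:

```
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
server.4=zoo4:2888:3888:observer
```

Every server in the ensemble carries the `:observer` suffix on the server.4 line, while server.4's own config file additionally contains `peerType=observer`.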
How to Use Observer Masters
Observers function simply as non-voting members of the ensemble; they share the Learner interface with Followers and differ only slightly in their internal pipeline. Both maintain a connection to the Leader over the quorum port, through which they learn of all new proposals on the ensemble.
By default, Observers connect to the Leader along its quorum port to learn of new proposals. There are benefits to allowing Observers to connect to Followers instead as a means of plugging into the commit stream. This shifts the burden of supporting Observers off the Leader, allowing it to focus on coordinating the commit of writes. The result is better performance when the Leader is under high load — particularly high network load such as after a leader election when many Learners need to sync. It also reduces the total number of network connections maintained on the Leader when there are many Observers. Activating this feature allows the overall number of Observers to scale into the hundreds, and improves Observer availability since a large number of Observers finish syncing and start serving client traffic faster.
This feature is activated by adding the following entry to the server config file. It instructs Observers to connect to peers (Leaders and Followers) on the specified port, and instructs Followers to create an ObserverMaster thread to listen and serve on that port:
```
observerMasterPort=2191
```

Example Use Cases
Wherever you wish to scale the number of clients of your ZooKeeper ensemble, or where you wish to insulate the critical part of an ensemble from the load of dealing with client requests, Observers are a good architectural choice. Two example use cases are:
- Datacenter bridge: Forming a ZooKeeper ensemble between two datacenters is problematic because high variance in latency between datacenters can lead to false-positive failure detection and partitioning. However, if the ensemble runs entirely in one datacenter and the second datacenter runs only Observers, partitions are not problematic — the ensemble remains connected. Clients of the Observers can still read the service's state and issue write proposals, which are forwarded to the Leader.
- Message bus integration: Some use cases call for ZooKeeper as a component of a persistent reliable message bus. Observers provide a natural integration point: a plug-in mechanism can attach the stream of proposals an Observer sees to a publish-subscribe system, without loading the core ensemble.
Quorums
How to configure ZooKeeper quorums: hierarchical quorums using server groups and weights, and Oracle Quorum for increasing availability in two-instance clusters.
Dynamic Reconfiguration
How to use ZooKeeper's dynamic reconfiguration support (available since 3.5.0) to change ensemble membership, server roles, ports, and the quorum system at runtime without service interruption.