5 min read

NATS Weekly #32

NATS Weekly #32

Week of June 20 - 26, 2022

ğŸ—ž Announcements, writings, and projects

A short list of  announcements, blog posts, projects updates and other news.

⚡Releases

Official releases from NATS repos and others in the ecosystem.

⚙️ Projects️

Open Source and commercial projects (or products) that use or extends NATS.

💬 Discussions

Github Discussions from various NATS repositories.

💡 Recently asked questions

Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.

What are the options for fault tolerance, availability, and disaster recovery?

If you are new to NATS, or specifically new to JetStream, you will likely hear or read terms like R1 and R3, replicas, placement, mirrors, and sources. The question is, what do these all mean in JetStream?

To level-set, I want to define these terms as it applies to streams:

  • fault tolerance - a stream can tolerate a partial failure and still support reads and writes without downtime (client retries may be necessary)
  • availability - stream data is available to be consumed in (at least) a partially consistent state
  • disaster recovery - a stream can be recreated from backups

Fault tolerance and availability are often mentioned in conjuction since being fault tolerant implies availability. However, the reverse is not necessarily true and we will see an example below of this.

A JetStream stream persists messages in-memory or as files on disk, depending on the storage type chosen. By default, a stream has one replica, that is one copy of the data (referred to as R1 for shorthand) and is placed on any one of the available JetStream-enabled servers in the cluster.

Given an R1 stream with in-memory storage, if the server restarts, the stream data will be lost. For a file-based storage, the data will not be lost, but the stream will be unavailable for reads or writes (publishing and consumption) until the server comes back online. However, if the disk underlying the server fails or gets corrupted, the data may not be recoverable.

With one replica, the stream can not tolerant a partial failure and would result in being unavailable if the replica itself had a fault or the server became unreachable due to a network partition.

To acheive a fault tolerant stream, at least three replicas (R3) are required. Stream (and consumer) replication is done via Raft consensus which means when a new message is stored, it is guaranteed at least a quorum (majority) of the replicas acknowledged the write. This means an R3 stream can tolerate one replica being unavailable while still accepting reads and writes (since two are still available which are the majority). Likewise for an R5 stream, two replicas could be unavailable and progress can still be made (no more than five replicas are supported in JetStream since there are diminishing returns on availability while performance gets a worse).

What about an R2? or R4?  Since consensus relies on a majority to be available, an even numbers of replicas is generally not useful from an availability standpoint. For example, with R2, if the leader becomes unavailable the other replica won't be able to make progress on its own. That said, an R2 stream does at least store an additional (consistent) copy of the data so it is technically more tolerant to storage faults or data corruption.

To decrease the probability of becoming unavailable (due to a fault or network partition) you can (optionally) be deliberate with the placement of the replicas across a cluster. This is done using server tags which essentially matches the tags defined configured on a stream to a set of servers the replicas can be placed on, for example across multiple availability zones.

In addition to consensus-based replication internal to streams, JetStream also supports asynchronous replication of streams in the form of mirrors and sourcing streams.

Mirrors are commonly used as a lightweight, eventually consistent, copy of a primary stream in some other locality, such as a different region or on a leaf node. The reason is because if the primary stream is unavailable, the mirror can still be consumed from. Importantly, the configuration of this stream can be different from the source, such as in-memory and R1. If the mirror happens to fail, it can always be re-sourced from the primary.

A common use case for sourcing, is to aggregate messages across streams into one. Consumers can then be created on the aggregate stream rather than requiring multiple consumers across the source streams. Again, the configuration of this aggregate stream can be different from the sources.

What about disaster recovery?

In this context, a disaster boils down to unrecoverable data loss in all or a majority of the stream's replicas. In practice this should be, of course, extremely rare especially with an R3 or R5 setup. However, it can happen due to hardware failure, operator error, or a bug in the software, be the client code or NATS itself (of course there is extensive testing in NATS.. but anything is possible).

The premise of recovery boils down to having a reliable copy of the data that is not affected by the disaster that caused the issue in the first place. There are a few options depending on the kind of failure/error being guarded against:

  • Stream mirrors in other region
  • Consumer that writes batches of messages to object storage
  • Offline backups of stream data

This list is in order of most-realtime/minimal data loss to least, but it is also in order of decreased risk if there are software bugs (application or NATS).

How can you create account and user JWTs programmatically?

Thanks to Deepak Sah for providing sample code and Matthias Hanel for providing guidance and expertise.

The primary (and recommended) way to create and manage accounts and users is using the nsc command-line tool. However, in some applications and use cases, it may be desirable to programmatically create accounts or users on-demand as part of an application-level account/user workflow rather than out-of-band on the command line (however, shelling out to nsc from your program is another option).

Regardless of whether accounts or users need to be created programatically, the operator will likely always be created using nsc. You can read here how to bootstrap the operator and system account.

Once at least the operator is created, you can view the seed by using the following command. If you are using signing keys (which you should), be sure to copy that associated seed.

nsc list keys --show-seeds --operator

If you are managing accounts with nsc as well, then you can view those seeds with:

nsc list keys --show-seeds --accounts

Below are the basic structure of createAccount and createUser which are identical in process, only differing on what is being created and the seed that does the signing. The full implementation of these snippets is available here.

func createAccount(operatorSeed, accountName string) (string, error) {
    // Create the account public/private key pair.
    akp, _ := nkeys.CreateAccount()
    
    // Use only the public key for the JWT.
    apub, _ := akp.PublicKey()
    
    // Create account claims and set the name. Other claims
    // could be set here...
    ac := jwt.NewAccountClaims(apub)
    ac.Name = accountName

    // Load operator key pair.
    okp, _ := nkeys.FromSeed([]byte(operatorSeed))

    // Sign the claims and encode to a JWT string.
    ajwt, _ := ac.Encode(okp)

    return ajwt, nil
}

func createUser(accountSeed, userName string) (string, error) {
    // Create the user public/private key pair.
    ukp, _ := nkeys.CreateUser()
    
    // Use only the public key for the JWT.
    upub, _ := ukp.PublicKey()
    
    // Create user claims and set the name. Other claims
    // could be set here...
    uc := jwt.NewUserClaims(upub)
    uc.Name = userName

    // Load operator key pair.
    akp, _ := nkeys.FromSeed([]byte(accountSeed))

    // Sign the claims and encode to a JWT string.
    ujwt, _ := ac.Encode(akp)

    return ujwt, nil
}

If you would like more in-depth information with examples to any of these questions, please reach out on Slack or Twitter!