NATS Weekly #39

Week of August 8 - 14, 2022

🗞 Announcements, writings, and projects

A short list of announcements, blog posts, project updates, and other news.

💁 News

News or announcements that don’t fit into the other categories.

Synadia has an open position for an Operations Engineer!
(Very new) draft proposal for logical permissions and the concept of user roles. Feedback/upvotes welcome.

⚡ Releases

Official releases from NATS repos and others in the ecosystem.

nats-surveyor - v0.3.0

🎬 Media

Audio or video recordings about or referencing NATS.

Rethink Connectivity Episode 5: How to build a NATS Supercluster - Jeremy Saenz, Synadia

🧑‍🎓 Examples

New or updated examples on NATS by Example.

Push Consumers - Go

💡 Recently asked questions

Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.

What permissions are required to interact with a stream?

JetStream introduces an API which manifests as a set of subjects prefixed by $JS.API. For example, a large portion of the API is for managing streams and consumers: creating them, updating their configuration, viewing their state, and deleting them.

When using a NATS client library or the CLI, these subjects are abstracted away as commands, methods, or functions, such as the JetStreamManager in the Go client, which provides methods like AddStream and UpdateStream.

Although this convenience is baked into the clients, the same logical abstraction is not (yet) available when defining user permissions. Permissions rely on explicit subjects that a user is allowed (or denied) to publish or subscribe to.

Until some form of logical permissions is supported, the current solution is to specify the explicit JS subjects required for the desired permissions. Check out the in-progress listing of subjects corresponding to each logical name here. This will be incorporated into the reference documentation in the future.

As a quick example, if you want to allow a user to manage a stream with a pre-defined name such as ORDERS, permissions would be defined as follows.

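permissions: {
  # JetStream API calls are requests published to these subjects.
  publish: {
    allow: [
      "$JS.API.STREAM.CREATE.ORDERS",
      "$JS.API.STREAM.UPDATE.ORDERS",
      "$JS.API.STREAM.INFO.ORDERS",
      "$JS.API.STREAM.DELETE.ORDERS",
      "$JS.API.STREAM.PURGE.ORDERS",
      # etc..
    ]
  }
  # Replies to those requests arrive on the client's inbox.
  subscribe: {
    allow: ["_INBOX.>"]
  }
}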
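Since the subjects above follow a regular pattern, the allow list for any stream name can be generated rather than typed by hand. The following is a minimal sketch, not part of any NATS library: the helper name streamAdminSubjects is hypothetical, and it only covers the five operations shown above.

```go
package main

import "fmt"

// streamAdminSubjects is a hypothetical helper. It returns the JetStream API
// subjects for the five stream operations listed in the permissions example.
func streamAdminSubjects(stream string) []string {
	ops := []string{"CREATE", "UPDATE", "INFO", "DELETE", "PURGE"}
	subjects := make([]string, 0, len(ops))
	for _, op := range ops {
		subjects = append(subjects, fmt.Sprintf("$JS.API.STREAM.%s.%s", op, stream))
	}
	return subjects
}

func main() {
	// Print one subject per line, ready to paste into an allow list.
	for _, s := range streamAdminSubjects("ORDERS") {
		fmt.Printf("%q,\n", s)
	}
}
```

The output can be pasted into the allow list of a user's publish permissions.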

Another fairly common question is how to consume messages from multiple streams, since a consumer can only be bound to a single stream. There are some use cases where streams are created to support heavy writing (publishing) of messages and are located near the producer source. However, consuming those messages may not be as time sensitive, and the messages are more useful when consumed in aggregate. Many control plane use cases, including analytics, monitoring, and operational events, fall into this category.

There is a special stream type called a sourcing stream, which defines one or more streams it will source messages from into its own local copy. Imagine point-of-sale systems (say, one per store) logging transactions in a stream, with those streams then being sourced into a single stream for out-of-band/aggregate processing.

_, err := js.AddStream(&nats.StreamConfig{
	Name:      "POS-aggregate",
	Retention: nats.InterestPolicy,
	Replicas:  3,
	Storage:   nats.MemoryStorage,
	// Messages from each source stream are copied into this stream.
	Sources: []*nats.StreamSource{
		{Name: "POS-1"},
		{Name: "POS-2"},
		{Name: "POS-3"},
		// etc..
	},
})
if err != nil {
	log.Fatal(err)
}

The benefit of this approach is two-fold. First, any number of consumers can now be created and bound to this stream for processing these events, which takes that load off the source streams.

Second, as shown above, this stream can define its own retention policy, separate from the source streams. In this case, we are using an interest-based stream to keep the aggregate messages only until all interested consumers have processed them. This addresses the concern of maintaining a complete second copy of the data from the source streams: messages are retained while consumers still have interest in them, but not long-term.
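To make the interest-based retention described above more concrete, here is a toy in-memory model of the idea. It is only a sketch of the behavior as described here, not how the NATS server implements it, and all type and function names are made up for illustration.

```go
package main

import "fmt"

// interestStream is a toy model of interest-based retention: a message is
// kept only until every registered consumer has acknowledged it.
type interestStream struct {
	consumers []string
	pending   map[uint64]map[string]bool // seq -> consumers that have not acked yet
	nextSeq   uint64
}

func newInterestStream(consumers ...string) *interestStream {
	return &interestStream{consumers: consumers, pending: map[uint64]map[string]bool{}}
}

// publish assigns a sequence number and stores the message if any consumer
// is interested. With no consumers there is no interest, so nothing is kept.
func (s *interestStream) publish() uint64 {
	s.nextSeq++
	if len(s.consumers) == 0 {
		return s.nextSeq
	}
	waiting := make(map[string]bool, len(s.consumers))
	for _, c := range s.consumers {
		waiting[c] = true
	}
	s.pending[s.nextSeq] = waiting
	return s.nextSeq
}

// ack records an acknowledgement and drops the message once nobody is waiting.
func (s *interestStream) ack(consumer string, seq uint64) {
	waiting, ok := s.pending[seq]
	if !ok {
		return
	}
	delete(waiting, consumer)
	if len(waiting) == 0 {
		delete(s.pending, seq)
	}
}

// stored reports how many messages are still retained.
func (s *interestStream) stored() int { return len(s.pending) }

func main() {
	s := newInterestStream("analytics", "monitoring")
	seq := s.publish()
	s.ack("analytics", seq)
	fmt.Println(s.stored()) // prints 1: "monitoring" has not acked yet
	s.ack("monitoring", seq)
	fmt.Println(s.stored()) // prints 0: all consumers acked, message removed
}
```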

Similarly, this stream could still be replicated but be held in memory so that it does not take up any local disk (obviously these preferences will vary based on the use case). The point is that sourcing streams (and mirrors) are decoupled from the primary streams in terms of their configuration and consumption path.

Can a stream be placed across two clusters?

With the increasing number of folks deploying their first supercluster, a common question that arises is, “can I create a stream that spans two clusters?” The short answer is no, but there is a reason for this independent of the fact that gateway connections are designed differently than cluster routes 😉.

However, a related follow-up question is usually, “can I spread a stream out across multiple regions?” and the answer to that is a soft yes.

First, what do we mean by a stream spanning clusters? When creating a stream, you can define it to have one, three, or five replicas (two and four replicas often don’t make sense in this context). For R3 or R5 streams, the guarantee is that a quorum of replicas (2/3 or 3/5) receive and acknowledge a message on write (publish) before an acknowledgement is sent back to the client with an OK.
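The quorum sizes above come from simple majority arithmetic, which also shows why two and four replicas add little. The following sketch uses hypothetical helper names and assumes a plain majority quorum.

```go
package main

import "fmt"

// quorum returns the majority size for a given replica count.
func quorum(replicas int) int { return replicas/2 + 1 }

// tolerated returns how many replicas can be lost while a quorum remains.
func tolerated(replicas int) int { return replicas - quorum(replicas) }

func main() {
	for r := 1; r <= 5; r++ {
		fmt.Printf("R%d: quorum=%d, tolerated failures=%d\n", r, quorum(r), tolerated(r))
	}
}
```

Under this model, R2 tolerates no more failures than R1, and R4 tolerates no more than R3, while each needs an extra replica to acknowledge every write.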

Each replica should live on a separate node in the cluster so that if any one node goes down, the other replicas are unaffected.

Stream replication relies on Raft consensus, which is leader-based. This means that a client publishes to the leader (one network hop), and then the leader broadcasts the message out to the replicas (another hop) and waits until a quorum acks.

The recommendation for a NATS cluster deployment is within a single region. In general, nodes should be spread out across multiple availability zones (AZ) and replicas placed on disjoint AZs to achieve zonal fault tolerance. Since cross AZ latencies are quite small, this means the round-trip time (RTT) of a message publish can be on the order of low double-digit (possibly single-digit) milliseconds depending on the configuration.

So to answer the first question, a stream doesn’t span two clusters because the Raft group and its state are managed among the nodes in the cluster, but also because of latency considerations.

Although a single region is recommended, nodes in a cluster can technically be deployed across multiple regions. The latency will increase, but for rare use cases, and with lots of performance testing, this kind of deployment is possible. In general, it is recommended to keep the RTT between any two nodes below 200ms; otherwise, the consensus behavior will degrade.

Of course, if you have other interesting topology questions, feel free to reach out to discuss!