NATS Weekly #16

Week of February 28 - March 6, 2022

🗞 Announcements, writings, and projects

A short list of announcements, blog posts, project updates, and other news.

⚡Releases

Official releases from NATS repos and others in the ecosystem.

📖 Articles

💬 Discussions

GitHub Discussions from various NATS repositories.

💡 Recently asked questions

Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.

What happens when my NATS client disconnects?

When a NATS client connects, it specifies one or more URLs which represent a pool of servers to connect to. If connecting to a cluster, it is good practice to specify all of the node URLs (more on this below).

nats.Connect("n1.local:4222,n2.local:4222,n3.local:4222")

Regardless of how many URLs are specified, upon successful connection the server shares with the client all of the servers it is aware of, and the client adds them to its URL pool. This can be observed manually by using telnet to connect to the server and looking at the connect_urls field in the INFO line it sends:

$ telnet n1.local 4222

If you are not seeing this, or you see a set of IP addresses or internal hostnames (like Kubernetes service routes), ensure your server config has cluster routes properly set up and has client_advertise defined.
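To check the gossiped list programmatically rather than by eye, the INFO line can be parsed directly. The following is a minimal sketch using only the Go standard library; the `serverInfo` type and `parseConnectURLs` helper are hypothetical names for illustration, and the sample line is made up (a real server sends many more fields).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// serverInfo holds only the INFO fields this sketch cares about.
// (Hypothetical type; real INFO payloads contain many more fields.)
type serverInfo struct {
	ServerID    string   `json:"server_id"`
	ConnectURLs []string `json:"connect_urls"`
}

// parseConnectURLs extracts connect_urls from a raw "INFO {...}" protocol line.
func parseConnectURLs(line string) ([]string, error) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, "INFO ") {
		return nil, fmt.Errorf("not an INFO line: %q", line)
	}
	var info serverInfo
	payload := strings.TrimPrefix(line, "INFO ")
	if err := json.Unmarshal([]byte(payload), &info); err != nil {
		return nil, err
	}
	return info.ConnectURLs, nil
}

func main() {
	// Made-up sample line for illustration only.
	line := `INFO {"server_id":"EXAMPLE","connect_urls":["n1.local:4222","n2.local:4222","n3.local:4222"]}`
	urls, err := parseConnectURLs(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(urls) // prints [n1.local:4222 n2.local:4222 n3.local:4222]
}
```

If the returned slice is empty or contains addresses the client cannot reach, that points back at the cluster routes and client_advertise settings mentioned above.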

This gossiping of the other addressable servers means that a client has a pool of URLs to attempt reconnection with if a disconnect occurs. A disconnect can happen for two reasons. The first is a network issue that results in a ping-pong timeout, after which the client or server may close the connection. The second is simply that the server the client is connected to goes away for some reason, such as being restarted for a version upgrade or, for JetStream-enabled servers, migrated to new hardware.

When the client detects a disconnect, it will choose a random server URL from the pool (both client-provided and server-gossiped) and attempt to re-establish a connection.

If, however, the client fully disconnects and the client process restarts, only the URLs passed at startup will be used. As stated above, it is good practice to specify multiple URLs on connection. The reason is that if any one of the servers is offline, the client will try to connect to a different one. However, if only one URL is provided and that happens to be the server that is offline, the client won’t be able to connect.
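The difference between a live client's pool and a freshly restarted one can be sketched with a small model. This is a simplified illustration using only the standard library, not how the NATS client libraries are actually implemented; `serverPool`, `newServerPool`, `add`, and `pick` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// serverPool is a simplified, hypothetical model of a client's URL pool.
type serverPool struct {
	urls []string
	seen map[string]bool
}

// newServerPool starts a pool from the URLs passed at connect time.
func newServerPool(configured ...string) *serverPool {
	p := &serverPool{seen: make(map[string]bool)}
	p.add(configured...)
	return p
}

// add merges URLs (for example, gossiped ones) into the pool, skipping duplicates.
func (p *serverPool) add(urls ...string) {
	for _, u := range urls {
		if !p.seen[u] {
			p.seen[u] = true
			p.urls = append(p.urls, u)
		}
	}
}

// pick returns a random URL from the pool, or "" if the pool is empty.
func (p *serverPool) pick() string {
	if len(p.urls) == 0 {
		return ""
	}
	return p.urls[rand.Intn(len(p.urls))]
}

func main() {
	// Running client: one configured URL plus three gossiped by the server.
	live := newServerPool("n1.local:4222")
	live.add("n1.local:4222", "n2.local:4222", "n3.local:4222")
	fmt.Println("live pool size:", len(live.urls))
	fmt.Println("reconnect candidate:", live.pick())

	// Restarted process: gossiped URLs are gone, only the configured one remains.
	restarted := newServerPool("n1.local:4222")
	fmt.Println("restarted pool size:", len(restarted.urls))
}
```

In this model the restarted client has a single candidate, so if n1.local is the server that is offline there is nothing else to try, which is the case for passing every node URL at startup.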

What is the consumer sequence on a message?

A member on Slack posted an example output of nats consumer info:

Last Delivered Message: Consumer sequence: 35,262,726 Stream sequence: 4,895,650 Last delivery: -0.01s ago

Acknowledgment floor: Consumer sequence: 33,962,533 Stream sequence: 4,795,612 Last Ack: 0.26s ago

The observation was that the consumer sequence was somehow (significantly) higher than the stream sequence. The assumption is that a consumer sequence would be less than or equal to the stream sequence since it’s targeting a subset of the messages in the stream, e.g. it could be applying a subject filter.

However, the consumer sequence corresponds to the independent delivery and retries of messages to the consumer. Any time delivery of a message is attempted, the sequence counter increments.
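A toy simulation makes the divergence concrete. This is a hedged sketch of the counting behavior described above, not server code: `delivery` and `simulate` are hypothetical names, and redeliveries are modeled as happening back to back, whereas a real server interleaves them with other messages.

```go
package main

import "fmt"

// delivery records one delivery attempt as seen by the consumer.
type delivery struct {
	ConsumerSeq uint64 // increments on every delivery attempt
	StreamSeq   uint64 // fixed position of the message in the stream
}

// simulate delivers stream messages 1..n. attempts[seq] gives how many times
// message seq is attempted; messages not listed are attempted once.
func simulate(n uint64, attempts map[uint64]int) []delivery {
	var out []delivery
	var consumerSeq uint64
	for streamSeq := uint64(1); streamSeq <= n; streamSeq++ {
		tries := attempts[streamSeq]
		if tries < 1 {
			tries = 1
		}
		for i := 0; i < tries; i++ {
			consumerSeq++
			out = append(out, delivery{ConsumerSeq: consumerSeq, StreamSeq: streamSeq})
		}
	}
	return out
}

func main() {
	// Message 2 is attempted three times, e.g. because two acks were missed.
	attemptLog := simulate(3, map[uint64]int{2: 3})
	last := attemptLog[len(attemptLog)-1]
	fmt.Printf("consumer=%d stream=%d\n", last.ConsumerSeq, last.StreamSeq)
	// prints consumer=5 stream=3
}
```

Two redeliveries out of three messages already push the consumer sequence past the stream sequence; a consumer with frequent retries will show a much larger gap, as in the output above.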

The subscriber can get this information along with the stable stream sequence for the message by accessing the message metadata. In Go it looks like this:

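// Assumes sub is a *nats.Subscription and that fmt, log, and time are imported.
msg, err := sub.NextMsg(time.Second)
if err != nil {
	log.Fatal(err)
}
// Metadata returns an error if this is not a JetStream message.
md, err := msg.Metadata()
if err != nil {
	log.Fatal(err)
}
// Consumer sequence: increments on every delivery attempt.
fmt.Println(md.Sequence.Consumer)
// Stream sequence: the message's stable position in the stream.
fmt.Println(md.Sequence.Stream)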