What is the Outbox Model?

Nostr is a mess. It always has been and will always be. That's part of the appeal! But it's important that users be able to navigate the rolling seas of this highly partition-tolerant network of kaleidoscopically-interwoven people, bots, topics, relays, clients, events, recommendations, lists, feeds, micro-apps, macro-apps, Chinese spam, and "GM"s.

In order to do this, users must be able to articulate "what" they are looking for, and clients must be able to articulate "how" to find that thing. This "how" is divided into two parts: building a request that will match the desired content (very easy), and selecting a relay that is able to serve that content to the user requesting it (very very hard).

Why guessing isn't good enough

As a concrete example, let's say the user wants to find everyone in their "network" who is using a particular topic. The process would look something like this:

  1. The user clicks the "network" tab and types in the topic they want to browse. This is the "what".
  2. The client then translates the term "network" to a list of public keys using whatever definition they prefer (Follows? WoT? Grapevine?), and builds a filter that might look something like this: [{"authors": pubkeys, "#t": ["mytopic"]}]. Any relay will happily accept, understand, and respond to that filter.
  3. The client then has to decide which relays it should send that filter to. This is the ??? stage of the outbox model, which immediately precedes:
  4. Profit
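To make step 2 concrete, here is a minimal sketch of how a client might wrap that filter in a NIP 01 `REQ` message before sending it over a websocket. The function name, the subscription id, and the shortened pubkeys are made up for illustration.

```python
import json

def build_req(subscription_id, pubkeys, topic):
    """Return the JSON text of a REQ message for notes by `pubkeys` tagged with `topic`."""
    nostr_filter = {"authors": list(pubkeys), "#t": [topic]}
    # A REQ message is a JSON array: ["REQ", <subscription id>, <filter>, ...]
    return json.dumps(["REQ", subscription_id, nostr_filter])

# Hypothetical, shortened pubkeys; real ones are 64-character hex strings.
message = build_req("network-topic", ["a1b2", "c3d4"], "mytopic")
print(message)
```

Building the message is the easy part. Nothing in it says which relay it should be sent to, which is exactly the gap step 3 has to fill.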

It may not be immediately obvious why selecting the correct relays might be difficult. Most people post to relay.damus.io, and most people read from relay.damus.io, so in most cases you should be good, right?

This approach to relay selection has historically worked "well enough", but it depends on a flawed definition of success. If you only want to find 90% of the content that matches your query, using the top 10 relays will suffice. But nostr is intended to be censorship-resistant. What if those 10 hubs have banned a particular public key? Nostr clients should (at least in theory) be 100% successful in retrieving requested content. Even if someone only posts to their self-hosted relay, you should be able to find their notes if their account is set up properly.

A naive solution to fixing the FOMO

A 90% hit rate results in a feeling of flakiness, even if users aren't completely aware of what isn't working. Feeds will be incomplete, quoted notes will be missing, replies will be orphaned, user profiles won't load. The natural response to the FOMO this creates is for users to "try harder" by adding more relays.

On the read side, this means clients open more connections, resulting in much higher data transfer requirements, with massively diminishing returns, since there's no reason to expect that a randomly chosen relay will have a substantially different data set.

On the publish side, this means that clients end up publishing more copies of their data to more relays. This approach has been automated in the past by services like Blastr, which don't store a copy of events published to the relay, but instead forward events to the top 300 relays in the network. This results in a two-orders-of-magnitude increase in storage required, and only makes the read side of the problem worse, since it reduces the uniqueness of the data set each relay stores. This in turn means that more duplicates are retrieved when querying relays.

Both halves of this approach are equivalent to guessing. On the read side, users are guessing which relays will have any arbitrary content they might ask for in the future. On the write side, users are guessing which relays other people might use to find their notes. It is a brute-force method for finding content.

Randomness results in centralization

In theory, random relay selection would result in a perfect distribution of content across all relays in the network. But in practice, this method of selection isn't random at all, but is strongly influenced by user bias in what constitutes a "good" relay. While some users may check nostr.watch for ping times, geographical proximity, or uptime, most will choose relays based on familiar names or other people's recommendations.

In either case, these biases are entirely orthogonal to achieving a higher content retrieval hit rate, except when bias in relay selection results in clustering — i.e., centralization. In other words, the kind of randomness exhibited by users when selecting relays actually results in pretty much everyone picking the same few relays. We see this same effect when people try to come up with passwords or seed phrases — human-provided randomness is anything but random.

Clustering improves the hit rate when requesting events (slightly), but it results in nearly as much centralization as if only a single relay were used, along with a lot more duplicate events.

Something (anything) other than randomness

In early 2023, Mike Dilger introduced NIP 65 (now known as the "Outbox Model") with a problem statement in the spirit of the original description of nostr: "Nostr should scale better. People should be able to find what they want."

Historical note: NIP 65 was formerly known as the "Gossip Model", derived from the name of Mike's desktop nostr client, called "Gossip". This unfortunately created a lot of confusion, since gossip protocols work very differently from how nostr tends to work, hence the re-brand.

Before NIP 65, an informal standard existed in which kind 3 user contact lists also included a list of relays that clients could use as something similar to Mastodon's "home servers". This list included the option to only read from or only write to a given relay. Unfortunately, it wasn't really clear what the semantics of this relay list were, so different clients handled them differently (and many clients ignored them). Usually this amounted to user-provided static relay configurations, which resulted in the naive relay selection approach described above.

NIP 65 used a very similar format (a list of relay urls with optional "read" or "write" directives), but with a very important semantic difference: relays listed in a user's kind 10002 were intended to "advertise to others, not for configuring one's client." In other words, these relay selections were intended as a signal to other users that they should use certain relays when attempting to communicate with the author of the relay list.
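As a rough sketch of that format, a kind 10002 event carries one "r" tag per relay, with an optional "read" or "write" marker; a tag with no marker counts as both. The function name and the example.com relay URLs below are placeholders, and treating unrecognized markers as "ignore" is an assumption of this sketch.

```python
def parse_relay_list(event):
    """Split a kind 10002 event's "r" tags into (read_relays, write_relays)."""
    read_relays, write_relays = [], []
    for tag in event.get("tags", []):
        if len(tag) < 2 or tag[0] != "r":
            continue
        url = tag[1]
        marker = tag[2] if len(tag) > 2 else None
        # No marker means the relay is advertised for both reading and writing.
        if marker in (None, "read"):
            read_relays.append(url)
        if marker in (None, "write"):
            write_relays.append(url)
    return read_relays, write_relays

example_event = {
    "kind": 10002,
    "tags": [
        ["r", "wss://both.example.com"],
        ["r", "wss://outbox.example.com", "write"],
        ["r", "wss://inbox.example.com", "read"],
    ],
    "content": "",
}
read_relays, write_relays = parse_relay_list(example_event)
print(read_relays, write_relays)
```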

I highly recommend reading the entire NIP, which is very short and easy to read. The mechanics of the spec are very simple:

When seeking events from a user, Clients SHOULD use the WRITE relays of the user's kind:10002.

When seeking events about a user, where the user was tagged, Clients SHOULD use the READ relays of the user's kind:10002.

When broadcasting an event, Clients SHOULD:

  • Broadcast the event to the WRITE relays of the author
  • Broadcast the event to all READ relays of each tagged user

For the first time, we had a way to differentiate relays in terms of what content could be found where.

When looking for a note by a particular user, a client could now look up the author's write relays according to their kind 10002 event, and send its query there. The result is a much higher hit rate with much lower data transfer requirements, and fewer connections per query.
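The three rules quoted above can be sketched in a few lines. This is only an illustration: it assumes the relay lists have already been fetched and parsed into plain Python lists, and the function names and example.com URLs are invented.

```python
from collections import defaultdict

def plan_requests(write_relays_by_author, topic):
    """Read side: map each relay URL to the filter that should be sent to it."""
    authors_by_relay = defaultdict(list)
    for author, relays in write_relays_by_author.items():
        for relay in relays:
            authors_by_relay[relay].append(author)
    # Each relay is only asked about the authors who say they write there.
    return {
        relay: {"authors": authors, "#t": [topic]}
        for relay, authors in authors_by_relay.items()
    }

def broadcast_targets(author_write_relays, tagged_users_read_relays):
    """Publish side: the author's WRITE relays plus each tagged user's READ relays."""
    targets = list(author_write_relays)
    for relays in tagged_users_read_relays:
        targets.extend(relays)
    return list(dict.fromkeys(targets))  # drop duplicates, keep first-seen order

plan = plan_requests(
    {"alice": ["wss://a.example.com", "wss://b.example.com"], "bob": ["wss://b.example.com"]},
    "mytopic",
)
targets = broadcast_targets(
    ["wss://a.example.com", "wss://b.example.com"],
    [["wss://b.example.com", "wss://c.example.com"]],
)
```

Instead of one filter sent everywhere, the client ends up with one narrower filter per relay, which is where the savings in data transfer come from.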

Making Outbox Work

There are of course some assumptions required to make this work.

First, the user must know which author they're looking for. This isn't always true when looking up a quote or parent note, but context and pubkey hints solve this difficulty in most cases.

The author must also publish a kind 10002 event. This may not always be the case, but clients should prompt users to set up their relay list correctly. This isn't really a flaw in the Outbox Model, just in implementations of it.

Additionally, the user's client must be able to find the author's kind 10002 event. This is the "bootstrapping" phase of the Outbox Model, during which the mechanisms the system provides for finding events aren't available. This requires us to fall back to randomly guessing which relays have the content we're looking for, which as we saw above doesn't work very well.

Other than guessing, there are a few different ways a client might find the relay selection event in question, each of which is applicable in different circumstances. In most cases, using one of a handful of indexer relays like purplepag.es or relay.nostr.band is a simple and efficient way to find user profiles and relay selections.
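A bootstrapping request to indexers might look roughly like the sketch below. The `wss://` URLs are assumed from the relay names mentioned above, and the function names are hypothetical. Because kind 10002 is a replaceable event, only the newest one per pubkey matters, so the second helper keeps just that.

```python
import json

# Assumed websocket URLs for the indexers named in the text.
INDEXER_RELAYS = ["wss://purplepag.es", "wss://relay.nostr.band"]

def relay_list_requests(pubkeys, indexers=INDEXER_RELAYS):
    """Return {indexer URL: REQ message text} asking for kind 10002 events by `pubkeys`."""
    nostr_filter = {"kinds": [10002], "authors": list(pubkeys)}
    message = json.dumps(["REQ", "relay-lists", nostr_filter])
    return {url: message for url in indexers}

def latest_relay_lists(events):
    """Keep only the most recent kind 10002 event per pubkey."""
    newest = {}
    for event in events:
        if event.get("kind") != 10002:
            continue
        current = newest.get(event["pubkey"])
        if current is None or event["created_at"] > current["created_at"]:
            newest[event["pubkey"]] = event
    return newest

requests = relay_list_requests(["a1b2", "c3d4"])
```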

However, if an author's content has been aggressively purged from these indexers due to censorship, they obviously can't be relied upon. Even though the author in question hasn't been deplatformed from nostr itself (since he can always self-host a publicly accessible relay to store his content), he has been effectively shadow-banned.

To get around this, relay selections have to be communicated in some other way. Nostr has a few different mechanisms for this:

  • If the author's NIP 05 address is known and properly configured (it may not be), clients can look up the author's NIP 05 endpoint to find some reasonable relay hints. Unfortunately, these are often neglected, and usually custodial, so they can run into the same problems.
  • If the author's pubkey is found in another signed event found on nostr, relay hints can be a way to propagate relay selections through the network. This relies on implementations picking reliable relay hints, which can be difficult, and hints do tend to become less reliable over time. However, this strategy is very effective in resisting censorship because it makes banning viral: if a relay wants to completely purge a particular pubkey from their database, they have to purge every event that references it, since events are tamper-proof.
  • In extremis, relay recommendations can always be communicated out-of-band. This can be done using manual input, QR codes, DHTs, jsonl torrents full of kind 10002 events, or any other mechanism client developers choose to resort to.
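The first two mechanisms can be sketched as follows, assuming the NIP 05 response (the JSON served from a domain's `/.well-known/nostr.json`) and some events have already been fetched. The function names, shortened pubkeys, and example.com URLs are placeholders, and the hints returned are only candidates to try, not verified relay selections.

```python
from collections import defaultdict

def hints_from_nip05(response, pubkey):
    """Read the optional "relays" map of a NIP 05 response for one pubkey."""
    return list(response.get("relays", {}).get(pubkey, []))

def hints_from_p_tags(events):
    """Collect relay hints found in the third position of "p" tags, keyed by pubkey."""
    hints = defaultdict(list)
    for event in events:
        for tag in event.get("tags", []):
            # A "p" tag looks like ["p", <pubkey>, <optional relay hint>, ...]
            if len(tag) >= 3 and tag[0] == "p" and tag[2]:
                if tag[2] not in hints[tag[1]]:
                    hints[tag[1]].append(tag[2])
    return dict(hints)

nip05_response = {"names": {"bob": "b0b"}, "relays": {"b0b": ["wss://relay.example.com"]}}
events = [
    {"tags": [["p", "b0b", "wss://one.example.com"], ["p", "a11ce"]]},
    {"tags": [["p", "b0b", "wss://one.example.com"], ["p", "b0b", "wss://two.example.com"]]},
]
print(hints_from_nip05(nip05_response, "b0b"))
print(hints_from_p_tags(events))
```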

Another, more technical assumption is that any given query can be fulfilled by few enough relays that a client can actually make all the connections needed, without running into resource limits. If you're trying to request content from 10,000 users across 1,000 relays, you're going to have a bad time. This was pointed out to me by Mazin of nostr.wine. He makes a good point, and it's definitely something to keep in mind. There are some mitigating factors though.

The first is that the current topology of the network probably won't persist forever. Because nostr is largely populated by self-hosting enthusiasts, the number of "tiny" relays is proportionally much higher than it will be if adoption picks up, even if the total number of relays grows. The trajectory is that nostr will drift toward fewer, larger relays, reducing the number of connections needed to fulfill any given query.

This is "centralizing", but it's important to understand that this isn't necessarily a bad thing. As long as there are more than one or two large hubs, there is user choice. And as long as it's possible to run a new relay, there is always an escape hatch. Nostr, like bitcoin, has no hard dependency on the biggest player in the network.

The other thing to consider is that there are lots of other techniques we can use to overcome the limitations of the lowest common denominator (mobile browser clients), including self-hosted or third-party relay proxies. The trade-off here is that a little trust (aka centralization) can go a long way to reducing resource requirements needed to fulfill queries using the Outbox Model.

If you're interested in more details on this topic, see this blog post.
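As one illustration of working within a connection limit (an assumption of this sketch, not something NIP 65 prescribes), a client could greedily pick the relays that cover the most not-yet-covered authors until it runs out of budget. The function name and relay URLs are hypothetical.

```python
def pick_relays(write_relays_by_author, max_connections):
    """Greedily choose up to `max_connections` relays covering as many authors as possible.

    Returns (chosen relay URLs, authors left uncovered).
    """
    authors_by_relay = {}
    for author, relays in write_relays_by_author.items():
        for relay in relays:
            authors_by_relay.setdefault(relay, set()).add(author)

    uncovered = set(write_relays_by_author)
    chosen = []
    while uncovered and len(chosen) < max_connections:
        candidates = sorted(authors_by_relay)  # sorted so that ties break predictably
        if not candidates:
            break
        best = max(candidates, key=lambda relay: len(authors_by_relay[relay] & uncovered))
        gained = authors_by_relay[best] & uncovered
        if not gained:
            break  # no remaining relay helps with the authors still uncovered
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered

relay_lists = {
    "alice": ["wss://hub.example.com", "wss://alice.example.com"],
    "bob": ["wss://hub.example.com"],
    "carol": ["wss://hub.example.com", "wss://carol.example.com"],
    "dave": ["wss://dave.example.com"],
}
chosen, uncovered = pick_relays(relay_lists, max_connections=2)
```

With a budget of two connections, the hub covers three authors and Dave's self-hosted relay covers the fourth. With a budget of one, Dave is left out, which is the kind of trade-off a real implementation has to surface or work around.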

That was a long digression, but there is one other thing that the Outbox Model assumes to be the case. Even if the correct relays are found and connected to, they still may not return all desired content, either because they don't have it, or because they refuse to return it to the user requesting it.

This can happen if the publishing client isn't following the Outbox Model, if the author has migrated from one relay set to another without copying their notes over, or if the relay in question chose not to retain the author's content for some reason.

The first two issues can be fixed by improving implementations, but the question of policy is a little more interesting.

Relativistic relays

The Outbox Model is a mechanical process; it's only as useful as user relay selections are. In order for it to work, users have to be able to make intelligent relay selections.

Every relay has trade-offs, depending on its policy. 140.f7z.io would not be useful for long-form content, for example. Some relays might have a content retention policy that changes depending on whether you're a paying user. If you don't pay, you might find out too late that your content has been deleted from the relay.

So what makes a relay "good" for a particular use case? Well, it's complicated. Here are a few factors that go into that calculus:

  • Is the relay in the same geographical region as the user? Proximity reduces latency, but jurisdictional arbitrage might be desired. Users should probably have a variety of relays that fit different profiles.
  • Will the relay ban the user? Do the operators have a history of good behavior? Is the relay focused on particular types of content? Is the relay's focus consistent with the user's goal in adding that relay to their list?
  • What are the relay's retention policies? A user might want to set up an archival relay for her old content, or a multi-availability-zone relay so her notes are immediately accessible to the rest of the network.
  • Does the relay require payment? Paid relays are more aligned with their users, but obviously come at a financial cost.
  • Does the relay have policies for read-protecting content? If so, other users might not be able to find your posts published to that relay. On the other hand, some relays are configured to work as inboxes for direct messages, which can help preserve privacy.
  • Does the relay request that users authenticate? Authentication can help manage spam, but it also allows relays to correlate content requests with users, reducing user privacy.
  • Is the relay you use hosted by your client's developer? If so, you're in danger of getting banned from your client and your relay at the same time.
  • Is the relay a hub? Using hubs can help smooth out rough areas in Outbox Model implementations, at the cost of centralization.
  • Is the relay used by anyone else? One-off relays can be useful for archival purposes, but often won't be used by clients following the Outbox Model, depending on how they optimize requests.

There are lots of ways to approach the problem of helping users select relays, but it's an inherently complex problem which very few people will have the patience to properly address on their own. Relay selection is a multi-dimensional problem, and requires satisfying multiple constraints with a limited number of relay selections.

In the future, special-purpose clients might be used to help people build relay sets. Clients also might provide curated "relay kits" that users can choose and customize. Or, we might see an increase in hybrid solutions, like smarter relay proxies or client-local relays that synchronize using other protocols or platforms.

The Limitations of Outbox

Outbox is not a complete solution, not because of any of the caveats listed above, but because NIP 65 per se only addresses the question of how to index content by pubkey in a broadcast social media context. But there are many other scenarios for relay selection that Outbox does not solve:

  • Community, chat, and group posts might be best posted to relays dedicated to that context.
  • Direct messages shouldn't follow the same contours as public social media content.
  • Topic-oriented relays, or relays serving a custom feed, might be useful independent of who uses them.
  • Relays focused on serving a particular kind of event, like music, long-form content, or relay selections, are useful independent of who reads from or writes to them.
  • Certain clients might need to fulfill particular use cases by using relays that support certain protocol features, like search, count, or sync commands.
  • Some events might not make sense to publish to relays, but should instead be shared only directly, out of band.
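For the protocol-feature case, one possible approach is to check the `supported_nips` list in each relay's NIP 11 information document. The sketch below assumes those documents have already been fetched and uses NIP 50 for search and NIP 45 for count; the function name and relay URLs are made up, and a relay's self-reported list is a hint rather than a guarantee.

```python
# Feature name -> NIP number that a relay would list in "supported_nips".
FEATURE_NIPS = {"search": 50, "count": 45}

def relays_supporting(info_by_relay, feature):
    """Return the relay URLs whose information document claims support for `feature`."""
    nip = FEATURE_NIPS[feature]
    return sorted(
        url
        for url, info in info_by_relay.items()
        if nip in info.get("supported_nips", [])
    )

info_by_relay = {
    "wss://search.example.com": {"supported_nips": [1, 11, 50]},
    "wss://plain.example.com": {"supported_nips": [1, 11]},
    "wss://nodoc.example.com": {},
}
search_relays = relays_supporting(info_by_relay, "search")
```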

Some of these use cases might be solved by new specifications similar to Outbox that prescribe where certain data belongs — for example, NIP 17 requires users to publish a different relay list before they can receive direct messages, while NIP 72 places community relay recommendations directly into the group's metadata object. A reasonably complete list of different relay types can be found in this PR, very few of which have a canonical way to manage selections.
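As a sketch of the NIP 17 case: as far as I understand the spec, the separate list is a kind 10050 event whose "relay" tags name the relays where the user wants to receive direct messages. The function name and URL below are placeholders.

```python
def dm_relays(event):
    """Return the relay URLs advertised for direct messages in a kind 10050 event."""
    if event.get("kind") != 10050:
        return []
    return [
        tag[1]
        for tag in event.get("tags", [])
        if len(tag) >= 2 and tag[0] == "relay"
    ]

dm_list = {"kind": 10050, "tags": [["relay", "wss://inbox.example.com"]], "content": ""}
print(dm_relays(dm_list))
```

An empty result is meaningful here: without a published list, a sender has no advertised place to deliver a message.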

Other use cases might be supported more informally, either by relays advertising their own value proposition, or via third-party NIP 66 metadata. Still others might be supported by scoping the network down to only certain relays through explicit relay selection — this is how white-labeled Coracle instances work.

The basic idea here is that there are categories of events that don't have anything to do with where a particular person puts his or her "tweets". For every "what" on nostr, there should be a "how".

Keep nostr weird

Whatever additional systems we end up adopting for helping with relay selection, one thing is certain — people will continue to discover new, creative uses for relays, and we will always be playing catch up. This is one of the coolest things about nostr!

But it does mean that users will have to adapt their expectations to a network that partitions, re-configures, and evolves over time. Nostr is not a "worse" experience than legacy social media, but it is a version of social media that has itself been set free from the stagnant walled-garden model. Nostr is in many ways a living organism — we should be careful not to impose our expectations prematurely, leaving room to discover what this thing actually is, or can be.

If you enjoyed this post but want more, take a look at the talk I gave at Nostrasia last year. I also wrote up a blog post at about the same time that addresses some of the same issues, but focuses more on privacy concerns around relays and nostr groups. Finally, I recently wrote this comment, which includes some details about challenges I've faced putting Outbox into Coracle.