Making Outbox Work

The last few days on developer nostr have involved quite a kerfluffle over the outbox model, blastr, banning jack, and many related misunderstandings. This post is an attempt to lay out my thoughts on the matter in an organized and hopefully helpful way.

What's wrong with outbox?

It all started with a post from jack asking why more devs haven't implemented the outbox model. There are many answers to this question, not least having to do with there being two standards for user relay selections, and ongoing changes to NIP 65. But I don't want to talk about compatibility here.

nostr:nevent1qydhwumn8ghj7argv4nx7un9wd6zumn0wd68yvfwvdhk6tcprfmhxue69uhhq7tjv9kkjepwve5kzar2v9nzucm0d5hszymhwden5te0wfjkccte9enrw73wd9hj7qpq2uf488j3uy084kpsn594xcef9g9x3lplx4xnglf0xwghyw2n3tfqqnrm02

Mazin responded with some numbers which estimate how many connections the outbox model requires. Too many connections can become expensive for low-power clients like mobile phones, not to mention some privacy issues stemming from nosy relays.

nevent1qythwumn8ghj76twvfhhstnwdaehgu3wwa5kuef0qyv8wumn8ghj7cm9d3kxzu3wdehhxarj9emkjmn99uq3samnwvaz7tmrwfjkzarj9ehx7um5wgh8w6twv5hsqgrn7l6zj7ht6ruyk76vvvtkfs4xrhyzc3tm64l3eyfvd40y26sz0gshmunh

I have some minor disagreements with Mazin's numbers, but I basically agree with his point — a purist outbox model, where a large proportion of nostr users run their own relays results in a high number of connections to different relays. I brought this question up late last year in my interview with Mike Dilger and in a conversation with fiatjaf, who convinced me that in practice, this doesn't matter — enough people will use a handful of larger hubs that there will be a good amount of overlap in relay selections between most pubkeys.

To articulate this more clearly: the goal is not "personal web nodes", which is a pipe dream the Farcasters and BlueSkys (BlueSkies?) of the world aim at, but a more pragmatic mix between large hubs and smaller purpose-built relays. These small relays might be outlets for large publishers, small groups, or nerds who also run their own SMTP servers and lightning nodes.

The point of the outbox model is that these small nodes be possible to run, and discoverable from the rest of the network so that we can preserve the censorship-resistant qualities of nostr that brought us here in the first place.

Blast It!

It's no secret that I've long been a critic of Mutiny's blastr relay implementation. My main objection is that the blastr approach doesn't account for the hard limits involved in scaling smaller relays. If the goal is to cross-pollinate notes across all relays in the network, all relays will require the same size database, and contain all notes in the network. This works right now (sort of), but as the network grows, the relays running on a $5 VPS are going to have their disks fill up and will inevitably fall over.

nevent1qyvhwumn8ghj76r0v3kxymmy9ehx7um5wgcjucm0d5hszxnhwden5te0wpuhyctdd9jzuenfv96x5ctx9e3k7mf0qythwumn8ghj7un9d3shjtnwdaehgu3wvfskuep0qqs07jr9qx49h53nhw76u7c3up2s72k7le2zj94h5fugmcgtyde4j9qfrnwxj

Not only that, but the content breakdown on any given relay by default becomes an undifferentiated soup of "GM", chinese notes, bots, bitcoin memes, and porn. Blastr makes it impossible to run an interesting relay without implementing write policies.

Which is actually fine! Because that's always been true — servers that allow anonymous uploads always get abused. Tony is just helpfully pointing out to us that this is no less true of nostr relays. I only wish he could have waited a little longer before mounting his attack on the network, because lots of hobbyists are interested in running interesting relays, but the tools don't yet exist to protect those servers from unsolicited notes.

One other note on blastr — Tony at one point described blastr as a relay proxy. This is an interesting perspective, which puts things in a different light. More on proxies later.

Ban Jack?

Here's a thought experiment: how might we actually "ban blastr"? @Pablof7z suggested to me in a conversation that you could configure your relay to check every note that gets published to your relay against the big nostr hubs, and if it exists on any of them to simply delete it. Of course, that would result in your relay being basically empty, and the hubs having all of your content. That's game theory for you I guess.

Another approach that was floated was to encourage users to only publish to small relays. In theory, this would force clients to implement outbox so users could still see the content they were subscribed to. Fiatjaf even posted two identical notes, one to his personal relay, and one to a hub to see which would get more engagement. The note posted to the mainstream relay got 10x more replies and likes than the more obscure note.

nostr:nevent1qyd8wumn8ghj7urewfsk66ty9enxjct5dfskvtnrdakj7qgmwaehxw309aex2mrp0yh8wetnw3jhymnzw33jucm0d5hszymhwden5te0wp6hyurvv4cxzeewv4ej7qpqdc2drrmdmlkcyna5kkcv8yls4f8zaj82jjl00xrh2tmmhw3ejsmsmp945r

Of course, this is thwarted by blastr, since blastr not only replicates notes posted to it, it also actively crawls the network as well. So the next logical step in this train of thought would be for hubs to encourage people to use small relays by actively blocking high-profile accounts.

nostr:nevent1qydhwumn8ghj7argv4nx7un9wd6zumn0wd68yvfwvdhk6tcpzdmhxue69uhhyetvv9ujue3h0ghxjme0qyd8wumn8ghj7urewfsk66ty9enxjct5dfskvtnrdakj7qpqpjhnn69lej55kde9l64jgmdkx2ngy2yk87trgjuzdte2skkwwnhqv5esfq

This would of course never happen (Damus is one client that hasn't implemented NIP 65, and they also run the biggest relay), but it was a fun thought experiment. At any rate, the silliness of the suggestion didn't stop certain people from getting offended that we would "disrupt the free market" by "forcing" our opinions on everyone else. Oh well.

Death to Blastr

In reality, even though blastr makes it a little harder to adopt outbox in the short term, its days are numbered. Eventually, relay operators will start to feel the pain of unsolicted notes, and will either shut their relays down or look for tools that will help them curate the content they host.

From my perspective, these tools take two forms — read protection and write protection. This is something I alluded to in my talk at Nostrasia last November.

Write protection is straightforward — already many relays have access control lists based on active subscriptions, invite codes, or just static whitelists that determine who is allowed to post to a given relay, or what event authors are represented there. This approach effectively prevents blastr from using relays as free storage, which is a huge improvement.

Read protection is more tricky, because anything publicly readable will be scraped by blastr and replicated to unauthenticated-write relays across the network. In most cases, this is ok, but there are use cases for relays to exist that host a unique collection of notes oriented around some organizing principle. Unfortunately, with blastr in action (or any scraper that might exist), the only way to do this is to actively protect proprietary content. There are a few approaches that can work to make this happen:

  • IP-based access control lists
  • AUTH-based access control lists
  • Stripping signatures when serving events
  • Storing and serving encrypted content

Each of these approaches has its own set of trade-offs. But depending on use case, any of them or a combination of them could work to allow relay operators to carve out their own piece of the nostr-verse. In fact, this is a big part of what Coracle is about — the white-labeled version of the product confines certain notes to proprietary relays, with optional encrypted group support.

Enough of my polemic against blastr. Let's talk about how to make the outbox model actually work.

Hints are pointless

Right now, clients that implement the outbox model rely pretty heavily on relay hints to find related notes — whether user profiles, reply parents, or community definitions. The problem with hints is that they are prone to link rot. Many of the relays that were set up a year ago when nostr took off are no longer online, and yet they persist in user relay lists, and in relay hints. These hints can't be updated — they are set in stone. What this means is that a different mechanism has to be used to find the notes the hints were supposed to help locate.

Because of this, I've come around to the position that hints are basically pointless. They are fine as a stopgap, and might be appropriate for certain obscure and ill-defined use cases where relay urls are the most durable address type available. But they provide basically no value in supporting the long-term robustness of the network.

What are durable, however, are pubkeys. Pubkeys are available pretty much everywhere, except in event id hints — and there is a proposal in the works to add a pubkey to those too. The cool thing about pubkeys as hints is that once you have a pubkey, all you need to do is find that person's kind 10002 inbox/outbox selections, and you should be able to find any note they have published.

This goes with the caveat that when users change their relay selections, or rotate their key, they (or their relays) should be sure to copy their notes to the new relay/pubkey.

The question then is: how do I find a given pubkey's relay selections?

There are already several mechanisms that make this reasonably easy. First of all, NIP 65 explicitly recommends publishing relay selections to a wide range of relays. This is a place where the blastr approach is appropriate. As a result, relay selections are usually available on the most popular public relays. Then there are special purpose relays like purplepag.es, which actively seek out these notes and index them.

These indexes are not confined to relays either. It would be trivial to create a DVM that you could ask for a pubkey's relay selections, optionally for a fee. Alex Gleason's proxy tag could also be used to indicate indexes that exist outside the nostr network — whether that be torrents, DHT keys, or what have you.

The best part is that this doesn't negatively impact the decentralization of the network because in principle these indexes are stateless — in other words, they're easily derived from the state of the public part of the nostr network.

Just do it for me

Looping back to where we started — the complexity and technical challenges of implementing the outbox model — there is a simple solution that many people have experimented with in different ways that could solve both issues at once: proxies.

As I mentioned above, Tony thinks of blastr as a proxy, and he's right. More specifically, it's a write-proxy. This is only part of its functionality (it also acts as an independent agent which crawls the network. EDIT: apparently this is not true!), but it is an essential part of how people use it.

Another kind of proxy is a read proxy. There are several implementations of these, including my own multiplextr proxy, which is outbox-compatible (although it requires a wrapper protocol for use). The advantage of a proxy like this is that it can reduce the number of connections a client has to open, and the number of duplicate events it has to download.

Proxies can do all kinds of fancy things in the background too, like managing the outbox model on behalf of the client, building an index of everything the user would be likely to ask for in advance to speed up response times, and more.

One interesting possibility is that a NIP 46 signer could double as a proxy, reducing the number of round trips needed. And since a signer already has access to your private key, this kind of proxy would not result in an escalation in permissions necessary for the proxy to work.

It's simple

The number of cool and creative solutions to the content replication and indexing problem is huge, and certainly doesn't end with blastr. Just to summarize the next steps I'm excited to see (to be honest, I want to build them myself, but we all know how that goes):

  • More clients supporting outbox
  • Outbox implementations maturing (Coracle's still has some issues that need to be worked out)
  • A shift from relying on relay hints to relying on pubkey hints + relay selection indexes of some kind
  • Proxy/signer combos which can take on some of the heavy lifting for clients of delivering events to the right inboxes, and pulling events from the right outboxes

Let's get building!