By an accident of history, I have been knee-deep for the last week or two in several NIPs related to encrypted DMs and group chat. The accident is my proposal for encrypted groups — which will help me turn Coracle into something approximating a replacement for Facebook, rather than Twitter.
In order to make that happen though, a few things need to happen first. I need a better way to encrypt messages without leaking metadata, and (un)fortunately I'm no cryptographer. It's well known that NIP 04 has a number of issues, and there have been several proposals to fix it since as early as October 2022. Building on that foundation would be a waste of time.
My purpose in this post is to outline the problems with NIP 04 as it stands, and sketch out for you some possible solutions applicable not only to DMs, but to other encrypted use cases.
NIP 04 Considered Harmful
It's pretty well-known that nostr DMs leak metadata. What this means is that when you send a DM to someone else, anyone can see:
- Who you are
- Who you sent a message to
- When you sent the message
- How big the message was
And, of course, vice versa. Now, I personally don't take the cyanide pill on this like some do. DMs in centralized social platforms are far less private even than this, since the platform can read them anytime they want, share them with law enforcement, and potentially leak them to the entire world. It's my opinion that if applications were designed with the above properties in mind, it would be possible to gamify DMs, making metadata leakage a feature rather than a bug. Of course, no one has designed such a thing, so maybe it's a moot point.
The much bigger problem, as I learned last week, has to do with the cryptography used in NIP 04 (and several other places, including private mute lists and app-specific data). Here's a quote from Paul Miller's very helpful explanation:
- Unhashed ECDH exposes some curves to Cheon's attack 1, which allows an attacker to learn private key structure when doing continuous diffie-hellman operations. Hashing converts decisional oracle into a computational oracle, which is harder.
- CBC IVs should not only be unrepeatable, they must be unpredictable. Predictable IV leads to Chosen Plaintext Attack. There are real exploits.
- CBC with reused key has 96-bit nonces and loses 6 bits of security for every magnitude. 10**4 (10K) messages would bring archive security of the scheme to 72 bits. 72 bits is very low. For example, BTC has been doing 2^67 hashes/sec as per early 2023. Advanced adversaries have a lot of resources to break the scheme
- CBC has continuously been exploited by Padding Oracle attack. We can't expect all implementations to have proper padding check
- AES was modeled as ideal cipher. Ideal ciphers don't care about non-random keys. However, AES is not an ideal cipher. Having a non-IND key, as in NIP4, exposes it to all kinds of unknown attacks, and reducing security of overall scheme
There is a lot going on here, very little of which I (and likely you) understand. But the really horrifying takeaway for me is in items 1 and 3. If you read it carefully, you'll notice that the more times you use your private key to encrypt data using NIP 04's encryption scheme, the easier it is for an attacker to brute-force your key.
I want to be careful not to overstate this — I don't think this means that everyone's private key is compromised; it would require many thousands of signatures to meaningfully reduce the security of your key. But it does mean that NIP 04's days are numbered — and if we don't replace it, we have a real existential threat to the entire protocol on our hands.
So much for motivation to put NIP 04 to rest. Where do we go from here?
The lay of the land
There are a few different components to making private messages work (whether we're talking about DMs or encrypted groups).
- Encryption scheme
- Explicit metadata hiding
- Implicit metadata hiding
- Transport and deliverability
- Identity and key exchange
Each of these components needs to be solved in order to create a robust messaging system on nostr.
Encryption scheme
As mentioned above, NIP 04 is borked. Luckily, there are a few folks in the nostr dev community who know a thing or two about encryption, most notably Paul Miller. Way back in May, Paul proposed a new encryption scheme based on his implementation of XChaCha20, which is the same algorithm used by SMP (more on that below).
To move the proposal forward, I created a different PR with some edits and adjustments, which is very near to being merged. The implementation already exists in nostr-tools, and there are PRs out for nos2x and Alby as well.
Explicit metadata hiding
As mentioned above, normal nostr messages share lots of information publicly, even if the content is encrypted, including:
- Kind
- Created timestamp
- Tags (topics, mentions, recipients, etc)
- Message size
- Author
Private messages have to hide this information, at the very minimum.
Implicit metadata hiding
The above level of security should be enough for most use cases. But it definitely doesn't get us to total secrecy yet. Some additional metadata includes:
- Client fingerprinting
- Relay selection for publishing and queries
- IP address collection
- First seen timestamp
- Identification of users by AUTH
- Correlation of the event with other messages sent/received during a session
These issues are much harder to solve, because they are part of the process of delivering the message, rather than just constructing it. These issues can be mitigated to some extent by using TOR, proxying connections with relays, using short-lived connections, randomizing publish time, and careful selection of trusted relays. But all of these are only techniques to reduce exposure, and require a prescriptive protocol to wrap them all up into a cohesive whole.
Transport and deliverability
In my view, the key feature of nostr that makes it work is relays. Relays provide storage redundancy and heterogeneity of purpose with a common interface shared by all instances. This is great for censorship resistance, but everything comes with a tradeoff.
Publishing notes is sort of like hiding easter eggs. If you want a kid to be able to find an egg, you have to put it somewhere they'll look for it. Putting an egg on the roof does not constitute "hiding" it, at least in a way consistent with the spirit of the game.
Historically, notes have been broadcast to whatever relays the user or client selects, without regard with someone else's preferences, resulting in apparently missing notes when author and recipient don't share a relay.
This problem has been partly solved by NIP 65, which encourages clients to publish notes where the recipient will look for them. There is some guesswork involved in selecting good relays, but the process is fairly straightforward for direct messages and group chats.
The situation can be improved further by DM-specific signaling, like what @fiatjaf has suggested as an addition to NIP 65. A user could then run his own relay that accepts messages from anyone, but only serves messages to himself. This would be the only relay well-behaved clients would ever write encrypted messages to, improving both privacy and deliverability.
Variants of this could also work for encrypted groups. For small groups, the same relay hints could be used as with DMs, and for larger groups with key sharing the admin could set up a dedicated relay for the group which members would write to.
This also creates a potential affordance for querying of encrypted events. If a relay is highly trusted, it could be added as a member to an encrypted group, gaining the ability to decrypt events. It could then receive REQ messages for those events, filter based on the encrypted content, and serve the encrypted version. This could greatly improve performance for larger groups where downloading all events isn't feasible.
Identity and key exchange
One dimension of privacy that deserves its own section is identity and key exchange. In nostr, this is simpler than in other protocols because in nostr your pubkey is your username. Arguably, this is a flaw that will need to be solved at some point, but for now it means that we get to skip a lot of the complexity involved with binding a name to a key.
However, the problem is still relevant, because one effective way to improve privacy is through key exchange. In the case of DMs, it would be nice to be able to request an alias key from someone you want to communicate with in order to hide the only piece of metadata that has to remain on a message in order to deliver it, that is, the recipient.
There's a lot of good stuff about this problem in the Messaging Layer Security protocol. The author of 0xChat has also put together a draft spec for executing simple key exchange, albeit with less information hiding than in the MLS protocol.
Key exchange between individual users can be considered an optional privacy enhancement, but is required for group chat based on a shared key. In either case, exchanging keys is fairly straightforward — the real challenge is achieving forward secrecy.
By default, if at any point a user's private key is compromised, any messages that were retained by an attacker can then be decrypted. This is a major vulnerability, and solving it requires key exchange messages to be reliably discarded.
What's cooking
So that's an overview of the problems that need to be solved for private DMs and group chats. Not all of these problems will be solved right away, just because the bar for creating a decentralized encrypted chat system on par (or better than) Signal is the kind of thing you get venture capital for.
So the question becomes, should we build a sub-par messaging system that is tightly integrated with nostr, or refer users to a better option for encrypted communication?
Well, why did Twitter create its own DM solution rather than integrating email? Was it to harvest user data, or to provide additional utility to their users by allowing the social graph to inform messaging? Native features simply create a better UX.
On the other hand, in Paul Miller's words:
I would like to remind that nostr DMs, if nostr gets a traction, would be used by all kinds of people, including dissidents in dangerous places. Would you be willing to risk someone getting killed just because you want to keep backwards compatibility with bad encryption?
There is no easy answer, but it's clear that as we continue to work on this problem it is very important that applications communicate clearly the privacy implications of all "private" features built on nostr.
So, assuming we can pull off an appropriate balance between privacy and convenience, let me outline three new developments in this area that I'm excited about.
Gift wrap
Way back in April, Kieran proposed a dumb, but effective, way of hiding metadata on DMs. The technique involved simply encrypting a regular nostr event and putting that in another event's content
field. The neat thing about this is that not only does this hide some of the most revealing event metadata, it's not tied to a particular event kind — anything can be wrapped.
What this means is that we can build entire "nostr within nostr" sub-networks based on encryption — some between two users, others based on a shared group key. This is much more powerful than a messaging-specific encryption schema. Now we can make private reactions, private calendars, private client recommendations, private reviews. I find this very exciting!
This proposal was further developed by Vitor — the current version can be found here. The latest version introduces double-wrapping, which can help prevent the wrapped event from being leaked by stripping its signature — that "draft" event is encrypted, wrapped, and signed to verify authorship without revealing contents, and then that wrapper is in turn wrapped and signed with a burner key to avoid revealing the author.
Wrapping also makes it possible to randomize the created timestamp to avoid timing analysis, as well as to add padding to a message, reducing the possibility of payload size analysis.
New DMs and small groups
The main subject of Vitor's proposal was a new way to do DMs and small groups using double-wrapping to hide metadata. This works basically the same way NIP 04 did — a shared key is derived and the message is encrypted for the recipient, whose pubkey is revealed publicly so that it can be delivered.
However, Vitor was able to extend this to multiple recipients simply by wrapping the message multiple times, with multiple shared secrets. With the addition of some metadata on the inner event, this makes it possible to define a group where the members know who each other member is, but no one else does.
Large groups
This proposal has one important limitation though — the number of events that need to be signed and broadcast is equal to number of messages * number of members
. This is fine with 5, 10, or even 100 members, but becomes infeasible when you reach groups of 1000+ members.
For that, we have two similar proposals, one by @simulx, and one by myself. The differences between the two are not important for this discussion — both use gift wrap as a way to exchange and rotate shared group keys, and both support a weak form of forward secrecy.
As a side note, this is the proposal I'm most excited about, and why I got into nostr in the first place. Private community groups which can support shared calendars, marketplaces, notes, etc., are what I'm most excited to see on nostr.
Shortcomings
The above proposals are great, but of course leave much to be desired. If these are built, nostr will be hugely improved in both security and functionality, with the following weaknesses:
- Deliverability is still an open question — are relay selections enough to ensure recipients get their messages, and no one else does?
- There is no way to create a group whose members cannot dox each other. You must always be able to trust the people you share a group with, and of course, the larger the group, the more people there are that can betray you.
- Content can always be shared outside a group, although in Vitor's small group proposal, the only way to verifiably leak messages would be to reveal your private key.
- Key rotation is not supported by the small groups proposal, but it may be possible to solve it for the entire protocol by building out a name resolution system for nostr.
- Forward secrecy for large groups is weak at best. A key rotation can be triggered at any time, including when a member leaves the group, but unless all relays and all members destroy the key rotation events, it may be possible to decrypt messages after the fact.
- Credential sharing still has to be bootstrapped, meaning at least one encrypted message for each participant may be publicly seen. Of course, credential sharing could also be done out of band, which would largely solve this problem.
What about SimpleX?
Lots of people, including Semisol and Will, have expressed interest in integrating nostr with SimpleX and using their cryptography and transport instead of nostr's.
The first question to answer though, is which protocol should we use? SimpleX is split into a few different pieces: there's the SimpleX Messaging Protocol, which describes setting up unidirectional channels, the SMP Agent Protocol which describes how to convert two unidirectional channels into a bidirectional channel, and the SimpleX Chat Protocol which describes a standard way to send chat messages across a bidirectional channel.
SimpleX Chat Protocol
So far, most proposals to fix messaging have focused simply on DMs, and sometimes on group chat. This is SimpleX's wheelhouse, so it could definitely make sense to communicate that kind of information over SimpleX's top-level chat protocol, and leave nostr out of it entirely.
The downside of this is that it requires clients to support two different protocols, and connect users to two categories of servers. This isn't a deal breaker, but it does increase the complexity involved in writing a full-featured nostr client, especially since there are no implementations of the SimpleX protocol other than (17k lines of) Haskell.
Using SimpleX's chat protocol would also limit the message types we'd be able to encrypt. Anything outside SimpleX's protocol (like zaps for example) would have to be sent using a custom namespace like nostr.kind.1234
, which wouldn't be recognized by normal SimpleX clients. There are also areas of overlap where both protocols describe how to do something, for example profile data, requiring clients to choose which kind to support, or send both.
Another wrinkle is that the SimpleX protocol is AGPL v3 licensed, meaning proprietary software would not be able to interoperate, and MIT licensed products would have to abide by their license's terms.
SMP and Agent Protocols
The lower-level SMP and Agent protocols are more promising however, since their scope is narrower, and doesn't overlap as much with nostr's own core competencies. Both are quite simple, and very prescriptive, which should make it easy to create an implementation.
SMP is actually quite similar to some of the proposals mentioned above. Some of the same primitives are used (including XChaCha20 for encryption), and SMP is transport agnostic meaning it could be implemented over websockets and supported by nostr relays. The network topology is basically the same as nostr, except SimpleX doesn't make any assumptions about which servers should be selected, so in that regard nostr actually has an advantage.
The main incompatibility is that SimpleX requires the use of ed25519 and ed448 EdDSA algorithms for signatures, which means nostr's native secp256k1 signature scheme wouldn't be compatible. But this could be ok, since keys shouldn't be re-used anyway, and there are other ways to share a nostr identity.
Forward secrecy?
Whenever a server is involved, several attack vectors inherently exist that would not in peer-to-peer systems. Servers can threaten user privacy by storing messages that should be deleted, analyzing user behavior to infer identity, and leaking messages that should otherwise be kept secret.
This has been used by some to suggest that nostr relays are not a good way to transport messages from one user to another. However, SMP has the same attack vector:
Simplex messaging server implementations MUST NOT create, store or send to any other servers:
- Logs of the client commands and transport connections in the production environment.
- History of deleted queues, retrieved or acknowledged messages (deleted queues MAY be stored temporarily as part of the queue persistence implementation).
- Snapshots of the database they use to store queues and messages (instead simplex messaging clients must manage redundancy by using more than one simplex messaging server). In-memory persistence is recommended.
- Any other information that may compromise privacy or forward secrecy of communication between clients using simplex messaging servers.
So whether SimpleX servers or nostr relays are used, privacy guarantees are severely weakened if the server is not trustworthy. To maintain complete privacy in either scheme, users have to deploy their own servers.
Relays to the rescue
Because nostr servers are currently more commonly self-hosted, and more configurable, this makes nostr the better candidate for channel hosting. Users can choose which relays they send messages to, and relay admins can tune relays to only accept messages from and serve messages to certain parties, delete messages after a certain time, and implement policies for what kinds of messages they decide to relay.
All that to say, SMP itself or a similar protocol could easily be built into nostr relays. Relays are not the wrong tool for the job, in fact they're exactly the right tool! This is great news for us, since it means SMP is applicable to our problem, and we can learn from it, even if we don't strictly implement the protocol.
Bootstrapping over nostr
One assumption SMP makes is that channels will be bootstrapped out of band, and provides no mechanism for an initial connection. This is to ensure that channels can't be correlated with user identities. Out of band coordination is definitely the gold standard here, but coordination could also be done over the public nostr network with minimal metadata leakage — only that "someone sent {pubkey} something".
Unidirectional channels
Unidirectional channels are a great primitive, which can be composed to solve various problems. Direct messages between two parties are an obvious use case, but the SimpleX Chat Protocol also includes an extension for groups, defined as a set of bi-directional SimpleX connections with other group members
.
This is similar to Vitor's small-group model, and likewise isn't conducive to supporting very large groups, since a copy of every message has to be sent to each group member.
However, it's possible to build large groups on top of unidirectional channels by sharing receiver information within the group and having all participants read from and write to a single channel. In fact, this is very similar to how our large group proposals work.
Conclusion
So, the takeaway here is that SimpleX's lower level protocols are great, and we should learn from them. It may be worth creating a NIP that describes the implementation of their unidirectional channels on nostr, and re-using that primitive in DM/group implementations, or simply following the same general process adapted to nostr.
I'm personally encouraged by reading SimpleX to see that we are headed in the right direction — although many of the details of our proposals have not been fully hammered out in the way SimpleX's has.
As for what the future holds, that's up to the builders. If you want to see tighter SimpleX integration, write a NIP and a reference implementation! I'll continue learning and fold in everything I can into my work going forward.