Aether grows past 500 concurrent nodes
It’s been a pretty fun week so far.
After I posted Aether for Redditors, Aether ended up on the front page of Hacker News again. (Link)
The drill is familiar by now. Just keep everything working, and it’ll pass in a couple of days. This time around, the main site got about 6,600 unique visitors, which is normal, since the HN link pointed to the blog and not the main site.
There’s one more benefit in these things: it pushes the system to the next order of magnitude of scale, and it gives you a glimpse of where the next set of scaling issues is going to come from. This is indeed what happened when Aether broke through 500 concurrent nodes online. If you’ve been on the network for the past few days, you might have had some trouble seeing new posts from other people. I’m writing to give an educated guess on why, and what I’m doing to improve things. The fixes I mention in this post are going to come in an update in the next few days or so.
In general, this is a great position to be in: Aether is picking up steam at a pretty steep pace, so when these 10x whale events happen, it’s always insightful. The reason I can work on these issues right now, as early as possible, is that you guys came in all at the same time and stressed the algorithm that chooses which nodes to connect to. So thanks for being here and using it. 🙂
What I thought it was
- I had released an update one day earlier, and I thought that was the cause for the connectivity problems. So far as I can see, it wasn’t. I reverted that update, and it seems it’s pretty much still the same.
- For some users on Windows, an antivirus or some other external issue prevents the app from writing cache files. After some debugging with some folks at the meta forum, we’ve ruled that out as the culprit. (Thanks!)
What was happening
- Aether is a flood network. For this to work effectively, every node has to choose who to connect to at any point in time. These connections happen every minute or two. These are all outbound connections.
- You also receive inbound connections. There is a limit on how many inbound connections you can receive simultaneously at any one time, to respect your internet connection and not to tax your machine. In essence, your node has a defined number of ‘slots’ that other nodes can connect to. When your node runs out of slots, it starts to tell other nodes: “I’m too busy, connect back later”.
- An Aether node, to determine which nodes to connect to, does a ‘network scan’. This is, in very simple terms, a process to check some nodes that have the highest likelihood of being online.
- This network scan uses the first two steps of a full sync. So it’s much lighter than a regular sync. It’s just a ping.
- Unfortunately, what happens is that these pings also take a slot. These slots take 60 seconds to clear, to allow for continued requests on the same slot if needed.
- When we had fewer active nodes, network scans and regular syncs were both able to fit the given number of slots at any point in time.
- When we came close to 500 concurrent nodes, that was no longer true. Since syncs hit one node, but scans hit multiple nodes, the scan traffic grew much faster than the sync traffic, clogging the slots that were actually meant for syncs. This is why the updates slowed down.
- The sync logic was a little too soft: when it failed a small number of times, it stopped trying to sync in that cycle, and left it for a future cycle. Since the large majority of the nodes in the network were saturated by scans, this meant a node kept sitting idle a lot. Giving up early was meant to reduce the number of scans, since every sync attempt would mean a new scan, but combined with the other effects mentioned above, it backed itself into a corner.
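To make the slot problem concrete, here is a minimal sketch of a node with a fixed number of inbound slots that each stay held for 60 seconds. This is a toy model, not Aether's actual code: the `Node` class, the `accept` method and the slot count of 3 are all made up for illustration.

```python
from dataclasses import dataclass, field

SLOT_HOLD_SECONDS = 60  # from the post: a slot takes 60 seconds to clear


@dataclass
class Node:
    """Toy model of a node's inbound slots (hypothetical, not Aether's code)."""
    slot_count: int
    # Time at which each occupied slot becomes free again.
    busy_until: list = field(default_factory=list)

    def accept(self, now: float) -> bool:
        """Try to take a slot at time `now`; False means "too busy, connect back later"."""
        # Drop slots whose hold period has expired.
        self.busy_until = [t for t in self.busy_until if t > now]
        if len(self.busy_until) >= self.slot_count:
            return False
        # A scan ping and a sync start look the same here, so both hold a full slot.
        self.busy_until.append(now + SLOT_HOLD_SECONDS)
        return True


node = Node(slot_count=3)
# Three scan pings arrive within a couple of seconds and fill every slot.
pings = [node.accept(now=t) for t in (0, 1, 2)]
# A real sync arriving 10 seconds later is turned away.
sync_during_scans = node.accept(now=10)
# Once the first holds have expired, a sync gets through again.
sync_after_hold = node.accept(now=61)
```

In this model the three pings did almost no work, yet they blocked the node for a full minute, which is the clogging described above.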
What’s the fix?
Bunch of things.
- The scans will be rate-limited. Now a node can do one scan every ten minutes, instead of being able to trigger a scan every time it wants. This should reduce the slots used by scans. (They are a tiny portion of the traffic—since they’re essentially pings—but since they’re indistinguishable from a sync starting, they took a full slot.)
- There’ll be separate slots, only for scans. This makes it so that the syncs will never be blocked by scans.
- Sync logic is more aggressive, since it no longer implies a full scan for every retry. It will keep retrying with different nodes using the existing scan data from up to 10 minutes ago, until it can find one that it can sync with, or it exhausts the addresses database. The cooldown for each address is increased from 3 to 10 minutes, so at any point in time, a node will hit another specific node once every 10 minutes, at most.
In short, we are separating the scans from the actual sync. A scan will just be something that happens every 10 minutes, and no longer a service other parts of the app can call any time they want. The goal is to make scans a little less chatty and dominant, since scanning on every attempt vs. every 10 minutes does not appreciably diminish the value of scans (as far as preliminary testing shows).
This makes it so that the other parts of the app that rely on this addresses table can now be more aggressive, since they are released from having to update this table themselves.
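The rate limit and the retry behaviour can be sketched roughly like this. It is only an illustration under assumptions: `SyncPlanner`, `maybe_scan`, `sync` and the injected `scan` / `try_sync` callables are hypothetical names, and the real addresses database is replaced by a plain list.

```python
SCAN_INTERVAL = 10 * 60      # one scan every ten minutes
ADDRESS_COOLDOWN = 10 * 60   # per-address cooldown, raised from 3 to 10 minutes


class SyncPlanner:
    """Hypothetical sketch of rate-limited scans plus retrying syncs."""

    def __init__(self, scan, try_sync):
        self.scan = scan            # callable returning a list of candidate addresses
        self.try_sync = try_sync    # callable(address) -> True if the sync succeeded
        self.candidates = []        # result of the most recent scan
        self.last_scan = None       # time of the most recent scan
        self.last_contact = {}      # address -> time we last hit it

    def maybe_scan(self, now):
        """Run a scan only if none has run in the last SCAN_INTERVAL seconds."""
        if self.last_scan is None or now - self.last_scan >= SCAN_INTERVAL:
            self.candidates = self.scan()
            self.last_scan = now
            return True
        return False

    def sync(self, now):
        """Walk the existing scan data until one node accepts or the list runs out."""
        for address in self.candidates:
            last = self.last_contact.get(address)
            if last is not None and now - last < ADDRESS_COOLDOWN:
                continue  # still cooling down, do not hit this node again yet
            self.last_contact[address] = now
            if self.try_sync(address):
                return address
        return None


scan_calls = []

def fake_scan():
    scan_calls.append(1)
    return ["a", "b", "c"]

# Pretend only node "c" has a free slot.
planner = SyncPlanner(scan=fake_scan, try_sync=lambda address: address == "c")
planner.maybe_scan(now=0)
first = planner.sync(now=0)        # tries a, then b, then succeeds with c
planner.maybe_scan(now=120)        # too soon, no new scan
second = planner.sync(now=120)     # every address is cooling down
planner.maybe_scan(now=600)        # ten minutes passed, scan again
third = planner.sync(now=600)      # cooldowns expired, c works again
```

The point of the split is that `sync` never triggers a scan on its own, so retrying harder does not add scan traffic.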
Lastly, having separate, dedicated slots for scans makes it so that we can give 10x the slot count only for scans, since they are effectively close-to-free to respond to.
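As a rough sketch of what separate pools could look like: the node below keeps one pool for syncs and a 10x larger one for scans. The class names and the sync slot count of 5 are invented for the example, and it assumes a request can declare whether it is a scan or a sync, and that scan slots are still held for 60 seconds. Neither assumption comes from the post.

```python
SLOT_HOLD_SECONDS = 60
SYNC_SLOTS = 5                 # hypothetical number, the post does not give one
SCAN_SLOTS = 10 * SYNC_SLOTS   # 10x the slot count, only for scans


class SlotPool:
    """A fixed number of slots, each held for SLOT_HOLD_SECONDS once taken."""

    def __init__(self, size):
        self.size = size
        self.busy_until = []

    def accept(self, now):
        # Free the slots whose hold period is over, then take one if any is left.
        self.busy_until = [t for t in self.busy_until if t > now]
        if len(self.busy_until) >= self.size:
            return False
        self.busy_until.append(now + SLOT_HOLD_SECONDS)
        return True


class Node:
    """Toy node with one pool per request kind (assumes requests declare their kind)."""

    def __init__(self):
        self.pools = {"sync": SlotPool(SYNC_SLOTS), "scan": SlotPool(SCAN_SLOTS)}

    def accept(self, kind, now):
        return self.pools[kind].accept(now)


node = Node()
# A burst of 60 scan pings: the scan pool fills up and the overflow is told to come back later.
scan_results = [node.accept("scan", now=0) for _ in range(60)]
# The sync pool is untouched by the burst.
sync_ok = node.accept("sync", now=1)
```

However large the scan burst gets, it can only exhaust its own pool.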
Why did it work before, and why didn’t it work when it grew?
Because the traffic used for scans grew faster than the traffic used for syncs, since syncs are 1:1, but scans are 1:N. So scan requests grew faster than the capacity added by new nodes joining. The changes to rate-limit the scans and give them separate slots bring them back to a more linear growth with the network’s overall capacity as new nodes are added.
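A back-of-the-envelope way to see this is Little's law: the average number of held slots equals the arrival rate times the hold time. The numbers below (a scan every two minutes, 20 or 40 nodes pinged per scan) are hypothetical and only there to show the shape of the problem. The function name is made up too.

```python
def average_busy_slots(scans_per_hour, fanout, hold_seconds=60):
    """Average inbound slots per node held by scan pings (rate x hold time)."""
    # Every node sends scans_per_hour * fanout pings per hour. Spread evenly over
    # the network, each node also receives that many on average.
    return scans_per_hour * fanout * hold_seconds / 3600


# A scan every two minutes (30 per hour), each pinging 20 nodes.
before = average_busy_slots(scans_per_hour=30, fanout=20)
# If each scan pings more nodes as the network grows, demand grows with it,
# while a node's slot count stays fixed.
bigger_network = average_busy_slots(scans_per_hour=30, fanout=40)
# Same fan-out as the first case, rate-limited to one scan every ten minutes.
after = average_busy_slots(scans_per_hour=6, fanout=20)
```

With these made-up inputs, the three cases come out to 10, 20 and 2 slots held on average. Syncs hit one node each, so they have no fan-out factor to multiply by.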
These changes involve some backend changes, therefore it’s going to be a few days, and I’ll continue to work on and improve the stability in the coming weeks as my work schedule on the business version allows. I’m writing to shed some light on what I’m doing and what’s happening behind the scenes.
You should see a steady stream of updates coming in - the changelog will carry more details. They’ll be focused on improving the scalability of the system as it gets bigger.
Growing pains y'all. Cheers!
Aether for Redditors
Hey there!
Due to the recent news about Reddit, we’ve had a few redditors coming to check us out. Which is awesome, so I wanted to write a guide about how Aether compares to Reddit, and what it does similarly and differently. Likely you’ll be fairly comfortable quickly, but there are still a few interesting aspects of Aether you might want to keep in mind as you warm up.
We are a small, friendly community, so consider this a welcome pack. 🙂
As I hear more and more questions from redditors in the community, this might be updated occasionally.
Aether is a peer-to-peer network
This is the biggest and most obvious difference. Aether has no servers. It exists … nowhere, really. As a result, Aether is an app, not a website. It’s available for Windows, Mac and Linux, and mobile apps are (eventually) coming.
This has a few implications. When you post on Aether, what happens is that your computer starts to share the content you posted. Other computers will get that content from you, and they will broadcast it to other computers, letting your post spread to the network in a sweeping fashion. As of February 2019, a post takes about ten minutes to reach the whole network.
That means the Aether app needs to keep running in the background, like an email client.
If you post something, and then close the app immediately, that content will not be delivered to other people.
If you want to be sure your post is delivered, wait half an hour or so. Aether just stays on the taskbar / menubar (like Discord), so you can close the window and it’ll continue to work in the background. In the future, the UI will show when your post has spread to the network (like the double checkmark in messaging apps).
Aether is ephemeral (like Snapchat - things disappear eventually)
Anything you post on Aether will be gone in about 6 months. This is nice, because no one can stalk your decade’s worth of Reddit history and figure out where you sleep.
This is both a philosophical and a practical thing.
It is a philosophical thing, because having information disappear vastly improves privacy. It also lets people discuss more freely, without being concerned that whatever they wrote will bite them ten years into the future. We all grow up, and we were all less experienced when we were younger. Aether tries to respect the humanity in that by deleting too-old content.
But it is also a practical thing: since it’s a peer-to-peer network, it is limited by the disk space of its participants, so we try to be respectful of that as well.
Unlike Reddit, in Aether, moderator actions are visible to users
First of all, before anything, no one can edit your posts except you. It is cryptographically impossible. You’d think no one would do that, but given the current climate, you’d (sadly) be wrong.
Beyond that, when a moderator takes an action (delete a comment, let’s say), that action is visible on the community’s mod actions feed. This is a feed of events that mods generate that shows exactly what got deleted, and the reason why.
You can disable any mod, and choose anyone as a mod
In Aether, if you don’t like what a mod is doing, you can just disable him. Flip a switch, and everything he deleted reappears. You can also choose a non-mod as a mod.
There is a ‘front page’ list of communities, called the SFW list
These SFWlisted communities are the ones that appear on the front page. This is a limited, curated list of larger communities. You can always create your own community without ever needing to get into this front-page-eligible list. You can also disable this list completely by following the instructions in the app.
Like Reddit, Aether is within the jurisdiction of the United States
That means US law applies: it is not a free-for-all. We have to remove copyrighted content via DMCA, as well as content that is illegal or has a reasonable chance of being illegal.
Aether keeps a copy of the whole network on your machine
This is why it can be so snappy: you can post offline, and when you connect, those posts will be spread to the network. The ‘whole of the network’ is actually very small, because Aether only carries compressed text. It doesn’t carry images, videos, or anything else, so you need to post to Imgur or other image hosts, similar to Reddit.
This also emphasises the importance of no-illegal-content mentioned above. Since we all carry the text of the whole of the network, it’s in all our best interest to keep the network clean. It’s very hard for text to be illegal, but it’s up to all of us to keep it that way. If you see something illegal, use the report button, or send an email with a link to it.
(And yes, there are guards to prevent spammers from creating a million posts and bloating the network size, such as required proofs-of-work.)
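For the curious, this is the general idea behind a proof-of-work, shown as a hashcash-style sketch. It is not Aether's actual scheme: the hash function, the nonce encoding, the difficulty of 12 bits and the function names are all assumptions picked to keep the example short.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(post: bytes, nonce: int, difficulty: int) -> bool:
    """Cheap check for the receiver: one hash, no search."""
    digest = hashlib.sha256(post + b":" + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def mint(post: bytes, difficulty: int) -> int:
    """Expensive search for the sender: try nonces until the hash has enough zero bits."""
    for nonce in count():
        if verify(post, nonce, difficulty):
            return nonce


# About 2**12 hashes on average to create, a single hash to check.
nonce = mint(b"hello b/Meta", difficulty=12)
is_valid = verify(b"hello b/Meta", nonce, difficulty=12)
```

The asymmetry is what matters: one post costs the author a little CPU time, a million posts cost a spammer a million times that, and every receiving node can reject a post without valid work almost for free.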
Aether is a work-in-progress
Despite the UI, Aether is still very much a work in progress. There are parts of the app that are being worked on, such as elections, being able to add a second mod to a community (Aether communities are denoted as b/Community instead of r/Subreddit) and so on. Things will break, and perhaps repeatedly so. At this point (a month after its release in December) things generally work provided that you have a stable internet connection and can keep the app open appropriately. Nevertheless, this is alpha software. If you have any bugs or feature requests, file them at https://meta.getaether.net. b/Meta is also a good place. (The link requires you have Aether installed)
Like Reddit, you can link to Aether from the web
Here’s an example link:
aether://board/86e782e80681ac580b4d6d102b12e787c066e59f194fee57bb0bf83cc1e42fc6
(this links to b/Meta)
Notice aether:// instead of http:// at the beginning. As mentioned above, it needs the recipient to have Aether installed, though. We’ll eventually have a preview site on the regular web that can show content without needing it installed, but again, work in progress. 🙂
If you want to post on Reddit or Twitter, and have it be recognised as a link, you can shorten the links at TinyURL, which accepts and shortens Aether links. Or if the place supports Markdown, like Reddit, you can always do:
[my link name](aether://board/86e782...e42fc6)
And make it show up as a link that way.
Aether is paid for by the ‘unique’ (orange) usernames and its business version
Since the current conversation is around how Reddit is funded, I want to be completely transparent about how Aether makes money (it makes very, very little money) as well. Here’s how this works.
a) Similar to Reddit’s gold, if you want to support Aether, you can buy a ‘unique’ username (with a checkmark, like Twitter) that makes you the unique owner of that username for the donation duration. If you want to do so, check out the Patreon.
b) Aether also has an upcoming business version that allows a company to purchase a private instance of Aether for their own internal use. This comes with a few nice additions, like being able to use email to create threads and posts, and get emails back when other people post. It’s good as a productivity tool, and it’s much better than Slack because it interrupts your people less. If you’re a tech lead and interested in piloting this with your team, please reach out via email and we’ll set you up.
Sounds interesting? Try Aether here. Hope to see you around!