Aether v1.1 is here!

​March 15th, 2014, 23.48

After a radio silence, Aether is back in town!

This version brings significant improvements to all steps of the network lifecycle and solves every performance issue I could observe in prior versions. The networking engine has been rewritten from scratch, and so has the database read/write system. It also includes significant user interface improvements based on user feedback.

This is pretty much a new, much more stable app that can communicate with orders of magnitude more clients without creating as much strain. This is quite possibly the first non–alpha version of Aether.

This release, however, is not yet packaged for Windows, Mac OS or Linux. The reason is at the bottom of this article. tl;dr: I need your help fixing a bug.

Changes & Improvements

Backend

Database

The old engine spawned a thread for every single network action (a message arriving from the network, for example) under the assumption that Python’s threading overhead was minimal. Additionally, since much of the work done in each thread consisted of handing data over to the native C binaries behind SQLAlchemy, which can release the Global Interpreter Lock, I thought threading wouldn’t be a bottleneck. I could not have been more wrong.

It also created issues with database consistency. I did checks at every possible location, but if two threads were committing the same post that came across from two different users, they could not see each other, and I would end up with a duplicate record.
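The duplicate-record problem can be replayed deterministically. The following is only a sketch, not Aether's actual code: it uses the standard library's sqlite3 instead of SQLAlchemy, two connections stand in for the two threads, and the `posts` table, its columns and the `abc123` fingerprint are hypothetical. It also assumes the table has no uniqueness constraint, so only the application-level check guards against duplicates.

```python
import os
import sqlite3
import tempfile


def count_posts(connection, fingerprint):
    """Application-level duplicate check: how many copies are stored?"""
    cursor = connection.execute(
        "SELECT COUNT(*) FROM posts WHERE fingerprint = ?", (fingerprint,)
    )
    count = cursor.fetchone()[0]
    cursor.close()
    return count


def check_then_insert_race(db_path):
    """Replay two writers that both check before either one commits."""
    setup = sqlite3.connect(db_path)
    # Hypothetical table with no UNIQUE constraint on the fingerprint.
    setup.execute("CREATE TABLE posts (fingerprint TEXT, body TEXT)")
    setup.commit()
    setup.close()

    writer_a = sqlite3.connect(db_path)
    writer_b = sqlite3.connect(db_path)

    # Both writers run their check first; neither can see the other's post.
    a_sees = count_posts(writer_a, "abc123")
    b_sees = count_posts(writer_b, "abc123")

    # Each writer then acts on its own, already stale, answer.
    for writer, seen in ((writer_a, a_sees), (writer_b, b_sees)):
        if seen == 0:
            writer.execute("INSERT INTO posts VALUES (?, ?)", ("abc123", "hello"))
            writer.commit()

    total = count_posts(writer_a, "abc123")
    writer_a.close()
    writer_b.close()
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(check_then_insert_race(os.path.join(tmp, "race.db")))
```

Both checks pass because each runs before the other writer commits, so the function returns 2: the same post is stored twice.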

Moreover, the lag caused by thread switching and the Interpreter Lock struggling to keep up killed the performance of the user interface. The UI relied on control being returned to it at least 60 times per second (60fps), plus its own delays for JavaScript interpretation, for smooth animations. I was, in essence, running an interpreter (V8, JavaScript) inside another (Python).

These were the main causes of the performance problems experienced under strain. The thread count ballooned like beautiful, horrifying fractal flowers blooming in the eastern sky.

How it works now

I got rid of all the threading I could. The new approach is based on the idea of delayed commits: instead of trying to move all data arriving from the network into the database immediately (and incurring the duplication check overhead at every hit), Aether now pools the data into a cache. This cache commits itself to the database at regular intervals. When a commit cycle hits, the cache first runs a consistency check within itself to make sure it does not contain duplicate entries. It then runs the consistency check against the database, and only after fixing potential issues does it write the data to the database.

This has a few benefits. The most obvious one is that the thread count is reduced from thousands to just one. Doing the consistency checks together also reduces the performance impact. Since all of this happens either in memory (the in-cache check), and is thus imperceptibly fast, or against the filesystem (where the SQLAlchemy C binaries can release the GIL to let the Python interpreter continue working), it runs without blocking the main event loop.

It also reduces complexity in other parts of the application, as they are no longer concerned with whether the arriving piece of information is an update, new data, or an exact copy of what it has. Also gone is the requirement to handle consistency issues.
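The delayed-commit idea can be sketched in a few lines. This is an illustration under assumptions, not Aether's implementation: it uses the standard library's sqlite3 rather than SQLAlchemy, the `CommitCache` class, the `posts` table and the fingerprint key are hypothetical, and it only separates new posts from exact copies (updates are left out). In the real application something like Twisted's `LoopingCall` could call `commit()` at a fixed interval; here it is called by hand.

```python
import sqlite3


class CommitCache:
    """Pools incoming posts in memory and writes them in one batch."""

    def __init__(self, connection):
        self.connection = connection
        # fingerprint -> body; dict keys give the in-cache duplicate check.
        self.pending = {}

    def add(self, fingerprint, body):
        # Network handlers only touch memory; no database access per event.
        self.pending[fingerprint] = body

    def commit(self):
        """Check the pooled batch against the database, then write what is new."""
        if not self.pending:
            return 0
        fingerprints = list(self.pending)
        # Note: very large batches would need chunking, since SQLite caps
        # the number of "?" parameters per statement.
        placeholders = ",".join("?" * len(fingerprints))
        known = {
            row[0]
            for row in self.connection.execute(
                "SELECT fingerprint FROM posts WHERE fingerprint IN (%s)"
                % placeholders,
                fingerprints,
            )
        }
        fresh = [(fp, body) for fp, body in self.pending.items() if fp not in known]
        with self.connection:  # one transaction for the whole batch
            self.connection.executemany("INSERT INTO posts VALUES (?, ?)", fresh)
        self.pending.clear()
        return len(fresh)


def demo():
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE posts (fingerprint TEXT, body TEXT)")
    connection.execute("INSERT INTO posts VALUES ('old', 'already stored')")
    cache = CommitCache(connection)
    # One post the database already has, and one new post that arrives twice.
    for fingerprint in ("old", "new", "new"):
        cache.add(fingerprint, "body of " + fingerprint)
    written = cache.commit()
    total = connection.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return written, total
```

In `demo()`, three arrivals result in a single insert: the repeated post collapses inside the cache and the already-stored one is filtered out by the database check, so the function returns `(1, 2)`. The callers of `add()` never need to know which case they hit.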

Interface Performance

As I briefly mentioned above, the UI suffered from locking and resource starvation in the backend. While these issues are ‘fixed’ in that it works well on a regular computer without stressing it or spinning up the fans, given a sufficiently low powered computer or sufficiently many network events, they are bound to surface again. Running the GUI in the networking event loop is inherently a suboptimal solution.

How it works now

Aether’s interface now runs in an entirely separate process, without any dependency on the networking engine. This helps in both directions: the GUI is free of the requirement to return control just to keep the application running, and the networking core is free of UI performance concerns. It can spend as much time as it needs on its cycles, safe in the knowledge that the user can continue using the application even while the networking engine is busy.
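As a rough picture of the two-process layout, here is a minimal sketch that uses only the standard library's subprocess module as a stand-in for Ampoule, which is what Aether actually uses. The JSON-lines message format and the helper names (`start_worker`, `ask`, `stop_worker`) are made up for illustration. The parent plays the GUI and the child plays the networking core.

```python
import json
import subprocess
import sys

# Source of the stand-in "networking core", run in a second interpreter.
WORKER_SOURCE = r"""
import json, sys
while True:
    line = sys.stdin.readline()
    if not line:  # EOF: the parent closed the pipe
        break
    request = json.loads(line)
    reply = {"id": request["id"], "result": request["text"].upper()}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


def start_worker():
    # sys.executable is the interpreter running this script.
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )


def ask(worker, request_id, text):
    """Send one request over the pipe and wait for the matching reply."""
    worker.stdin.write(json.dumps({"id": request_id, "text": text}) + "\n")
    worker.stdin.flush()
    return json.loads(worker.stdout.readline())


def stop_worker(worker):
    worker.stdin.close()  # EOF ends the worker's read loop
    exit_code = worker.wait()
    worker.stdout.close()
    return exit_code


if __name__ == "__main__":
    worker = start_worker()
    print(ask(worker, 1, "hello"))
    print(stop_worker(worker))
```

`ask()` blocks until the reply arrives, which keeps the sketch short. A real GUI would read replies asynchronously so that a slow networking cycle never stalls the interface, which is the whole point of the split.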

Frontend

User Interface & Experience

The interface of 1.0 was based on conceptual purity, which allowed me to represent the core concepts of the network and backend architecture directly in the interface. It made sense, if you knew what was going on. That decision did not come from a belief that it was necessary, or even that it would be good user experience. I designed it that way because it allowed me to iterate as fast as possible without trying to juggle two separate abstractions.

That meant it wasn’t optimised for a regular user who might not necessarily have prior knowledge of forums or other boards such as Reddit, Slashdot, etc.

Aether is a bit more mature now. Since Aether’s inception, I have had time to get a lot of feedback about the interface, and I am happy to report I have fixed, hopefully, most of the more egregious problems. There is still a long way to go, especially on features such as implicit saves (so your post doesn’t disappear when you navigate away while writing it) or a back button, which involves much more complexity than it appears on the surface. In general, though, I believe this is a much more sensible interface, and it’s just as pretty. Of course I welcome all feedback, so let me know what you think.

Protocol

Aether v1.1 still uses Aether Protocol v1.0. After some deliberation I decided not to upgrade the protocol itself for this version, since my plate is a bit full with improving the app itself for the moment. AP v1.1 will introduce third party application compatibility with other Aether clients, so you can write your own application using the Aether protocol, and it will be able to talk to any and all other Aether clients.

For this, I will launch a separate site, which will document the entirety of the protocol and example responses. In hindsight, maybe naming the application (Aether) and the protocol (Aether Protocol) the same wasn’t that brilliant an idea…

This is where I need your help.

It runs well if you have the dependencies installed on your machine. Just run main.py and it should work.

However, I can’t package this thing.

Ridiculous as it sounds, my new dependency, Ampoule, which I am using to split the application into two processes (networking core and GUI) is proving to be very hard to package with PyInstaller.

What happens

When I package the application and run the packaged app, it starts spawning copies of itself endlessly. This is because the GUI process is crashing and Ampoule automatically restarts crashed processes.

Clues

If you run Aether 1.1 unpackaged, you’ll see that it logs to the Logs directory in the application folder. When packaged, nothing is logged from the GUI process. It’s crashing before the logging system fires. Since Twisted’s logging is the first thing to load after the imports, this points to import errors.

Considering that Ampoule works by invoking another Python interpreter and setting up a two-way communication bridge over file descriptors, my guess is that, when packaged, Ampoule tries to invoke the system Python, not the packaged one which has all the dependencies loaded. However, I have no proof of this (no logs, no exceptions, remember), and I have no idea how to fix it. It might also be an entirely different issue.
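One way to gather evidence without relying on Twisted's logging is to have the child process record which interpreter it is running on before any third-party import happens. The following is a standard-library-only sketch: the function names and the `aether_child_report.jsonl` file name are hypothetical, and where exactly to call it in the GUI entry point is left open. PyInstaller sets `sys.frozen` in packaged apps, which the report captures alongside `sys.executable` and `sys.path`.

```python
import json
import os
import sys
import tempfile


def interpreter_report():
    """Collect facts about the interpreter this process is running on."""
    return {
        "pid": os.getpid(),
        "executable": sys.executable,
        "frozen": bool(getattr(sys, "frozen", False)),
        "argv": list(sys.argv),
        "path": list(sys.path),
    }


def write_report(directory=None):
    """Append the report to a file, touching nothing outside the stdlib."""
    directory = directory or tempfile.gettempdir()
    # Hypothetical file name; one JSON object per line, one line per process.
    target = os.path.join(directory, "aether_child_report.jsonl")
    with open(target, "a") as handle:
        handle.write(json.dumps(interpreter_report()) + "\n")
    return target


if __name__ == "__main__":
    # Call this as the very first statement of the child entry point,
    # before importing Twisted, PyQt or anything else that might fail.
    print(write_report())
```

If the packaged app leaves one line per spawned process in that file, the `executable` and `path` values should show whether the child runs on the system Python, on the packaged interpreter, or on the packaged app binary itself. If the child is launched through `sys.executable` (an assumption about Ampoule that is worth verifying), the last case would be consistent with the app respawning itself, because in a frozen build that path points at the packaged executable rather than at a Python interpreter.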

What you can do

Please get this thing to package. Here is the PyInstaller spec file. This is a complex problem with many moving parts—if you think you’re good at Python, here’s your challenge!

Why aren’t you doing this yourself?

I have tried. This version was actually completed two weeks ago, and since then I have been hopelessly attempting to debug this issue in my spare time, which has been reduced to nil since I started working for Google at the beginning of March. I will continue developing Aether, but I no longer have hundreds of hours to sink into debugging this.

Dependencies

To run the application you get from the GitHub repository, you need Twisted, Ampoule, ujson, SQLAlchemy, jsondate, miniupnpc, pyOpenSSL, OpenSSL, termcolor, PyQt, and Qt. Download and install Qt first, compile PyQt against your version (I am using Qt 5.1), then get OpenSSL according to the guide in the pyOpenSSL documentation. You need to get miniupnpc from the project’s site. The default package also includes a Python module, so just install that. The rest is a pip away. Aether should run equally well on Windows, Mac and Linux (Debian variants). Windows and Mac should work as is, but some Linux shims are incomplete (e.g. where to save user profile files), so you might need to fix a few minor annoyances like that. I haven’t yet released an official Linux version, hence the incomplete state.

If you want to help me, you’ll need PyInstaller to package this. If you can get any other tool (py2app etc.) to package it in a cross–platform way (i.e. the tool you’re using can produce Windows, Linux and Mac OS packages), that’s also perfectly fine. I would recommend getting it to run unpackaged before attempting to package it yourself. Even if you can’t get it to package, any clues you end up with would be immensely helpful.

Thank you!

Burak