Emojitracker was one of those projects that was supposed to be a quick weekend hack but turned into an all-consuming effort that ate up my nights for months. Since its launch in early July, Emojitracker has processed over 1.8 billion tweets and has been mentioned in approximately a gajillion online publications.
Emojitracker wasn’t my first megaproject, but it is definitely the most complex architecturally.
While emojitracker has been open source since day one, the technical concepts are complex and varied, and the interesting parts are not necessarily obvious from browsing the code. Thus, rather than a tutorial, in this post I intend to write about the process of building emojitracker: the problems I encountered, and how I got around them.
This is a bit of a brain dump, but my hope is that it will be useful to others attempting to do work in these topic areas. I have benefited greatly from the collective wisdom of others in the open-source community, and so I always try to contribute domain knowledge back into the commons.
This post is long, and is primarily intended for a technical audience. It covers the origin story and ideas for emojitracker, the backend architecture in detail, frontend client issues with displaying emoji and high-frequency display updates, and the techniques and tools used to monitor and scale a multiplexed real-time data streaming service across dozens of servers with tens of millions of streams per day (on a hobby project when you don’t have any advance warning!).