In my last post I promised to talk a little about the technology that underlies Urbantastic. It’s not the usual suspects, so it’s worth some explanation.
Warning: Here be severe geekery.
Urbantastic is a typical web 2.0 site in a lot of ways: users posting messages to each other has been done before, certainly. But in the main it’s completely new.
The site shares certain aspects with an online forum, but has a very different set of features - crowdsourcing and micro-volunteering to name two. There are also similarities to Facebook, but Urbantastic is not p-to-p; it’s p-to-“organizations-doing-great-things” instead.
It quickly became obvious that off-the-shelf CMSs weren’t going to cut it, nor were the various open source social networking projects. So we decided to build a whole new website.
Framework or Not?
What would power Urbantastic? It was down to two criteria:
- I need technology that will make the best use of every minute I have. We’re trying to make a big impact with tiny resources: I’m the only coder, and I only get to spend about half of my time coding. ROI is key, even if it means taking risks and learning new ways of doing things.
- I have to like what I’m working with. I’ll be working on the guts of this site for years to come, and for morale reasons it’s a big win if it’s a pleasant place to be.
Right now sites like ours typically get made with web frameworks: Ruby+Rails, Python+Django, PHP+Symfony. My experience with frameworks is that the first 90% is breezy, the final 10% a horror.
I have a strong belief in treating users very well. In my experience this invariably comes down to breaking abstractions. Humans, and their expectations of how things should work, are impossible to fully abstract. There’s always going to be one button on your site that acts completely differently from the rest, or there’s going to be one page that absolutely shouldn’t reload, but that’s the way the framework wants to do it.
So you either hack the framework, or you end up recreating part of it, poorly. In either case you have this awful (and typically expanding) pile of code which you can’t send upstream because it’s too specific, and breaks with every new revision of the framework.
Of course the alternative is to create your own framework (intentionally or not). This is usually a worse option. You don’t get free bug fixes and plugins, it takes ages to recreate all the functionality, and instead of tens or hundreds of developers putting in polish and rounding off edges you’ve got just one.
It’s a classic rookie mistake: instead of taking the time to learn how it’s been done elsewhere, you arrogantly start building your own little castle.
And yet, it’s the second route that I’ve chosen. The reason is that when you’re doing things in a genuinely new way, it’s your only option.
How things fit together
Change in technology, the good stuff, generally comes from two places: the first is when an existing technology gets better in a linear way. The second is when new technology arises out of a changed set of circumstances.
The template for the current web frameworks has been around long enough that I don’t think I can add much more in that direction. However, I do believe that there’s been a fundamental shift that will support a better solution: serious client-side scripting is now possible. It’s a bigger change than most realize.
In most frameworks, dynamic data and static HTML templates are first combined on the server and then sent to the client. However, now that Javascript is usable, performing the combination on the client side is a very attractive alternative.
All the HTML in Urbantastic is completely static. All dynamic data is sent via AJAX in JSON format and then combined with the HTML using Javascript. Put another way, the server software for Urbantastic produces and consumes JSON exclusively. HTML, CSS, Javascript, and images are all sent via a different service (a vanilla Nginx server).
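The split described above can be sketched in a few lines. This is a minimal illustration, not Urbantastic's actual code: the {{name}} placeholder syntax, the Message shape, the fillTemplate helper, and the sample payload are all assumptions made up for the example. It shows a static HTML fragment being combined with JSON data by a pure function, the kind of step that would run in the browser after an AJAX response arrives.

```typescript
// Hypothetical shape of a message as a JSON server might send it.
interface Message {
  author: string;
  body: string;
}

// Escape text before placing it into HTML so user content cannot inject markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Fill {{name}} placeholders in a static template with values from a JSON object.
// Unknown placeholders are replaced with an empty string.
function fillTemplate(template: string, data: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = data[key];
    return value === undefined ? "" : escapeHtml(value);
  });
}

// Static part: in this scheme it would be served as a plain, cacheable file.
const messageTemplate = '<li class="message"><b>{{author}}</b>: {{body}}</li>';

// Dynamic part: JSON text as it might arrive from an AJAX call (made-up payload).
const payload = '[{"author":"Ann","body":"Can anyone lend a <ladder>?"}]';
const messages: Message[] = JSON.parse(payload);

// Combine the two on the client.
const html = messages
  .map((m) => fillTemplate(messageTemplate, { author: m.author, body: m.body }))
  .join("\n");

console.log(html);
```

Because the template never changes, it can be cached indefinitely; only the small JSON payload has to travel on each request.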
Why is this a good thing? First, it makes one part of scalability trivial. Static content is cached at several levels, and can be limitlessly replicated. I can drop the vast majority of all the data we send out onto a CDN like Akamai on a moment’s notice at their cheapest rate; purely-static load balancing is a godsend once a single server starts to show strain. Also, caching makes for a much faster user experience: after the first load the only data ever sent is purely new information.
Further reduction of load on the server happens because a good chunk of computation has been pushed out to the client. Instead of seeing the server and client as separate things, I like to think of a website as being run on an ad-hoc cluster of computers, with the server coordinating the client and CDN nodes.
Another benefit comes from the fact that web browsers are not the only clients that will use Urbantastic. Mobile devices, search engine spiders, screen readers for people with disabilities, and RSS readers all need the same data but in different forms. Accommodating any of these is simply a matter of dropping a different rendering front-end in front of the common JSON data server.
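As a rough sketch of that idea, each front-end can be treated as a function from the same JSON data to a different output format. Everything named here (the Opportunity shape, the Renderer type, the three renderers, the sample record) is hypothetical and only meant to illustrate the pattern, not how Urbantastic is actually structured.

```typescript
// Hypothetical record returned by the common JSON data server.
interface Opportunity {
  title: string;
  organization: string;
  url: string;
}

// A front-end is just a function from the shared data to some output format.
type Renderer = (items: Opportunity[]) => string;

// Escape text for use in HTML/XML content and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Front-end for browsers: an HTML list.
const renderHtml: Renderer = (items) =>
  "<ul>" +
  items
    .map(
      (i) =>
        `<li><a href="${escapeXml(i.url)}">${escapeXml(i.title)}</a> (${escapeXml(i.organization)})</li>`
    )
    .join("") +
  "</ul>";

// Front-end for feed readers: stripped-down RSS-style items.
const renderFeedItems: Renderer = (items) =>
  items
    .map((i) => `<item><title>${escapeXml(i.title)}</title><link>${escapeXml(i.url)}</link></item>`)
    .join("\n");

// Front-end for text-only clients: plain sentences.
const renderText: Renderer = (items) =>
  items.map((i) => `${i.title}, posted by ${i.organization}.`).join("\n");

// The same made-up data feeds every front-end.
const data: Opportunity[] = [
  { title: "Paint the hall", organization: "Community Centre", url: "https://example.org/1" },
];

const frontEnds: Array<[string, Renderer]> = [
  ["html", renderHtml],
  ["feed", renderFeedItems],
  ["text", renderText],
];

for (const [name, render] of frontEnds) {
  console.log(name + ": " + render(data));
}
```

Adding a new kind of client then means adding one more renderer; the data server itself does not change.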
Finally, the complete separation of static and dynamic elements of the site makes everything a lot easier to reason about. That’s a bigger topic than I have space for, but fans of functional languages will know what I’m talking about.
The Language
Another, more concrete, reason for not using a framework is that there aren’t any mature ones for the language that I’ve chosen (sorry Compojure). Clojure is still very new.
Why would I be using such an odd language? The entire blame rests with Paul Graham. He writes great essays, and a lot of them (when I started reading them) were about Lisp.
Lisp is an old language: it’s second only to Fortran in age amongst languages still in use. For many and various reasons it’s never been very popular. But it’s hung around for a reason - it’s an elegant weapon, for a more civilized age. Yeah, I just pulled out a Star Wars metaphor. It’s that good.
Lisp is a language from the time of mathematicians, not code monkeys. Like Haskell, it takes brainpower to use the language properly, but also rewards deep thought disproportionately. Don’t get me wrong - I’m no elitist. I’m happy that more and more people are learning to code; I think it should be considered a basic aspect of literacy. But once you’ve written enough code in enough languages you start craving certain things.
The task of excellent programming is in making things simpler. Simple is easier to maintain, it’s quicker to write, it feels better, and it makes you happier. Every language I’ve ever used lets you simplify to a point, but no further. Lisp has a point like this, but it is far distant from the others. Problems that start out as a big mess of complexity get smaller and smaller until they nearly wink out. It’s a beautiful feeling.
I’m just one person. I can’t maintain a lot of code, so I’m planning on not writing much of it. Lisp lets me do that. So a language like Clojure (which is a Lisp dialect) satisfies both of my prime criteria of power and pleasantness.
The Database
The final unusual part of the setup at Urbantastic is the database. The site originally ran on Google App Engine, which in theory should have been perfect - why we left them is a whole other blog post. But this history left me with a database structure that didn’t fit well into an SQL RDBMS. A search for a more BigTable-like replacement quickly led to CouchDB.
In my mind this is the highest-risk component of the technology strategy. The codebase has just left alpha, and the debate about the relative merits of relational vs. document-oriented databases has just begun. There’s no way to know where the pitfalls are going to be as the site gets larger and the features pile in.
Theoretically CouchDB is a perfect companion to the static/dynamic split that I’ve discussed above. It’s well and good to have your HTML served from all over, but having it all access just one server can get you into trouble quick - it’s like DDOSing yourself. One of the most talked-about aspects of CouchDB is replication, which is the way to handle this.
The complexity of the replication schemes in the more traditional databases has always sat wrong with me. The way that CouchDB does it hasn’t proven out yet, but its simple obviousness is enough to give me great hope.
What I know now is that it’s been an extremely pleasant experience. It’s taken some brain-twisting to fit into how things are done, but the simplicity of the whole setup is very appealing. Finally, the intuitiveness of determining the cost of any operation is well worth the price of admission - every read is just an indexed lookup, and the cost of writes - while more complex - is equally understandable.
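The cost model in that last sentence can be illustrated with a toy. What follows is an in-memory model only, written to show the shape of the trade-off; it is not CouchDB's API or storage engine, and every name in it (Doc, MapFn, ToyView, the city names) is invented for the example. A "map" function files each document under a key when it is written, so a read is one lookup by key rather than a scan.

```typescript
// Toy in-memory model only: none of these names come from CouchDB.
interface Doc {
  id: string;
  type: string;
  city: string;
}

// A "map" function decides which key a document is filed under (or none).
type MapFn = (doc: Doc) => string | null;

class ToyView {
  private index: Record<string, string[]> = {}; // key -> ids of docs filed there
  private keyOf: Record<string, string> = {};   // doc id -> key it is filed under
  private mapFn: MapFn;

  constructor(mapFn: MapFn) {
    this.mapFn = mapFn;
  }

  // Write path: one map call, plus moving at most one index entry.
  write(doc: Doc): void {
    const oldKey = this.keyOf[doc.id];
    if (oldKey !== undefined) {
      // The document was indexed before: remove its old entry first.
      const oldIds = this.index[oldKey] || [];
      this.index[oldKey] = oldIds.filter((id) => id !== doc.id);
      delete this.keyOf[doc.id];
    }
    const key = this.mapFn(doc);
    if (key === null) return; // this view ignores the document
    const ids = this.index[key] || [];
    ids.push(doc.id);
    this.index[key] = ids;
    this.keyOf[doc.id] = key;
  }

  // Read path: a single lookup by key, with no scan over all documents.
  read(key: string): string[] {
    return (this.index[key] || []).slice().sort();
  }
}

// A view of volunteer opportunities, keyed by city (made-up data).
const byCity = new ToyView((doc) => (doc.type === "opportunity" ? doc.city : null));
byCity.write({ id: "a", type: "opportunity", city: "Springfield" });
byCity.write({ id: "b", type: "opportunity", city: "Springfield" });
byCity.write({ id: "c", type: "user", city: "Springfield" });       // not indexed by this view
byCity.write({ id: "a", type: "opportunity", city: "Shelbyville" }); // update moves "a"

console.log(byCity.read("Springfield"));
console.log(byCity.read("Shelbyville"));
```

In this toy, the work happens when a document changes, which is why the read side stays cheap and predictable.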
In any case, the backup plan is to put a relational db alongside CouchDB if the need ever arises. But I’m beginning to suspect it won’t.
And with that, we finish our tour of the inner workings of Urbantastic.
Good Technology and Bad Movies
The image I have of the technical side of my job is that of the sailboat from Waterworld. Awful movie, beautiful boat. When it’s just you on a 70-foot trimaran, you need to rely much more than usual on technology to make sure things keep going right. The protagonist had all sorts of custom winches, harnesses and levers to perform the work of several people.
It was a weird boat. But it did far more with only one person at the helm than it should have been able to. Only time will tell, but I hope to say the same thing about Urbantastic one day.
Heath