September 25, 2008 by alex
Monday was the first event at our new upstream office, starring @janl and me presenting an introduction to CouchDB - including a new hands-on examples part - and afterwards an overview of what can be done with CouchDB in Ruby so far. The talks were followed by a discussion that gave me (and hopefully not only me) a couple of new insights I want to share here - after summarizing the evening for the people who couldn’t make it. A video recording of both talks with synchronized slides will also be available in the next few days (thanks @klimpong).
CouchDB is the new cool kid on the block. It’s a document-oriented database that has replication built in, can scale massively and is queried through an HTTP REST interface. Documents are stored as JSON constructs and can be queried with views that are built using Map-Reduce (a smaller company called Google has had a bit of success with that recently). Oh and it’s written in Erlang. Jan has given a number of talks at numerous events already, so there are already a couple of <a href="http://mwrc2008.confreaks.com/10lehnardt.html">videos</a> and slides available - not from the hands-on part though :) For that you should watch his blog I guess.
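To give a feel for what a map-reduce view does, here is a minimal in-memory sketch in plain Ruby. It is only an imitation of the idea: real CouchDB views are written in JavaScript, stored in the database and computed by the server. The documents and the names `DOCS`, `map_talk`, `reduce_sum` and `run_view` are made up for this illustration.

```ruby
require 'json'

# Made-up documents, parsed from JSON the way they would arrive over HTTP.
DOCS = [
  '{"_id": "1", "type": "talk", "speaker": "jan", "minutes": 45}',
  '{"_id": "2", "type": "talk", "speaker": "alex", "minutes": 30}',
  '{"_id": "3", "type": "talk", "speaker": "jan", "minutes": 20}',
  '{"_id": "4", "type": "discussion"}'
].map { |raw| JSON.parse(raw) }

# Map step: called once per document, emits zero or more [key, value] pairs.
def map_talk(doc)
  return [] unless doc['type'] == 'talk'

  [[doc['speaker'], doc['minutes']]]
end

# Reduce step: combines all values emitted for one key into a single value.
def reduce_sum(values)
  values.sum
end

# Run the map over every document, group the emitted pairs by key,
# then reduce each group.
def run_view(docs)
  emitted = docs.flat_map { |doc| map_talk(doc) }
  emitted.group_by { |key, _| key }
         .map { |key, pairs| [key, reduce_sum(pairs.map { |_, value| value })] }
         .to_h
end
```

Calling `run_view(DOCS)` gives the total minutes per speaker. Note that the discussion document is simply skipped by the map step, which is how a view filters documents without any schema.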
So as promised in the title of this post I am now going to concentrate on the needs of people like me: people who have been using some sort of SQL based database for ages and things like ActiveRecord for not so many ages, but long enough to wash their brains into thinking in relations and joins and columns and all that. Btw here are my slides.
My talk starts with a couple of random thoughts (that are clearer to me now after talking them through with some smart people). With relational databases we used to have tables, columns, rows, joins and SQL queries. We were mapping complex objects onto flat tables and since that doesn’t work so well we used multiple tables and helper tables to make it all happen. With CouchDB pretty much all that is gone. Instead of rows in tables with a fixed schema we now have schema-less documents that can store pretty much any kind of data structure. Instead of SQL with its known power and problems we now have those map-reduce views that work completely differently. On the one hand that means more freedom in how to do stuff, on the other it means more questions about how to actually do it. To make a long story short: the essential problem here is to get rid of our relational mindset. That applies to designing databases as well as to designing tools and libraries for working with CouchDB. (Apparently a book called document design will help with this, can’t find it on Amazon though.) Which leads me to the next topic of my talk.
I have picked three frameworks that are currently available as open source: CouchRest, ActiveCouch and RelaxDB.
CouchRest is a thin wrapper that provides a simple API for getting, putting, posting and deleting JSON documents to and from the CouchDB HTTP interface. I don’t actually have much more to say about it. It works, it doesn’t do a lot and that’s all great. Use it. No objections from my side.
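To show what such a thin wrapper boils down to, here is a rough sketch using only Ruby's standard library. This is not CouchRest's API: `couch_request` is a hypothetical helper that only builds the HTTP request for a document operation without sending it, and the database name `talks` is made up.

```ruby
require 'json'
require 'net/http'

# Hypothetical helper (not part of CouchRest): builds, but does not send,
# the HTTP request for a basic document operation.
def couch_request(verb, path, doc = nil)
  request_class = {
    get: Net::HTTP::Get,
    put: Net::HTTP::Put,
    post: Net::HTTP::Post,
    delete: Net::HTTP::Delete
  }.fetch(verb)

  request = request_class.new(path)
  unless doc.nil?
    # Documents travel as JSON in the request body.
    request['Content-Type'] = 'application/json'
    request.body = JSON.generate(doc)
  end
  request
end

# Sending it would look roughly like this, assuming a CouchDB server
# listening on its default port 5984 on the local machine:
#
#   request = couch_request(:put, '/talks/intro', { 'title' => 'Intro' })
#   Net::HTTP.start('127.0.0.1', 5984) { |http| http.request(request) }
```

Everything else a wrapper like this adds is convenience on top: parsing the JSON responses and keeping track of document revisions, which CouchDB requires for updates and deletes.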
ActiveCouch is - as the name suggests - inspired by ActiveRecord. It provides an ActiveCouch::Base class you can inherit from. After that you have to define the properties you want to store (obviously you can’t get those from the database schema anymore as with ActiveRecord) and then you can start creating, saving and finding objects. They have also integrated migrations that allow you to create (very simple) views using a Ruby syntax. Without going into too much detail, I think these guys haven’t come very far in moving from the relational to the document mindset. CouchDB just doesn’t fit very well with migrations and records. Plus there is hardly any documentation to be found about it at all. Sorry, not going to use it as it is now.
RelaxDB at first looks very similar to ActiveCouch. Inherit from base class, define properties, create, save, find by id. But: they don’t have migrations (one step further away from SQL, one step closer to CouchDB). Instead they automatically generate a view when you call a find method for the first time. Sounds like a good idea to me. RelaxDB also supports basic associations, so you can declare a has_many relationship between two classes and they get stored in separate documents. Definitely useful but still the same as ActiveRecord.
CouchPotatoe now is my own take on a Ruby persistence layer for CouchDB. Since I have only been coding on anything CouchDB related for 3 days I am not claiming it to be anything close to useful or better than any other framework. For now its main purpose is to help me learn Couch (as we long time CouchDBers say ;) ) and its principles. But that might change in the future, who knows…
So anyway, before letting you in on the huge set of features let me stress this again: the problem to solve here is not to clone ActiveRecord to work with Couch (we don’t have records anymore, we have documents) but to come up with new concepts that help with writing apps on top of documents.
So the first and currently only feature (the CouchPotatoe::Persistence module) doesn’t actually do much more than the other frameworks (but a bit :) ). It converts a Ruby object into JSON, stores it in CouchDB, and later loads it and converts it back into a Ruby object.
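The round trip from Ruby object to JSON and back could be sketched like this. It is a guess at the general idea, not the actual CouchPotatoe::Persistence code: the module `SimplePersistence`, its methods, the `ruby_class` field and the `Talk` class are all invented for the example, and the actual storing in CouchDB is left out.

```ruby
require 'json'

# Hypothetical module for illustration; not the real CouchPotatoe code.
module SimplePersistence
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Rebuild an object from the JSON string of a stored document.
    def from_document_json(json)
      object = new
      JSON.parse(json).each do |name, value|
        next if name == 'ruby_class' # bookkeeping field, not an attribute

        object.instance_variable_set("@#{name}", value)
      end
      object
    end
  end

  # Serialize all instance variables into one JSON document.
  def to_document_json
    document = { 'ruby_class' => self.class.name }
    instance_variables.each do |ivar|
      document[ivar.to_s.delete_prefix('@')] = instance_variable_get(ivar)
    end
    JSON.generate(document)
  end
end

# Made-up example class.
class Talk
  include SimplePersistence
  attr_accessor :title, :tags
end
```

A `Talk` with a title and a list of tags survives `Talk.from_document_json(talk.to_document_json)` unchanged, including the nested list, which is exactly the part a flat record could not hold without a second table.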
The difference between a record and a document is that a record consists of a flat list of attributes while a document can have a deeper structure, e.g. a tree, which can have lists or subtrees and all that. Potatoe (as we long time CouchPotatoe devs say) helps you …. (damnit, I’m just reading on Wikipedia that “Potatoe” is actually considered incorrectly spelled and it should be “Potato” … oh well) … to build these more complex objects by providing a couple of helpers. The first is called property and lets you simply define an attribute of any value and type that then gets stored as is into Couch. The second is called has_many. Although that might sound familiar it’s not the same as the ActiveRecord version. With has_many you can choose to either store a list of sub objects inline in your document or as separate documents. In both cases everything is stored and retrieved automagically for you. So you can either sort of join multiple documents like you would in a relational database or you can create more complex documents. I’m planning to add more of those helpers as time and ideas progress (trees, inline versioned attributes… if you have any ideas please tell me).
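Here is a minimal sketch of how two such helpers might work. The helper names property and has_many come from the text above, but everything else (the module `DocumentDsl`, the method names, the `Post` and `Comment` classes) is an assumption made for illustration and not CouchPotatoe's actual API. It only covers the inline variant of has_many; storing the sub objects as separate documents is not shown.

```ruby
require 'json'

# Hypothetical DSL sketch; names and behaviour are assumptions.
module DocumentDsl
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def properties
      @properties ||= []
    end

    def inline_collections
      @inline_collections ||= []
    end

    # Declare a plain attribute that is stored as is in the document.
    def property(name)
      properties << name
      attr_accessor name
    end

    # Declare a list of sub objects that is embedded in the parent document.
    def has_many(name)
      inline_collections << name
      attr_writer name
      define_method(name) do
        instance_variable_get("@#{name}") || instance_variable_set("@#{name}", [])
      end
    end
  end

  # Build the nested hash that represents this object as a document.
  def to_document
    document = {}
    self.class.properties.each { |name| document[name.to_s] = public_send(name) }
    self.class.inline_collections.each do |name|
      document[name.to_s] = public_send(name).map(&:to_document)
    end
    document
  end

  def to_json_document
    JSON.generate(to_document)
  end
end

# Made-up example classes.
class Comment
  include DocumentDsl
  property :text
end

class Post
  include DocumentDsl
  property :title
  has_many :comments
end
```

With this, a `Post` that has a title and one `Comment` turns into a single document with the comment nested under a `comments` key. There is no join table and no foreign key, the whole tree is one JSON document.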
The second feature deals with retrieving stuff. It’s only a concept for now so nothing is implemented yet. During the workshop the idea came up that instead of providing some magic finder methods like in ActiveRecord to call and generate views, views should be handled as first class objects. I think I have only scratched the surface of what you can do with Couch views but I have heard people who know what they are talking about use the word crazy to describe it. So basically that sounds like an idea very well worth trying and I’ll implement something along the lines of mapping a Ruby class to a view and providing helpers to generate, update and query that view. Whatever it may look like.
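Since nothing is implemented yet, the following is purely a speculative sketch of what "a view as a first class object" could mean. The class `CouchView` and all its methods are invented here. The only part taken from CouchDB itself is the shape of a design document: an `_id` starting with `_design/` and a `views` entry that holds the JavaScript map function as a string. Updating and querying the view over HTTP is left out.

```ruby
require 'json'

# Hypothetical sketch: one Ruby object representing one CouchDB view.
class CouchView
  attr_reader :design, :name, :key

  def initialize(design:, name:, key:)
    @design = design
    @name = name
    @key = key
  end

  # JavaScript map function that emits the chosen attribute as the view key.
  # Documents where the attribute is missing (or falsy) are skipped.
  def map_function
    "function(doc) { if (doc.#{key}) { emit(doc.#{key}, null); } }"
  end

  # The design document that would be stored in the database.
  def design_document
    {
      '_id' => "_design/#{design}",
      'language' => 'javascript',
      'views' => { name => { 'map' => map_function } }
    }
  end

  def to_json_document
    JSON.generate(design_document)
  end
end
```

The point of the design choice is that a view becomes something you can create, inspect and pass around, e.g. `CouchView.new(design: 'talks', name: 'by_speaker', key: 'speaker')`, instead of being hidden behind a magic finder method.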
As soon as those basics are implemented I will put the project up on my GitHub account and hope to be able to raise a bit of interest. So be patient for a couple of weeks maybe. So long, I’ll be posting updates here when I have more to tell.