Liz Douglass

MongoDB


Recently I started on a project that is using some interesting technologies including Scala, MongoDB and Django. Some are quite new to me and I’ve learnt a great deal. Here are some observations of the things I’ve learnt.

MongoDB

Mongo is a schema-less document-oriented database that stores data in binary encoded JSON documents – BSON documents. The online documentation is quite good and there is a good tutorial and an online interactive shell.

How have we been using Mongo so far?

We have two projects that interact with Mongo: one is a RESTful API back-end and the other is a tool for populating a Mongo database with data from a MySQL database. Both projects are written in Scala and use the Mongo Java API.

Why Mongo?

– The JSON-like documents allow us to store data in a way that is easy to follow, because they read almost like plain English. We need to store information about the users of our system. In nearly all cases, the information available is different for each person: some people have no contact phone numbers, others have 6 children. Mongo has allowed us to build up profiles for our users and only include the pieces of information that we actually have available for them.

– The schema-less and denormalised nature of the database means that we can modify the structure frequently. We only started this project a couple of weeks ago and have already made several quite large changes. These include how we organise the Mongo documents that we’ve been generating from a legacy MySQL database. This sort of flexibility is fantastic, especially at the beginning of a new project.
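The first point can be sketched with plain Scala maps. This is only an illustration, not the project's actual code: the Profile type, its fields and the toDocument helper are all hypothetical, and a plain Map stands in for a Mongo document.

```scala
object ProfileSketch {
  // Hypothetical profile type: the optional fields may simply be absent.
  final case class Profile(
      firstName: String,
      surname: String,
      phone: Option[String],
      children: List[String])

  // Builds a document-like map that contains only the fields we have values for.
  def toDocument(p: Profile): Map[String, Any] = {
    val required = Map[String, Any]("firstName" -> p.firstName, "surname" -> p.surname)
    val phone = p.phone.map(number => "phone" -> number).toList
    val children = if (p.children.nonEmpty) List("children" -> p.children) else Nil
    required ++ phone ++ children
  }
}
```

A person with no phone number and no children ends up with just the two name fields, rather than a row full of nulls.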

Populating the Mongo database

Our data migration project extracts data for each record in the MySQL database using a SQL query. This data is then used to create a Person domain object, which is composed of microtypes like the one below. The Person type, as well as all of these types, implement the ConvertableToMongo trait:

class NextOfKin(val relationship: Relationship, val person: Person) extends ConvertableToMongo {
  def toMongoObject: DBObject = {
    new BasicDBObject(Map(
      "relationship" -> relationship.toMongoObject,
      "person" -> person.toMongoObject).asJava)
  }
}

where:

class Relationship(val description: String) extends ConvertableToMongo {
  def toMongoObject: DBObject = {
    new BasicDBObject(Map("relationship" -> description).asJava)
  }
}

ConvertableToMongo is a trait:

trait ConvertableToMongo {
  def toMongoObject: DBObject
}

Note that we need to use the asJava method from the scala-javautils library to convert the Scala map to the Java map required by the Mongo API.
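Newer versions of Scala ship a similar asJava conversion in the standard library. As a minimal sketch, assuming Scala 2.13 or later (this is not the scala-javautils API used above, and AsJavaSketch and toJavaMap are hypothetical names for illustration):

```scala
import scala.jdk.CollectionConverters._

object AsJavaSketch {
  // Wraps a Scala Map as a java.util.Map, the shape a Java API expects.
  def toJavaMap(fields: Map[String, AnyRef]): java.util.Map[String, AnyRef] =
    fields.asJava
}
```

The result is a wrapper around the original Scala map rather than a copy, so the conversion itself is cheap.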

The ConvertableToMongo trait has a single method that returns a Mongo DBObject. These are inserted into a Mongo collection like this:

val usersCollection = mongo.getCollection("user")
members.foreach(user => usersCollection.insert(user.toMongoObject))

The end result is a Mongo document like this one:

{
	"_id" : ObjectId("4c29f7fdbe924173a47a759f"),
	"firstName" : "Joe",
	"surname" : "Bloggs",
	"gender" : "Male",
	"nextOfKin" : {
		"relationship" : "Son",
		"person" : {
			"name" : "John Bloggs"
		}
	}
}

Note that unless one is specified, every document added to a Mongo collection will automatically be assigned an ObjectId under the key _id.


Written by lizdouglass

July 19, 2010 at 8:31 am


2 Responses


  1. Rather exciting to see that you guys are using Scala in production!

    When you decide to change the structure, are you migrating any existing data from the old mongo structure to the new one, or are you saved by “no production data yet?” If you are migrating, how are you going about it?

    davcamer

    October 11, 2010 at 7:49 pm

  2. Hi Dave, apologies for not replying sooner.

    You’re right in saying that we are somewhat saved by not being in production as yet. The structure of the Mongo documents has changed several times already and in all cases so far we’ve made alterations in the data migration project and then regenerated all the collections from the MySQL extracts. Obviously we won’t be doing this once we’re in production.

    We do have a whole other project that uses Quartz to regularly run maintenance-type operations, such as expiring lapsed subscriptions. This application connects to the same MongoDB instance as the main webapp. Some of the operations performed do alter whole sections of Mongo documents – e.g. the “subscription” block of a user's document. I can imagine that in some cases we may need to change the structure more gradually. In those cases I think it’s plausible to write a new document substructure into an attribute/field with a different name to that of the existing substructure. Any querying code would have to be able to handle documents having the new or the old field, or in some cases both the new and old fields. We could have a maintenance job to remove the old substructure for people that have both and thereby slowly phase out the old substructure.

    lizdouglass

    December 15, 2010 at 5:29 pm

