Tuesday 22 October 2013

Welcome

A Series of Blogs on MongoDB and Python

This is a blog about MongoDB and Python. I'm using them for a new project and I'll document what I learn as I go. MongoDB is a NoSQL database. The name is unfortunate, but there you go: NoSQL doesn't really mean "no SQL", it means non-relational - an alternative to the relational database model.

The MongoDB model stores data as collections of documents. There are no tables - one difference from the relational model. The definition of a document, while specific in the technical sense, is more general than the everyday English one. It needn't be a written document, such as a letter or article or blog entry (though it could be). A document can hold any set of attributes, and documents in the same collection needn't share the same attributes. Documents are stored as binary JSON objects (BSON), but we'll worry about that later.

For example, we might store details of people:

People
Name:Tom, Gender:Male
Name:Sally, Gender:Female

Now Tom and Sally are documents in the collection, People. So far, so like a relational table, except we can add more data such as:

Name:Jim, Age:25

Note, we don't store Jim's gender but we do store a new attribute - Age. This can be awkward in a relational table, where all the columns have to be defined at the start (or a new column has to be added to the whole table, and so to every row, each time one is needed, which is not very nice).
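To make that concrete, here is a rough sketch of the People collection using nothing but plain Python. It doesn't touch MongoDB at all; the list of dicts is just my stand-in for a collection, and the variable names are made up for illustration.

```python
# A plain-Python stand-in for the People collection (no MongoDB involved).
people = [
    {"Name": "Tom", "Gender": "Male"},
    {"Name": "Sally", "Gender": "Female"},
    {"Name": "Jim", "Age": 25},  # no Gender, but a new Age attribute
]

# Collect every attribute name used anywhere in the collection.
all_attributes = sorted({key for person in people for key in person})

# Only documents that actually carry an Age attribute are picked up;
# the others are not padded out with empty values.
names_with_age = [person["Name"] for person in people if "Age" in person]

print(all_attributes)
print(names_with_age)
```

Each entry carries only the attributes it needs, so adding Jim doesn't mean going back and changing Tom or Sally.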

Python has a data structure called a dictionary, which is essentially an associative array. Dictionary objects can be populated from JSON objects, and MongoDB queries in Python can return dictionary objects, so it all works very nicely together.
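As a small sketch of the JSON side of this, the standard library's json module turns a JSON object into a dictionary and back again. The JSON text below is made up for illustration (it just reuses Jim from the example above), and nothing here talks to MongoDB yet.

```python
import json

# Hypothetical JSON text for one document, reusing Jim from the People example.
jim_json = '{"Name": "Jim", "Age": 25}'

# A JSON object becomes a Python dictionary.
jim = json.loads(jim_json)

# Values are looked up by key, as in any associative array.
jim_age = jim["Age"]

# And the dictionary can be turned back into JSON text.
round_trip = json.dumps(jim, sort_keys=True)

print(type(jim).__name__, jim_age)
print(round_trip)
```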

In the next blog, I'll get all the software we need installed.

