Monday, 2 December 2013

Python for MongoDB

This isn't a tutorial on programming Python. If you're new to the language, there are plenty of online resources to get you going. Many are listed here. I've already covered installing Python and the MongoDB package PyMongo, so what are we doing here? Well, two things:
  • A quick look at the bits of Python that are useful when working with MongoDB
  • Some examples of connecting to and manipulating a MongoDB database from Python

JSON, dict and find()

MongoDB stores data in a binary JSON format (BSON). JSON, if you haven't seen it, is a format for storing data in a human readable form. There is a nice quick introduction to it here. You don't need any particular software to use it, though it is supported in pretty much any programming language you might pick. JSON data is built from two structures: objects, which are sets of name:value pairs, and arrays, which are ordered lists of values. The value part of a name:value pair can itself be an array or an object, and an array can contain objects. Some examples will clarify:

Here we define a single object and its properties:

{"Name":"Tom","Animal":"Cat","Activities":["Chasing Mice","Exploding"]}

The object is enclosed in curly braces {}, the names and string values are enclosed in double quotes "" and the array is enclosed in square brackets []. Values can be objects themselves:

{"Cartoon":"Roadrunner", "Main Character":{"Name":"Roadrunner","Speed":"Fast"}}

Values can also be arrays of objects:

{"Cartoon":"Roadrunner", "Character":[{"Name":"Roadrunner","Speed":"Fast"},{"Name":"Wiley Coyote","Speed":"Slow"}]}

JSON supports a number of data types, not just strings as used above. They are string, number and boolean. Objects and arrays are also data types, and there is a null type, which means the value is empty.

Okay, so what? Well, Python has a data structure known as a dictionary, which pretty much mimics the JSON format.
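To make the correspondence concrete, here is a minimal sketch using Python's standard json module (nothing in the rest of this post depends on it). json.loads() turns JSON text into dictionaries and lists, and json.dumps() goes the other way. The "Alive" and "Owner" fields are made up here, purely to show the boolean and null types.

```python
import json

# JSON text: note that true and null are JSON literals, not Python ones
text = '{"Name":"Tom","Activities":["Chasing Mice","Exploding"],"Alive":true,"Owner":null}'

parsed = json.loads(text)            # JSON object -> Python dict

print(type(parsed).__name__)         # the JSON object became a dict
print(parsed["Activities"][1])       # the JSON array became a list
print(parsed["Alive"], parsed["Owner"])  # true -> True, null -> None

# And back again: Python dict -> JSON text
as_text = json.dumps({"Cartoon": "Roadrunner", "Speed": "Fast"})
print(as_text)
```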

Let's switch to Python now. Run your chosen Python program (I suggested PythonWin for Windows, and that is what I'll use here). Most offer an interactive window for testing short bits of code and that is what we will use.

You can define a dictionary like this:

toon = {"Name":"Tom","Animal":"Cat","Activities":["Chasing Mice","Exploding"]}

There is also the dict() constructor, which takes a list of (key, value) pairs:

toon = dict([("Name","Tom"),("Animal","Cat"),("Activities",["Chasing Mice","Exploding"])])

And you can extract named elements like this:

toon["Name"]

which would return "Tom". There are other useful things you can do with a dictionary - some are explained here.
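As a rough illustration of those other useful things, here are a few dictionary operations that tend to come up when handling MongoDB documents. This is only a sketch: the "Speed" and "Enemy" keys are invented for the example.

```python
toon = {"Name": "Tom", "Animal": "Cat", "Activities": ["Chasing Mice", "Exploding"]}

# get() returns a default instead of raising KeyError when a key is missing
speed = toon.get("Speed", "Unknown")

# 'in' tests whether a key is present
has_animal = "Animal" in toon

# Add or change an entry by assigning to its key
toon["Enemy"] = "Jerry"

# Loop over every key and its value
for key, value in toon.items():
    print(key, "=", value)
```

Using get() with a default is handy because, as we'll see, documents in the same collection don't have to share the same keys.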

Finally, we stay with Python, but start using PyMongo to interact with the database. First, we need to import the PyMongo library and get a connection to the database:


import pymongo                   # Import the pymongo library
from pymongo import MongoClient  # and the MongoClient class
client = MongoClient()           # Get a handle on the Mongo client - see note below
db = client.db_name              # Where db_name is the name of the database you want
col = db.colname                 # Where colname is the collection you want to work with

Note - when you get the client, you might need to include authorisation details as follows (here kms is the name of the database):

client = MongoClient(databaseURL)
client.kms.authenticate(username,password)
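The databaseURL above is normally a mongodb:// connection string, and the credentials can be carried in that string instead. PyMongo 4.0 and later dropped authenticate(), so there the credentials have to be supplied when the client is created. Here is a small sketch of building such a string with the standard library. The user name, password and host are made up for illustration, and the password is deliberately awkward to show why escaping matters.

```python
from urllib.parse import quote_plus

# Hypothetical credentials and host, for illustration only
username = "tom"
password = "p@ss:word/1"
host = "localhost:27017"
db_name = "kms"

# Escape characters such as @ : / that would otherwise confuse the URI parser
database_url = "mongodb://%s:%s@%s/%s" % (
    quote_plus(username), quote_plus(password), host, db_name)

print(database_url)
```

The resulting string is what you would pass to MongoClient() in place of databaseURL.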

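We can insert our toon dictionary like this (PyMongo 3.0 and later also offer insert_one() for the same job, and PyMongo 4 removed insert() altogether):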

col.insert(toon)

and search it like this:

col.find({"Name":"Tom"})
col.find({"Activities" : {"$in": ["Exploding"]}})

The find() method returns a cursor which you must iterate through to access each of the documents that a search produces (even if there is only one). This is easy to do:

for d in col.find():
    print(d)    # Or do something else with d

So d is a variable which takes the value of each document returned by the find() in turn. Looking at the find() examples, you should note that the key names ("Name" for example) are quoted strings, as is the $in operator. The mongo shell lets you leave those quotes off, but Python does not.
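If you want a feel for what those two queries match without a database to hand, here is a pure-Python approximation. It is not PyMongo and not how MongoDB works internally. The matches() function is a hypothetical helper covering only plain equality and $in, and the Jerry document is invented so that there is something that doesn't match.

```python
def matches(doc, query):
    """Return True if doc satisfies a simple find()-style query."""
    for key, condition in query.items():
        if key not in doc:
            return False
        value = doc[key]
        # Treat a non-list field as a one-element list so both cases share code
        values = value if isinstance(value, list) else [value]
        if isinstance(condition, dict) and "$in" in condition:
            # $in: at least one of the field's values appears in the given list
            if not any(v in condition["$in"] for v in values):
                return False
        else:
            # Plain equality: the field equals the value, or is a list containing it
            if value != condition and condition not in values:
                return False
    return True

toons = [
    {"Name": "Tom", "Animal": "Cat", "Activities": ["Chasing Mice", "Exploding"]},
    {"Name": "Jerry", "Animal": "Mouse", "Activities": ["Eating Cheese"]},
]

by_name = [d for d in toons if matches(d, {"Name": "Tom"})]
by_activity = [d for d in toons if matches(d, {"Activities": {"$in": ["Exploding"]}})]

for d in by_activity:
    print(d["Name"])
```

Both queries pick out Tom only. Real MongoDB matching has many more operators and rules than this, so treat it as a reading aid rather than a replacement.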


Thursday, 24 October 2013

Installing the Software

A Sandpit

Let's build a local sandpit first. You can run MongoDB and Python on an Apache web server to drive web content, but we'll come to that later. Step one is something to play with on your own PC. We will need:

  • A MongoDB server and client
  • The ability to run and develop Python scripts
  • A connection between Python and MongoDB
So let's install three things.

MongoDB

Installing MongoDB is easy. I'm doing it on a Windows PC, but there are versions for other operating systems too. Download yours here. Install instructions are given, but it just involves putting the executables in a directory of your choice.

There are two files of immediate interest: mongod, which is the database daemon, and mongo, which provides a client shell. These are best run from the command line, so if you are using Windows, open two command windows using the cmd program. Run mongod in one and leave it running. In the other, run mongo and you will see a shell into which you can type commands. The first thing I'd suggest you try is typing

> tutorial

This runs a nice introductory tutorial.

More on what else to do with your shiny new DB later ...

Python

Once again, there are a number of OS choices for downloading Python. You can see them all here. Pick the right one for your OS (make a note of which you choose, including the 32 or 64 bit choice).

As a Windows user, I also chose PythonWin from here. This provides a simple environment for editing and running Python scripts and an interactive window for typing Python commands. Make sure you download the right one for the version of Python you installed (including 32 or 64 bit).

Now all we need is ...

PyMongo

PyMongo allows your Python scripts to talk to your MongoDB database. You can get it here. Got it? Cool.

Python has a nice method for installing packages. There is a script called easy_install (you'll find it in the Python\Scripts folder). In a command window, go to the Python scripts folder and run

easy_install pymongo

Then start your Python editor (PythonWin, for example).
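A quick way to confirm the install worked is to ask Python whether it can find the package. This is just a sketch using the standard library, and it assumes Python 3.4 or later, where importlib.util.find_spec() is available. On older versions, simply typing import pymongo in the interactive window and checking for an error does the same job.

```python
import importlib.util

# find_spec() returns None when the named package cannot be found
pymongo_installed = importlib.util.find_spec("pymongo") is not None

if pymongo_installed:
    print("PyMongo is installed")
else:
    print("PyMongo was not found - check the easy_install step")
```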

The next three blogs look at first steps in MongoDB, Python and PyMongo.



Tuesday, 22 October 2013

Welcome

A Series of Blogs on MongoDB and Python

This is a blog about MongoDB and Python. I'm using them for a new project and I'll document what I learn as I go. MongoDB is a NoSQL database. An unfortunate set of naming decisions, but there you go. NoSQL doesn't even mean No SQL; it means not relational - an alternative to relational database models.

The MongoDB model involves storing data as collections of documents. There are no tables - one difference from the relational model. The definition of a document, while specific in the technical sense, is more general than the English language definition. It needn't be a written document, such as a letter or article or blog entry (though it could be). A document can contain data for any attributes, and each document in a collection can store data about different attributes from other documents in the same collection. Documents are stored as binary JSON objects (BSON), but we'll worry about that later.

For example, we might store details of people:

People
Name:Tom, Gender:Male
Name:Sally, Gender:Female

Now Tom and Sally are documents in the collection, People. So far, so like a relational table, except we can add more data such as:

Name:Jim, Age:25

Note, we don't store Jim's gender but we do store a new attribute - Age. This can be difficult in a relational table where we need to define all the fields at the start (or keep adding fields to every row each time a new one is needed, which is not very nice).

Python has a data structure called a dictionary, which is essentially an associative array. Dictionary objects can be populated from JSON objects, and MongoDB queries in Python can return dictionary objects, so it all works very nicely together.
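As a small taste of how that fits together, here is the People example as plain Python dictionaries, with Jim populated from JSON text using the standard json module. No database is involved yet, and the "not recorded" default is just a placeholder chosen for this sketch.

```python
import json

# Each document is a dictionary; they need not share the same keys
people = [
    {"Name": "Tom", "Gender": "Male"},
    {"Name": "Sally", "Gender": "Female"},
    json.loads('{"Name": "Jim", "Age": 25}'),   # populated from JSON text
]

for person in people:
    # get() copes with attributes a document does not have
    gender = person.get("Gender", "not recorded")
    age = person.get("Age", "not recorded")
    print(person["Name"], gender, age)
```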

In the next blog, I'll get all the software we need installed.