Monday, 2 December 2013

Python for MongoDB

This isn't a tutorial on programming Python. If you're new to the language, there are plenty of online resources to get you going. Many are listed here. I've already covered installing Python and the MongoDB package PyMongo, so what are we doing here? Well, two things:
  • A quick look at the bits of Python that are useful when working with MongoDB
  • Some examples of connecting to and manipulating a MongoDB database from Python

JSON, dict and find()

MongoDB stores data in a binary JSON format (BSON). JSON, if you haven't seen it, is a format for storing data in a human-readable form. There is a nice quick introduction to it here. You don't need any particular software to use it, and it is supported in pretty much any programming language you might pick. JSON data is built from two structures: objects, which are sets of name:value pairs, and arrays, which are ordered lists of values. A value can itself be an object or an array, so the two can be nested inside each other. Some examples will clarify:

Here we define a single object and its properties:

{"Name":"Tom","Animal":"Cat","Activities":["Chasing Mice","Exploding"]}

The object is enclosed in curly braces {}, the names and string values are enclosed in double quotes "", each name is separated from its value by a colon, and the array is enclosed in square brackets []. Values can be objects themselves:

{"Cartoon":"Roadrunner", "Main Character":{"Name":"Roadrunner","Speed":"Fast"}}

Values can also be arrays of objects:

{"Cartoon":"Roadrunner", "Character":[{"Name":"Roadrunner","Speed":"Fast"},{"Name":"Wile E. Coyote","Speed":"Slow"}]}

JSON supports several data types, not just the strings used above. The simple types are string, number and boolean. Objects and arrays are also data types, and there is a null type, which means the value is empty.
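To see these types in action, here is a small sketch using Python's standard json module. The "Episodes", "Colour" and "Retired" fields and their values are made up purely to show a number, a boolean and a null; they are not part of the examples above.

```python
import json

# A JSON document as text. "Episodes", "Colour" and "Retired" are
# invented fields, added only to illustrate number, boolean and null.
text = '''
{"Cartoon": "Roadrunner",
 "Episodes": 48,
 "Colour": true,
 "Retired": null,
 "Character": [{"Name": "Roadrunner", "Speed": "Fast"},
               {"Name": "Wile E. Coyote", "Speed": "Slow"}]}
'''

data = json.loads(text)        # parse the JSON text into Python objects

# Each JSON type arrives as a built-in Python type.
for key, value in data.items():
    print(key, type(value).__name__)

round_trip = json.dumps(data)  # and back to a JSON string
```

Note that the JSON object comes back as a Python dict and the JSON array as a Python list, which is why the two formats feel so similar.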

Okay, so what? Well, Python has a data structure known as a dictionary, which pretty much mimics the JSON format.

Let's switch to Python now. Run your chosen Python program (I suggested PythonWin for Windows, and that is what I'll use here). Most offer an interactive window for testing short bits of code and that is what we will use.

You can define a dictionary like this:

toon = {"Name":"Tom","Animal":"Cat","Activities":["Chasing Mice","Exploding"]}

There is also the dict() constructor, which takes a list of (key, value) pairs:

toon = dict([("Name","Tom"),("Animal","Cat"),("Activities",["Chasing Mice","Exploding"])])

And you can extract named elements like this:

toon["Name"]

which would return "Tom". There are other useful things you can do with a dictionary - some are explained here.
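A few of those other dictionary operations are sketched below. This is just a sampler rather than a complete list, and the "Nemesis" and "Owner" keys are invented for the example.

```python
toon = {"Name": "Tom", "Animal": "Cat",
        "Activities": ["Chasing Mice", "Exploding"]}

toon["Nemesis"] = "Jerry"               # add (or overwrite) a key
animal = toon.get("Animal")             # same result as toon["Animal"]
owner = toon.get("Owner", "unknown")    # get() returns a default instead of raising KeyError
has_name = "Name" in toon               # test whether a key exists
toon["Activities"].append("Napping")    # list values can be changed in place

# Loop over every key and its value.
for key, value in toon.items():
    print(key, "->", value)
```

The get() form is handy with MongoDB documents, because documents in the same collection are not guaranteed to have the same fields.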

Finally, we stay with Python, but start using PyMongo to interact with the database. First, we need to import the PyMongo library and get a connection to the database:


import pymongo                    # Import the pymongo library
from pymongo import MongoClient   # and the MongoClient class
client = MongoClient()            # Get a handle on the Mongo client - see note below
db = client.db_name               # Where db_name is the name of the database you want
col = db.colname                  # Where colname is the collection you want to work with

Note - when you get the client, you might need to include authorisation details as follows:

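client = MongoClient(databaseURL)                 # databaseURL is your server's connection string
client.db_name.authenticate(username, password)   # PyMongo 3 and earlier only

In PyMongo 4 and later, authenticate() has been removed. Pass the credentials to MongoClient instead, for example MongoClient(databaseURL, username=username, password=password).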
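Another option is to put the credentials in the connection string itself. The sketch below builds such a string using only the standard library; build_mongo_uri is a hypothetical helper written for this post, and the user name, password, host and database name are placeholders. The user name and password are percent-escaped because characters such as @ and : have a special meaning in a URI.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host="localhost", port=27017,
                    db_name="test"):
    """Return a mongodb:// URI with the credentials percent-escaped.

    Hypothetical helper for illustration; not part of PyMongo.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return "mongodb://%s:%s@%s:%d/%s" % (user, pwd, host, port, db_name)


uri = build_mongo_uri("tom", "c@t:nip")   # placeholder credentials
print(uri)

# With a server running, you could then connect with:
#   from pymongo import MongoClient
#   client = MongoClient(uri)
```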

We can insert our toon dictionary like this:

col.insert_one(toon)    # On PyMongo 2 this was col.insert(toon); insert() was removed in PyMongo 4

and search it like this:

col.find({"Name":"Tom"})
col.find({"Activities" : {"$in": ["Exploding"]}})
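If the $in query looks odd, it may help to see roughly what it asks for. The function below is a simplified, pure-Python imitation of the matching rule, not how MongoDB or PyMongo implements it: matches_in is a hypothetical name, the Jerry document is invented, and edge cases such as missing fields matching null are ignored.

```python
def matches_in(doc, field, candidates):
    """Roughly mimic {field: {"$in": candidates}} for a single document."""
    if field not in doc:
        return False
    value = doc[field]
    if isinstance(value, list):
        # Array field: a match if any element appears in candidates.
        return any(item in candidates for item in value)
    # Plain field: a match if the value itself appears in candidates.
    return value in candidates


docs = [
    {"Name": "Tom", "Animal": "Cat",
     "Activities": ["Chasing Mice", "Exploding"]},
    {"Name": "Jerry", "Animal": "Mouse",        # invented second document
     "Activities": ["Eating Cheese"]},
]

hits = [d["Name"] for d in docs
        if matches_in(d, "Activities", ["Exploding"])]
print(hits)
```

So the query finds every document whose Activities array contains at least one of the listed values.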

The find() method returns a cursor which you must iterate through to access each of the documents that a search produces (even if there is only one). This is easy to do:

for d in col.find():
    print(d)    # Or do something else with d

So d is a variable which takes the value of each document returned by the find() in turn. Looking at the find() examples, you should note that the key names ("Name" for example) are in double quotes, as is the $in.
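Because each document arrives as an ordinary dictionary, the loop body can use all the usual dictionary tools. Here is a small sketch: collect_names is a hypothetical helper, and a plain list of dictionaries stands in for the cursor so the code runs without a database.

```python
def collect_names(docs):
    """Return the "Name" of each document in any iterable of documents."""
    names = []
    for d in docs:
        # get() gives None rather than an error if a document has no "Name".
        names.append(d.get("Name"))
    return names


# A plain list standing in for the cursor returned by col.find().
sample = [
    {"Name": "Tom", "Animal": "Cat"},
    {"Animal": "Mouse"},                 # a document without a "Name"
]

print(collect_names(sample))

# With a live collection, the same function takes the cursor directly:
#   collect_names(col.find({"Animal": "Cat"}))
```

Bear in mind that a cursor is used up as you iterate over it. If you need the results more than once, copy them into a list first with list(col.find()).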

