Python: First Steps with MongoDB

I’m working my way through Kyle Banker’s MongoDB in Action. Much of the example code in the book is in Ruby. Although I’d love to learn more about Ruby, for the moment it makes more sense for me to follow along in Python.

MongoDB Installation

If you haven’t already installed MongoDB, now is the time to do it! On a Debian Linux system the installation is very simple.

$ sudo apt install mongodb

Python Package Installation

Next install PyMongo, the Python driver for MongoDB.

$ pip3 install pymongo

Check that the install was successful.

>>> import pymongo
>>> pymongo.version
'3.3.0'

Detailed documentation for PyMongo is available on the project’s documentation site.

Creating a Client

To start interacting with the MongoDB server we need to instantiate a MongoClient.

>>> client = pymongo.MongoClient()

This will connect to localhost using the default port. Alternative values for host and port can be specified.
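
As a rough illustration, a non-default host and port can be given either as two arguments or as a single connection URI. The sketch below only builds the URI string, so it runs without a server. The helper name mongo_uri and the host db.example.com are made up for this example.

```python
def mongo_uri(host="localhost", port=27017):
    """Build a basic MongoDB connection URI (no credentials or options)."""
    return "mongodb://{}:{}/".format(host, port)


# Hypothetical remote server; substitute your own host and port.
uri = mongo_uri("db.example.com", 27018)
print(uri)

# Either form can then be handed to PyMongo:
#   client = pymongo.MongoClient("db.example.com", 27018)
#   client = pymongo.MongoClient(uri)
```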

Connect to a Database

Next we connect to a particular database called test. If the database does not yet exist, it will be created once data are written to it.

>>> db = client.test

Create a Collection

A database will hold one or more collections of documents. We’ll create a users collection.

>>> users = db.users
>>> users
Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True),
'test'), 'users')

As mentioned in the documentation, MongoDB is lazy about the creation of databases and collections. Neither the database nor collection is actually created until data are written to them.

Working with Documents

As you would expect, MongoDB caters for the four basic CRUD operations.

Create

Documents are represented as dictionaries in Python. We’ll create a couple of light user profiles.

>>> smith = {"last_name": "Smith", "age": 30}
>>> jones = {"last_name": "Jones", "age": 40}

We use the insert_one() method to store each document in the collection.

>>> users.insert_one(smith)
<pymongo.results.InsertOneResult object at 0x7f57d36d9678>

Each document is allocated a unique identifier which can be accessed via the inserted_id attribute.

>>> jones_id = users.insert_one(jones).inserted_id
>>> jones_id
ObjectId('57ea4adfad4b2a1378640b42')

Although these identifiers look pretty random, there is actually a well-defined structure. The first 8 hexadecimal characters (4 bytes) are a timestamp, followed by a 6-character machine identifier, then a 4-character process identifier and finally a 6-character counter.
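
To make that layout concrete, here is a small sketch that pulls the four fields out of the hexadecimal string using only the standard library. It assumes the layout described above; the function name split_object_id is my own invention, not part of PyMongo.

```python
from datetime import datetime, timezone


def split_object_id(hex_id):
    """Split a 24-character ObjectId hex string into its four fields."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId has 24 hexadecimal characters")
    return {
        # First 4 bytes: seconds since the Unix epoch.
        "created": datetime.fromtimestamp(int(hex_id[:8], 16), tz=timezone.utc),
        # Next 3 bytes: machine identifier, kept as hex.
        "machine": hex_id[8:14],
        # Next 2 bytes: process identifier.
        "process": int(hex_id[14:18], 16),
        # Last 3 bytes: counter.
        "counter": int(hex_id[18:], 16),
    }


parts = split_object_id("57ea4adfad4b2a1378640b42")
print(parts["created"].isoformat())  # 2016-09-27T10:33:03+00:00
```

PyMongo exposes the same timestamp directly through the generation_time attribute of an ObjectId, so in practice there is no need to parse the string by hand.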

We can verify that the collection has been created.

>>> db.collection_names()
['users', 'system.indexes']

There’s also an insert_many() method which can be used to simultaneously insert multiple documents.
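
A minimal sketch of how that might look is below. The helper add_users and the sample profiles are made up for illustration; the function takes the collection as an argument, so nothing here talks to a server until you call it with a real one.

```python
def add_users(collection, profiles):
    """Insert several documents in one call and return their identifiers."""
    result = collection.insert_many(profiles)
    # InsertManyResult.inserted_ids lists the _id values in insertion order.
    return result.inserted_ids


new_profiles = [
    {"last_name": "Brown", "age": 25},  # made-up sample data
    {"last_name": "Green", "age": 35},
]

# With a running server and the users collection from earlier:
#   add_users(users, new_profiles)
```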

Read

The find_one() method can be used to search the collection. As its name implies, it returns only a single document.

>>> users.find_one({"last_name": "Smith"})
{'_id': ObjectId('57ea4acfad4b2a1378640b41'), 'age': 30, 'last_name': 'Smith'}
>>> users.find_one({"_id": jones_id})
{'_id': ObjectId('57ea4adfad4b2a1378640b42'), 'age': 40, 'last_name': 'Jones'}

A more general query can be made using the find() method which, rather than returning a document, returns a cursor which can be used to iterate over the results. With our minimal collection this doesn’t seem very useful, but a cursor really comes into its own with a massive collection.

>>> users.find({"last_name": "Smith"})
<pymongo.cursor.Cursor object at 0x7f57d77fe3c8>
>>> users.find({"age": {"$gt": 20}})
<pymongo.cursor.Cursor object at 0x7f57d77fe8d0>

A cursor is an iterable and can be used to neatly access the query results.

>>> cursor = users.find({"age": {"$gt": 20}})
>>> for user in cursor:
...     user["last_name"]
... 
'Smith'
'Jones'

Operations like count() and sort() can be applied to the results returned by find().
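
As a sketch of chaining these operations, the hypothetical helper below sorts the matches by age and caps the number returned. The name oldest_users and its parameters are my own; sort() and limit() are the cursor methods.

```python
DESCENDING = -1  # same value as pymongo.DESCENDING


def oldest_users(collection, min_age, limit=10):
    """Return users older than min_age, oldest first."""
    cursor = collection.find({"age": {"$gt": min_age}})
    # sort() and limit() return the cursor, so the calls can be chained.
    return list(cursor.sort("age", DESCENDING).limit(limit))


# With a running server and the users collection from earlier:
#   oldest_users(users, 20)
```

In the PyMongo 3.3 release used here, calling count() on the cursor gives the number of matches. Later releases replace it with count_documents() on the collection.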

Update

The update() method is used to modify existing documents. It takes two documents as arguments: the first is a filter which matches the documents to which the change is to be applied, and the second gives the details of the change.

>>> users.update({"last_name": "Smith"}, {"$set": {"city": "Durban"}})
{'updatedExisting': True, 'nModified': 1, 'n': 1, 'ok': 1}

The example above uses the $set modifier. There are a number of other modifiers available like $inc, $mul, $rename and $unset.

By default the update is only applied to the first matching record. The change can be applied to all matching records by specifying multi=True.
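
PyMongo 3 also provides update_one() and update_many(), which make the single-versus-all choice explicit and are the recommended replacements for the deprecated update(). Here is a minimal sketch; the helper set_city and its every_match flag are names I made up.

```python
def set_city(collection, last_name, city, every_match=False):
    """Set the city field on the first (or every) matching user."""
    query = {"last_name": last_name}
    change = {"$set": {"city": city}}
    if every_match:
        result = collection.update_many(query, change)
    else:
        result = collection.update_one(query, change)
    # UpdateResult reports how many documents matched and were changed.
    return result.matched_count, result.modified_count


# With a running server and the users collection from earlier:
#   set_city(users, "Smith", "Durban")
```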

Delete

Records are deleted with the remove() method, which takes a filter document specifying the records to delete.

>>> users.remove({"age": {"$gte": 40}})
{'n': 1, 'ok': 1}
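
As with updates, PyMongo 3 offers delete_one() and delete_many() as replacements for the deprecated remove(). A small sketch follows; the helper name delete_users_aged_at_least is hypothetical.

```python
def delete_users_aged_at_least(collection, age):
    """Delete every user whose age is >= age; return how many were removed."""
    result = collection.delete_many({"age": {"$gte": age}})
    # DeleteResult.deleted_count is the number of documents removed.
    return result.deleted_count


# With a running server and the users collection from earlier:
#   delete_users_aged_at_least(users, 40)
```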

Conclusion

Well, those are the basic operations. Nothing too scary. I’ll be back with the Python implementation of the Twitter archival sample application.

MongoDB: Installing on Windows 7

It’s not my personal choice, but I have to spend a lot of my time working under Windows. Installing MongoDB under Ubuntu is a snap. Getting it going under Windows seems to require jumping through a few more hoops. Here are my notes. I hope that somebody will find them useful.

  1. Download the installation. This will be an MSI installer package with a name like mongodb-win32-x86_64-2008plus-ssl-3.2.0-signed.msi.
  2. Run the installer with a deft double-click.
  3. Accept the License Agreement.
  4. Select the Complete installation type and click Install.
  5. Briefly browse YouTube (you won’t have time to make a cup of coffee).
  6. When the installation is complete press Finish.
  7. Reboot your machine. This might not be entirely necessary, but my Windows experience tells me that it is never a bad idea.
  8. Create a folder for the data files. By default this will be C:\data\db.
  9. Create a folder for the log files. By default this will be C:\data\log.
  10. Open a command prompt, change the working directory to C:\Program Files\MongoDB\Server\3.2\bin and start the database server, mongod.exe.
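
If you would rather script the folder creation from steps 8 and 9, something like the sketch below should do it. The function name make_mongo_dirs is my own, and it assumes both folders live under one root such as C:\data.

```python
from pathlib import Path


def make_mongo_dirs(root):
    """Create the db and log folders under root if they are missing."""
    folders = [Path(root) / "db", Path(root) / "log"]
    for folder in folders:
        # parents=True also creates root itself; exist_ok avoids errors on reruns.
        folder.mkdir(parents=True, exist_ok=True)
    return folders


# On Windows, with the default locations from the steps above:
#   make_mongo_dirs(r"C:\data")
```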

At this stage you should be ready to roll. Open another command prompt and start the database client, mongo.exe, which you’ll find in the same folder as mongod.exe.
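
A quick way to check from Python that the server is accepting connections is to try opening a TCP socket to it. This is only a sketch: the helper server_is_listening is a made-up name, it assumes the default port 27017, and a successful connection shows that something is listening there, not that it is definitely MongoDB.

```python
import socket


def server_is_listening(host="localhost", port=27017, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


print(server_is_listening())
```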

To make your installation a little more robust, you can also do the following:

  1. Create a configuration file at C:\Program Files\MongoDB\Server\3.2\mongod.cfg. For starters you could enter the following configuration directives:
    systemLog:
        destination: file
        path: c:\data\log\mongod.log
    storage:
        dbPath: c:\data\db
    
  2. Install MongoDB as a service by running
    mongod.exe --config "C:\Program Files\MongoDB\Server\3.2\mongod.cfg" --install
    
  3. The service can then be launched with
    net start MongoDB
    

    And stopping the service is as simple as

    net stop MongoDB