MongoDB + Python Norberto Leite Technical Evangelist norberto@mongodb.com
Agenda
• Introduction to MongoDB
• pymongo
• CRUD
• Aggregation
• GridFS
• Indexes
• ODMs
Ola, I'm Norberto Norberto Leite Technical Evangelist Madrid, Spain @nleite norberto@mongodb.com http://www.mongodb.com/norberto
MongoDB
MongoDB GENERAL PURPOSE DOCUMENT DATABASE OPEN-SOURCE
Fully Featured
MongoDB Features
• JSON Document Model with Dynamic Schemas
• Auto-Sharding for Horizontal Scalability
• Text Search
• Aggregation Framework and MapReduce
• Full, Flexible Index Support and Rich Queries
• Built-In Replication for High Availability
• Advanced Security
• Large Media Storage with GridFS
MongoDB Inc.
• 400+ employees
• 2,000+ customers
• Over $311 million in funding
• 13 offices around the world
THE LARGEST ECOSYSTEM
• 9,000,000+ MongoDB Downloads
• 250,000+ Online Education Registrants
• 35,000+ MongoDB User Group Members
• 35,000+ MongoDB Management Service (MMS) Users
• 750+ Technology and Services Partners
• 2,000+ Customers Across All Industries
pymongo
pymongo
• Official MongoDB Python driver
• Rockstar developer team
  • Jesse Jiryu Davis, Bernie Hackett
• One of the oldest and best-maintained drivers
• Python and MongoDB are a natural fit
  • BSON is very similar to dictionaries
  • (everyone likes dictionaries)
• http://api.mongodb.org/python/current/
• https://github.com/mongodb/mongo-python-driver
pymongo 2.8
• Support for upcoming MongoDB 3.0
  • New get collections and get indexes commands for WiredTiger
• Backward compatible w/ 2.6
• Future releases of pymongo (3.0)
  • Server discovery spec
  • Monitoring spec
  • Faster client startup when connecting to Replica Set
  • Faster failover
  • More robust replica set connections
  • API clean up
Connecting
Connecting

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()  # client instance
Connecting

    #!/usr/bin/env python
    from pymongo import MongoClient

    uri = 'mongodb://127.0.0.1'
    mc = MongoClient(uri)
Connecting

    #!/usr/bin/env python
    from pymongo import MongoClient

    uri = 'mongodb://127.0.0.1'
    # max_pool_size is the pymongo 2.x keyword for the connection pool limit
    mc = MongoClient(host=uri, max_pool_size=10)
Connecting to Replica Set

    #!/usr/bin/env python
    from pymongo import MongoClient

    # options follow a "/?" separator after the host list
    uri = 'mongodb://127.0.0.1/?replicaSet=MYREPLICA'
    mc = MongoClient(uri)
Connecting to Replica Set

    #!/usr/bin/env python
    from pymongo import MongoClient

    uri = 'mongodb://127.0.0.1'
    mc = MongoClient(host=uri, replicaSet='MYREPLICA')
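The two replica-set slides differ only in where the `replicaSet` option travels: inside the URI or as a keyword argument. As a minimal sketch of the URI form, using only the standard library, the snippet below assembles such a string. The helper name `build_mongo_uri` is hypothetical and is not part of pymongo.

```python
from urllib.parse import urlencode


def build_mongo_uri(hosts, options=None):
    """Join hosts and options into a mongodb:// URI string.

    Hypothetical helper for illustration only; pymongo does not ship it.
    """
    uri = 'mongodb://' + ','.join(hosts)
    if options:
        # Options follow a "/?" separator, encoded like a URL query string.
        uri += '/?' + urlencode(options)
    return uri


uri = build_mongo_uri(['127.0.0.1'], {'replicaSet': 'MYREPLICA'})
print(uri)  # mongodb://127.0.0.1/?replicaSet=MYREPLICA
```

The resulting string can be handed to `MongoClient(uri)` exactly as on the slide.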
Database Instance

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()

    db = mc['zurich_pug']  # database instance
    # or
    db = mc.zurich_pug
Collection Instance

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()

    coll = mc['zurich_pug']['testcollection']  # collection instance
    # or
    coll = mc.zurich_pug.testcollection
CRUD http://www.ferdychristant.com/blog//resources/Web/$FILE/crud.jpg
Operations
• Insert
• Remove
• Update
• Query
• Aggregate
• Create Indexes
• …
CRUD
• Insert
• Remove
• Update
• Query
• Aggregate
• Create Indexes
• …
Insert

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    coll.insert({'field_one': 'some value'})
Find

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    # find() returns a cursor; iterate to fetch the matching documents
    cur = coll.find({'field_one': 'some value'})
    for d in cur:
        print(d)
Update

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    # first argument selects the document, second describes the change
    result = coll.update(
        {'field_one': 'some value'},
        {"$set": {'field_one': 'new_value'}}
    )
    print(result)
Remove

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    result = coll.remove({'field_one': 'some value'})
    print(result)
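The `$set` operator on the Update slide changes only the named fields and leaves the rest of the document alone. Here is a rough in-memory sketch of that behavior on plain dictionaries. The function `apply_set` is hypothetical and handles top-level fields only; on a real deployment the server applies the update.

```python
import copy


def apply_set(document, update):
    """Return a copy of document with the update's "$set" fields applied.

    Illustration only: top-level fields, no dotted paths, no other operators.
    """
    result = copy.deepcopy(document)
    for field, value in update.get('$set', {}).items():
        result[field] = value
    return result


doc = {'_id': 1, 'field_one': 'some value', 'other': 42}
updated = apply_set(doc, {'$set': {'field_one': 'new_value'}})
print(updated)
```

Note that in pymongo 2.x `update` modifies only the first matching document unless `multi=True` is passed.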
Aggregate http://4.bp.blogspot.com/-0IT3rIJkAtM/Uud2pTrGCbI/AAAAAAAABZM/-XUK7j4ZHmI/s1600/snowflakes.jpg
Aggregation Framework
• Analytical workload solution
• Pipeline processing
• Several Stages
  • $match
  • $group
  • $project
  • $unwind
  • $sort
  • $limit
  • $skip
  • $out
• http://docs.mongodb.org/manual/aggregation/
Aggregation Framework

    #!/usr/bin/env python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    # Under pymongo 2.x, pass cursor={} to aggregate() to get a cursor
    # back instead of a result dict
    cur = coll.aggregate([
        {"$match": {'field_one': {"$exists": True}}},
        {"$project": {"new_label": "$field_one"}}
    ])

    for d in cur:
        print(d)
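To make the pipeline idea concrete, here is a small in-memory sketch of what the `$match` and `$project` stages on the slide do to a list of dictionaries. It only mimics the semantics of those two stages; the function names `match_exists` and `project_rename` are hypothetical, and the server's implementation is entirely different.

```python
def match_exists(docs, field):
    """Mimic {"$match": {field: {"$exists": True}}}."""
    return (d for d in docs if field in d)


def project_rename(docs, new_label, source_field):
    """Mimic {"$project": {new_label: "$source_field"}}; _id is kept by default."""
    for d in docs:
        out = {}
        if '_id' in d:
            out['_id'] = d['_id']
        out[new_label] = d[source_field]
        yield out


docs = [
    {'_id': 1, 'field_one': 'some value'},
    {'_id': 2, 'other_field': 'ignored'},
    {'_id': 3, 'field_one': 'new_value'},
]

# Each stage consumes the output of the previous one, like a pipeline.
results = list(
    project_rename(match_exists(docs, 'field_one'), 'new_label', 'field_one')
)
for d in results:
    print(d)
```

Document 2 is dropped by the match stage, and the two survivors come out with `field_one` relabelled as `new_label`.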
GridFS http://www.appuntidigitali.it/site/wp-content/uploads/rawdata.png
GridFS
• MongoDB has a 16MB document size limit
• So how can we store data bigger than 16MB?
  • Media files (images, PDFs, long binary files …)
• GridFS
  • Convention more than a feature
  • All drivers implement this convention
  • pymongo is no different
  • Very flexible approach
  • Handy out-of-the-box solution
GridFS

    #!/usr/bin/env python
    from pymongo import MongoClient
    import gridfs

    mc = MongoClient()
    database = mc.grid_example

    # call gridfs lib w/ database
    gfs = gridfs.GridFS(database)

    # open file for reading (binary mode, since GridFS stores bytes)
    read_file = open('/tmp/somefile', 'rb')

    # call put to store file and metadata
    gfs.put(read_file, author='Norberto', tags=['awesome', 'zurich', 'pug'])
GridFS

    mongo
    nair(mongod-3.1.0-pre-) grid_sample> show dbs
    grid_sample    0.246GB
    local          0.000GB
    nair(mongod-3.1.0-pre-) grid_sample> show collections
    fs.chunks    258.995MB / 252.070MB
    fs.files       0.000MB / 0.016MB

• database created
• 2 collections
• chunks collection holds binary data
• files holds metadata
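The split between `fs.files` and `fs.chunks` can be sketched without a server. The snippet below cuts a byte string into fixed-size pieces and builds dictionaries shaped like the two kinds of GridFS documents (`files_id`, `n` and `data` for chunks; `length` and `chunkSize` for the file entry). The function names and the 255 KB chunk size are assumptions made for this illustration, and real file documents carry more fields than shown.

```python
CHUNK_SIZE = 255 * 1024  # assumed chunk size for this sketch


def split_into_gridfs_docs(file_id, data, chunk_size=CHUNK_SIZE, **metadata):
    """Build one fs.files-style dict and a list of fs.chunks-style dicts."""
    chunks = [
        {'files_id': file_id, 'n': n, 'data': data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {'_id': file_id, 'length': len(data), 'chunkSize': chunk_size}
    # Extra keyword arguments become additional fields on the file document.
    file_doc.update(metadata)
    return file_doc, chunks


def reassemble(chunks):
    """Concatenate chunk payloads in order of their sequence number n."""
    return b''.join(c['data'] for c in sorted(chunks, key=lambda c: c['n']))


payload = b'x' * (600 * 1024)  # 600 KB of sample bytes
file_doc, chunks = split_into_gridfs_docs(1, payload, author='Norberto')
print(len(chunks))  # 3 chunks: 255 KB + 255 KB + 90 KB
```

Because each chunk is an ordinary document well under the 16MB limit, the file as a whole can be arbitrarily large.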
Indexes
Indexes
• Single Field
• Compound
• Multikey
• Geospatial
  • 2d
  • 2dsphere - GeoJSON
• Full Text
• Hash Based
• TTL indexes
• Unique
• Sparse
Single Field Index

    from pymongo import ASCENDING, MongoClient

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    # (indexed field, indexing order) pair; the second positional
    # argument of ensure_index is cache_for, not the order
    coll.ensure_index([('some_single_field', ASCENDING)])
Compound Field Index

    from pymongo import ASCENDING, DESCENDING, MongoClient

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    # list of (indexed field, indexing order) pairs
    coll.ensure_index(
        [('field_ascending', ASCENDING), ('field_descending', DESCENDING)]
    )
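A compound index orders its entries by the first field and then, within equal values, by the second, each in its own direction. A plain-Python sketch of that ordering, using made-up sample documents and numeric values only:

```python
docs = [
    {'field_ascending': 2, 'field_descending': 10},
    {'field_ascending': 1, 'field_descending': 5},
    {'field_ascending': 1, 'field_descending': 9},
    {'field_ascending': 2, 'field_descending': 3},
]

# Ascending on the first key, descending on the second (numeric values only,
# so negating the second key is enough to flip its direction).
index_order = sorted(
    docs,
    key=lambda d: (d['field_ascending'], -d['field_descending']),
)
keys = [(d['field_ascending'], d['field_descending']) for d in index_order]
print(keys)  # [(1, 9), (1, 5), (2, 10), (2, 3)]
```

This is only a picture of the sort order, not of how the server stores the index.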
Multikey Field Index

    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    coll.insert({'array_field': [1, 2, 54, 89]})

    # indexed field holds an array, so the index becomes multikey
    coll.ensure_index('array_field')
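What makes an index multikey is that every element of the array gets its own index entry pointing back to the document. A toy version of that idea with a dictionary follows; `build_multikey_index` is a hypothetical name and the structure is a simplification for illustration.

```python
from collections import defaultdict


def build_multikey_index(docs, field):
    """Map each array element to the _ids of the documents containing it."""
    index = defaultdict(list)
    for doc in docs:
        value = doc.get(field)
        # Scalars are treated as one-element arrays.
        elements = value if isinstance(value, list) else [value]
        for element in elements:
            index[element].append(doc['_id'])
    return dict(index)


docs = [
    {'_id': 1, 'array_field': [1, 2, 54, 89]},
    {'_id': 2, 'array_field': [2, 7]},
]
index = build_multikey_index(docs, 'array_field')
print(index[2])  # [1, 2]
```

A query such as `{'array_field': 54}` can then be answered by looking up a single entry.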
Geospatial Field Index

    from pymongo import GEOSPHERE, MongoClient
    import geojson

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    p = geojson.Point([-73.9858603477478, 40.75929362758241])
    coll.insert({'point': p})

    # index type: GEOSPHERE (2dsphere)
    coll.ensure_index([('point', GEOSPHERE)])
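The point on the slide is a GeoJSON object, which is just a small document with a `type` and a `coordinates` array in longitude, latitude order. A sketch of the same shape as a plain dictionary, without the `geojson` package (the helper `geojson_point` is hypothetical):

```python
import json


def geojson_point(longitude, latitude):
    """Return a GeoJSON Point as a plain dict (longitude first, then latitude)."""
    return {'type': 'Point', 'coordinates': [longitude, latitude]}


p = geojson_point(-73.9858603477478, 40.75929362758241)
print(json.dumps(p, sort_keys=True))
```

Swapping the two coordinates is a common mistake, so it helps to name the arguments explicitly as done here.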
ODM and others
Friends
• mongoengine
  • http://mongoengine.org/
• Motor
  • http://motor.readthedocs.org/en/stable/
  • async driver
  • Tornado
  • Greenlets
• ming
  • http://sourceforge.net/projects/merciless/
Let's recap
Recap
• MongoDB is Awesome
  • Especially for working with Python
• pymongo
  • super well supported
  • fully in sync with MongoDB server
MongoDB 3.0 is coming
3.0.0-RC8 https://www.mongodb.org/downloads http://www.mongodb.com/lp/misc/norberto-leite https://jira.mongodb.org/secure/Dashboard.jspa Please go and test it!
Obrigado! Norberto Leite Technical Evangelist @nleite norberto@mongodb.com
MongoDB and Python
