© All rights reserved: Moshe Kaplan
Big Data – Leading Platforms: MongoDB for Developers

MongoDB for Developers
Moshe Kaplan, Scale Hacker
http://top-performance.blogspot.com
http://blogs.microsoft.co.il/vprnd

It’s All About Scale

Introduction: Hello, My Name Is MongoDB

Who Is Using MongoDB?

Who Is Behind MongoDB?

Key Value Store (with benefits)
<Key, Value>
• insert
• get
• multiget
• remove
• truncate
://wiki.apache.org/cassandra/API

When Should I Choose NoSQL?
• Eventually Consistent
• Document Store
• Key Value
http://guyharrison.squarespace.com/blog/tag/nosq

What Is MongoDB Made Of?
http://www.10gen.com/products/mongodb

Why MongoDB?
What? | Why?
JSON | End to End
No Schema | “No DBA”, Just Serialize
Write | 10K Inserts/sec on a virtual machine
Read | Similar to MySQL
HA | 10 min to set up a cluster
Sharding | Out of the Box
GeoData | Great for that
No Schema | None: no downtime to create new columns
Buzz | Trend is with NoSQL

NoSQL and Data Modeling: What Is the Difference?

Database for Software Engineers
• Class → Document
• Subclass → Subdocument

Same Terminology
• Database → Database
• Table → Collection
• Row → Document

A Blog Case Study in MySQL
http://www.slideshare.net/nateabele/building-apps-with-mongodb

…as a SW Engineer Would Like It to Be
http://www.slideshare.net/nateabele/building-apps-with-mongodb

Migration from RDBMS to NoSQL: How to Do That?

Data Migration
• Map the table structure
• Export the data and import it
• Add indexes
http://igcse-geography-lancaster.wikispaces.com/1.2+MIGRATION

Selected Migration Tool

Usage Details
• Install Ruby
• > gem install mongify
• … Modify the code to your needs
• … Create configuration files
• > mongify translation db.config > translation.rb
• > mongify process db.config translation.rb

Date Functions
• Year(), Month()… functions are included
• …but only in the JavaScript engine
• Solution: new fields!
    • [original field]
    • [original field]_[year part]
    • [original field]_[month part]
    • [original field]_[day part]
    • [original field]_[hour part]

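The "new fields" idea above can be sketched in application code before the document is saved. This is a minimal sketch, not the author's implementation: the helper name `addDateParts`, the suffixes `_year`, `_month`, `_day`, `_hour`, and the choice of UTC are all assumptions made for illustration.

```javascript
// Hypothetical helper: returns a copy of `doc` with the date in `field`
// broken out into separate, directly queryable fields.
// Assumption: date parts are taken in UTC.
function addDateParts(doc, field) {
  const d = doc[field];
  if (!(d instanceof Date)) {
    throw new TypeError(`${field} must be a Date`);
  }
  return {
    ...doc,
    [`${field}_year`]: d.getUTCFullYear(),
    [`${field}_month`]: d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    [`${field}_day`]: d.getUTCDate(),
    [`${field}_hour`]: d.getUTCHours(),
  };
}

const post = addDateParts(
  { title: "hello", created: new Date("2013-02-05T10:23:28Z") },
  "created"
);
console.log(post.created_year, post.created_month, post.created_day, post.created_hour);
// prints: 2013 2 5 10
```

The resulting document can then be inserted as usual, and a filter such as `{ created_year: 2013, created_month: 2 }` needs no server-side JavaScript.
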
Schemaless: No Schema Is a Good Thing, But…

Default Values
• No schema
• No default values
• App challenge
• Timestamps… no single source of truth

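Since the database supplies no defaults, one way to handle the "app challenge" is a single function that every write path goes through. The sketch below is only an illustration: `USER_DEFAULTS`, the field names, and the `created_at` / `updated_at` convention are assumptions, not part of the original slides.

```javascript
// Hypothetical defaults for a "users" document; field names are illustrative.
const USER_DEFAULTS = { status: "A", age: 0 };

// Return a new document with defaults filled in and timestamps set by the app.
// `now` is injectable so that one request stamps all of its documents with
// the same clock value (one source of truth per write).
function withDefaults(doc, now = new Date()) {
  return {
    ...USER_DEFAULTS,
    ...doc, // explicit values win over defaults
    created_at: doc.created_at ?? now, // keep an existing creation time
    updated_at: now,
  };
}

const user = withDefaults({ user_id: "u1" }, new Date("2013-02-05T10:23:28Z"));
console.log(user.status, user.age, user.created_at.toISOString());
// prints: A 0 2013-02-05T10:23:28.000Z
```
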
Casting and Type Safety
• No schema
• No …
• App challenge

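Without a schema, the application has to decide what type each field is before writing it. A minimal sketch of that idea follows; the `USER_TYPES` map and the `castUser` helper are hypothetical names, and dropping unknown fields is a design choice made for this example only.

```javascript
// Hypothetical field-type map for a "users" document.
const USER_TYPES = { user_id: String, age: Number, active: Boolean };

// Cast known fields to their declared type and drop unknown fields.
function castUser(raw) {
  const out = {};
  for (const [field, cast] of Object.entries(USER_TYPES)) {
    if (!(field in raw)) continue;
    const value =
      cast === Boolean
        ? raw[field] === true || raw[field] === "true" // Boolean("false") would be true
        : cast(raw[field]);
    if (cast === Number && Number.isNaN(value)) {
      throw new TypeError(`${field} is not numeric: ${raw[field]}`);
    }
    out[field] = value;
  }
  return out;
}

const user = castUser({ user_id: 42, age: "31", active: "true", extra: "dropped" });
console.log(user);
```

Here `user_id` ends up as the string `"42"` and `age` as the number `31`, so later comparisons such as `{ age: { $gt: 25 } }` compare numbers with numbers.
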
Auto Numbers
• Start using _id
    { "_id" : 0, "health" : 1, "stateStr" : "PRIMARY", "uptime" : 59917 }
• Counter tables
• Dedicated database
• 1:1 mapping
• Counter++ using findAndModify

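The "Counter++ using findAndModify" bullet can be sketched as below. The function `nextSequence`, the counter field `seq`, and the collection name in the usage note are assumptions. So that the sketch runs without a server, it is exercised against `fakeCounters`, an in-memory stand-in that mimics only the small part of `findAndModify` used here.

```javascript
// Atomically increment and return the next value of the named counter.
// `counters` is expected to behave like a mongo shell collection.
function nextSequence(counters, name) {
  const doc = counters.findAndModify({
    query: { _id: name },
    update: { $inc: { seq: 1 } },
    new: true, // return the document after the increment
    upsert: true, // create the counter on first use
  });
  return doc.seq;
}

// In-memory stand-in (NOT MongoDB): supports query by _id, $inc, upsert,
// and always returns the updated document.
function fakeCounters() {
  const docs = new Map();
  return {
    findAndModify({ query, update, upsert }) {
      if (!docs.has(query._id)) {
        if (!upsert) return null;
        docs.set(query._id, { _id: query._id });
      }
      const doc = docs.get(query._id);
      for (const [field, delta] of Object.entries(update.$inc)) {
        doc[field] = (doc[field] || 0) + delta;
      }
      return { ...doc };
    },
  };
}

const counters = fakeCounters();
const first = nextSequence(counters, "user_id");
const second = nextSequence(counters, "user_id");
console.log(first, second); // prints: 1 2
```

In the mongo shell the same function would be called with a real collection, for example `nextSequence(db.counters, "user_id")`, and the result used as the `_id` of the new document.
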
The ORM Solution

Data Analysts
http://www.designersplayground.com/pr/internet-meme-list/data-analyst-2/

Data Analysts
• This is not SQL
• There are no joins
• No perfect tools: Pentaho, RockMongo, MongoVUE, RoboMongo

No Joins
• Do it in the application
• Leverage the power of NoSQL
http://www.slideshare.net/nateabele/building-apps-with-mongodb

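"Do it in the application" usually means two queries followed by a merge in code. The sketch below shows only the merge step on plain arrays; the collection shapes (`posts` with an `author_id`, `authors` keyed by `_id`) are assumptions for illustration.

```javascript
// Join posts to their authors in application code.
// In a real application the two arrays would come from two find() calls.
function joinPostsWithAuthors(posts, authors) {
  const authorsById = new Map(authors.map((a) => [a._id, a]));
  return posts.map((post) => ({
    ...post,
    author: authorsById.get(post.author_id) ?? null, // null when no match (LEFT JOIN)
  }));
}

const authors = [
  { _id: 1, name: "Ann" },
  { _id: 2, name: "Bob" },
];
const posts = [
  { _id: 10, title: "Intro", author_id: 2 },
  { _id: 11, title: "Orphan", author_id: 9 },
];
const joined = joinPostsWithAuthors(posts, authors);
console.log(joined.map((p) => [p.title, p.author && p.author.name]));
```

The other half of the advice, leveraging the document model, is to embed the related data in the document itself so that no join is needed at read time.
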
Limited Resultset
• 16MB document size
• Limit and Skip
• Adjusted WHERE
• GridFS

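One reading of "Adjusted WHERE" (an interpretation, not something the slide spells out) is range-based paging: instead of skipping rows, each page filters on the last `_id` already seen. The sketch below builds such a query; `pageQuery` is a hypothetical helper, and `runPage` is an in-memory stand-in for `find(filter).sort(sort).limit(limit)` that handles only numeric `_id` values.

```javascript
// Build the filter, sort and limit for one page, ordered by _id.
function pageQuery(lastSeenId, pageSize) {
  const filter = lastSeenId == null ? {} : { _id: { $gt: lastSeenId } };
  return { filter, sort: { _id: 1 }, limit: pageSize };
}

// In-memory stand-in (NOT MongoDB): supports only the { _id: { $gt } }
// filter built above, ascending order, numeric ids.
function runPage(docs, { filter, limit }) {
  const after = filter._id ? filter._id.$gt : -Infinity;
  return docs
    .filter((d) => d._id > after)
    .sort((a, b) => a._id - b._id)
    .slice(0, limit);
}

const docs = [5, 1, 4, 2, 3].map((n) => ({ _id: n }));
const page1 = runPage(docs, pageQuery(null, 2));
const page2 = runPage(docs, pageQuery(page1[page1.length - 1]._id, 2));
console.log(page1.map((d) => d._id), page2.map((d) => d._id));
// prints: [ 1, 2 ] [ 3, 4 ]
```

Against a real collection the same query object would be used as `db.posts.find(q.filter).sort(q.sort).limit(q.limit)`.
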
Bottom Line
• Powerful tool
• Embrace the challenge
• Schema-less limitations: counters, data types
• Tools for data scientists
• Data design

Mastering a New Query Language

Connect to the Database
• Connect:
    > mongo
• Show current database:
    > db
• Show databases:
    > show databases;
• Show collections:
    > show collections; (or show tables;)

Databases Manipulation: Create & Drop
• Change database:
    > use <database>
• Create database:
    Just switch and create an object…
• Delete database:
    > use mydb;
    > db.dropDatabase();

Collections Manipulation
• Create collection:
    > db.createCollection(collectionName)
    Or just insert into it
• Delete collection:
    > db.collectionName.drop()

SELECT: No SQL, Just ORM…
• Select all
    db.things.find()
• WHERE
    db.posts.find({"comments.email" : "b@c.com"})
• Pattern matching
    db.posts.find( {"title" : /mongo/i} )
• Sort
    db.posts.find().sort({email : 1, date : -1});
• Limit
    db.posts.find().limit(3)

Specific Fields
• Select all documents, specific fields only:
    db.users.find( { }, { user_id: 1, status: 1, _id: 0 } )
• 1: show; 0: don’t show

WHERE
• != "A":           { $ne: "A" }
• > 25:             { $gt: 25 }
• > 25 AND <= 50:   { $gt: 25, $lte: 50 }
• LIKE 'bc%':       /^bc/
• < 25 OR >= 50:    { $or : [ { age: { $lt: 25 } }, { age: { $gte: 50 } } ] }
  ($or is a top-level operator, so each branch names its field; age is an example field)

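To make the operator semantics above concrete, here is a toy in-memory evaluator for exactly these operators. It is a teaching sketch, not MongoDB's matcher: it ignores arrays, dot notation, and MongoDB's type comparison rules, and the names `matches`, `matchesField`, and `OPERATORS` are made up for this example.

```javascript
// Comparison operators from the slide, as predicates over (value, operand).
const OPERATORS = {
  $ne: (v, x) => v !== x,
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
};

// Does a single field value satisfy one condition?
function matchesField(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition !== null && typeof condition === "object") {
    // Operator document: every operator must hold (implicit AND).
    return Object.entries(condition).every(([op, operand]) => {
      if (!(op in OPERATORS)) throw new Error(`unsupported operator: ${op}`);
      return OPERATORS[op](value, operand);
    });
  }
  return value === condition; // plain equality
}

// Does the document satisfy the whole filter?
function matches(doc, filter) {
  return Object.entries(filter).every(([key, condition]) =>
    key === "$or"
      ? condition.some((branch) => matches(doc, branch))
      : matchesField(doc[key], condition)
  );
}

const users = [
  { name: "bcd", status: "A", age: 20 },
  { name: "abc", status: "B", age: 30 },
  { name: "bce", status: "B", age: 50 },
];
const names = (filter) => users.filter((u) => matches(u, filter)).map((u) => u.name);

console.log(names({ status: { $ne: "A" } })); // [ 'abc', 'bce' ]
console.log(names({ age: { $gt: 25, $lte: 50 } })); // [ 'abc', 'bce' ]
console.log(names({ name: /^bc/ })); // [ 'bcd', 'bce' ]
console.log(names({ $or: [{ age: { $lt: 25 } }, { age: { $gte: 50 } }] })); // [ 'bcd', 'bce' ]
```
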
Join
• Wrong place…
• Or Map Reduce

GROUP BY
    db.article.aggregate([
      { $group : {
          _id : { author : "$author", name : "$name" },   // GROUP BY author, name
          docsPerAuthor : { $sum : 1 },                    // SUM(1) = N
          viewsPerAuthor : { $sum : "$pageViews" }         // SUM(pageViews)
      }}
    ]);

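What the `$group` stage above computes can be spelled out in plain JavaScript over an in-memory array. This is an illustrative equivalent under made-up sample data, not how the server executes the pipeline; note also that the server does not guarantee the order of the output groups, whereas this sketch returns them in first-seen order.

```javascript
// Plain-JavaScript equivalent of:
//   { $group: { _id: { author, name }, docsPerAuthor: { $sum: 1 },
//               viewsPerAuthor: { $sum: "$pageViews" } } }
function groupByAuthorAndName(articles) {
  const groups = new Map();
  for (const { author, name, pageViews } of articles) {
    const key = JSON.stringify([author, name]); // composite GROUP BY key
    if (!groups.has(key)) {
      groups.set(key, { _id: { author, name }, docsPerAuthor: 0, viewsPerAuthor: 0 });
    }
    const group = groups.get(key);
    group.docsPerAuthor += 1; // SUM(1)
    group.viewsPerAuthor += pageViews; // SUM(pageViews)
  }
  return [...groups.values()];
}

// Made-up sample data.
const articles = [
  { author: "ann", name: "A", pageViews: 10 },
  { author: "ann", name: "A", pageViews: 5 },
  { author: "bob", name: "B", pageViews: 7 },
];
const result = groupByAuthorAndName(articles);
console.log(JSON.stringify(result, null, 2));
```

For the sample data this yields two groups: `ann`/`A` with 2 documents and 15 views, and `bob`/`B` with 1 document and 7 views.
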
UPDATE
• Set a field in the matched comment (assuming comments is an array of subdocuments, the positional $ targets the matched element):
    db.posts.update(
      { "comments.email" : "b@c.com" },
      { $set : { "comments.$.email" : "d@c.com" } }
    )
• SET age = age + 3:
    db.users.update(
      { status: "A" },
      { $inc: { age: 3 } },
      { multi: true }
    )

INSERT
    j = { name : "mongo" }
    k = { x : 3 }
    db.things.insert( j )
    db.things.insert( k )

DELETE
    db.users.remove( { status: "D" } )

Performance Tuning: Make a Change

MongoDB Tuning

The Journal
• journalCommitInterval = 300
• Write to disk: 2ms <= t <= 300ms
• Default 100ms; increase to 300ms to save resources
[Diagram: Memory and Disk; writes go to the Journal (1) and to the Data (2)]

RAM Optimization: dataSize + indexSize < RAM
[Diagram: RAM shared by OS, Data, Index, Journal]

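The rule of thumb above can be checked from the numbers that `db.stats()` reports. The sketch below takes a stats-like object with `dataSize` and `indexSize` in bytes; the helper name `workingSetFitsInRam` and the sample figures are made up for illustration.

```javascript
// Check the rule of thumb: dataSize + indexSize < RAM (all values in bytes).
function workingSetFitsInRam(stats, ramBytes) {
  const needed = stats.dataSize + stats.indexSize;
  return { needed, ramBytes, fits: needed < ramBytes };
}

const GiB = 1024 ** 3;
// Made-up numbers in the shape of the relevant db.stats() fields.
const sampleStats = { dataSize: 5 * GiB, indexSize: 2 * GiB };

console.log(workingSetFitsInRam(sampleStats, 8 * GiB).fits); // true: 7 GiB < 8 GiB
console.log(workingSetFitsInRam(sampleStats, 6 * GiB).fits); // false: 7 GiB >= 6 GiB
```

In the mongo shell the first argument would be `db.stats()` itself. The check is deliberately simple: it does not reserve room for the OS or the journal shown in the diagram.
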
Profiling and Slow Log

Profiling Configuration
• Enable:
    mongod --profile=1 --slowms=15
    db.setProfilingLevel([level], [time])
• How much:
    0 (none) → 1 (slow queries only) → 2 (all)
    100ms: default
• Where:
    system.profile collection in the profiled (current) database

Profiling Results Analysis
• Last 5 operations slower than 1ms:
    show profile
• Without commands:
    db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
• Specific collection (namespace):
    db.system.profile.find( { ns : 'mydb.test' } ).pretty()
• Slower than 5ms:
    db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
• Between dates:
    db.system.profile.find({ ts : {
      $gt : new ISODate("2012-12-09T03:00:00Z"),
      $lt : new ISODate("2012-12-09T03:40:00Z")
    }}).pretty()

Explain
    > db.courses.find().explain();
    {
      "cursor" : "BasicCursor",
      "isMultiKey" : false,
      "n" : 11,
      "nscannedObjects" : 11,
      "nscanned" : 11,
      "nscannedObjectsAllPlans" : 11,
      "nscannedAllPlans" : 11,
      "scanAndOrder" : false,
      "indexOnly" : false,
      "nYields" : 0,
      "nChunkSkips" : 0,
      "millis" : 0,
      "indexBounds" : {},
      "server" : "primary.domain.com:27017"
    }

Indexes

Index Management
• Regular index
    db.users.ensureIndex( { user_id: 1 } )
• Multiple + DESC index
    db.users.ensureIndex( { user_id: 1, age: -1 } )
• Subdocument index (the dotted key must be quoted)
    db.users.ensureIndex( { "address.zipcode": 1 } )
• List indexes
    db.users.getIndexes()
• Drop index
    db.users.dropIndex("indexName")

Known Index Issues
• Bound filter should be the last (in the index as well)
• BitMap indexes not really working
• You should design your indexes carefully

Stats & Schema Design

Sparse Matrix? I Don’t Think So
• mongostat
• > db.stats();
• > db.collectionname.stats();
• Fragmentation if storageSize/size > 2
    db.collectionname.runCommand("compact")
• Padding (wrong design) if paddingFactor > 2

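The two thresholds above can be applied to the output of `db.collectionname.stats()` with a small helper. This is a sketch only: `diagnoseCollection` is a hypothetical name, the default thresholds are simply the values from the slide, and the sample objects are made up in the shape of the relevant stats fields.

```javascript
// Apply the slide's rules of thumb to a collection-stats-like object.
// Thresholds default to the slide's values and can be overridden.
function diagnoseCollection(stats, { maxStorageRatio = 2, maxPaddingFactor = 2 } = {}) {
  const findings = [];
  if (stats.size > 0 && stats.storageSize / stats.size > maxStorageRatio) {
    findings.push("fragmentation: storageSize/size is high, consider compact");
  }
  if (stats.paddingFactor > maxPaddingFactor) {
    findings.push("padding: documents grow a lot after insert, review the design");
  }
  return findings;
}

// Made-up sample values.
console.log(diagnoseCollection({ size: 100, storageSize: 300, paddingFactor: 1 }));
console.log(diagnoseCollection({ size: 100, storageSize: 150, paddingFactor: 1 }));
```

The first call reports fragmentation (ratio 3), the second returns an empty list. An empty collection (`size` of 0) is skipped to avoid dividing by zero.
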
High Availability: Going Real Time

(Do Not) Master/Slave

Replica Set
• In mongo.conf:
    # Replication Options
    replSet=myReplSet
• > rs.initiate()
• > rs.conf()
• > rs.add("host:port")
• > rs.reconfig()

Arbiter
• rs.addArb("host:port")
• Also:
    • Low priority
    • Hidden
    • (Weighted) voting

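The member options listed above (arbiter, low priority, hidden) are all expressed in the replica set configuration document. The sketch below builds such a document in plain JavaScript; `buildReplSetConfig` is a hypothetical helper and the host names are placeholders. It uses the member fields `host`, `priority`, `hidden`, and `arbiterOnly`, and enforces the rule that a hidden member must have priority 0.

```javascript
// Build a replica set configuration document from a list of members.
// Member _id values are assigned from the list position.
function buildReplSetConfig(name, members) {
  return {
    _id: name,
    members: members.map(({ host, ...options }, index) => {
      if (options.hidden && options.priority !== 0) {
        throw new Error(`${host}: hidden members must have priority 0`);
      }
      return { _id: index, host, ...options };
    }),
  };
}

// Placeholder hosts: one regular member, one low-priority member, one arbiter.
const cfg = buildReplSetConfig("myReplSet", [
  { host: "db1.example.com:27017" },
  { host: "db2.example.com:27017", priority: 0.5 },
  { host: "arb.example.com:27017", arbiterOnly: true },
]);
console.log(JSON.stringify(cfg, null, 2));
```

A document of this shape is what `rs.initiate(cfg)` accepts and what `rs.conf()` returns. Three voting members keep the vote count odd, which is the reason for adding an arbiter to a two-node data set.
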
Show Status: rs.status();
    {
      "set" : "myReplSet",
      "date" : ISODate("2013-02-05T10:23:28Z"),
      "myState" : 1,
      "members" : [
        {
          "_id" : 0,
          "name" : "primary.example.com:27017",
          "health" : 1,
          "state" : 1,
          "stateStr" : "PRIMARY",
          "uptime" : 164545,
          "optime" : Timestamp(1359901753000, 1),
          "optimeDate" : ISODate("2013-02-03T14:29:13Z"),
          "self" : true
        },
        {
          "_id" : 1,
          "name" …

Replica Set Recovery
• Create a new mongod
    • Either install a plain vanilla one
    • Or duplicate an existing mongod (better)
• Connect to the system
    • Use the previous machine IP
    • Or change the configuration to remove the old member and add the new one

Sharding and Scale Out: Make a Big Change
Map Reduce and Aggregation

Secondary Read Enabling

The Strategy: Sharding

MongoDB Implementation

Summary
• NoSQL
• Schemaless
• HA
• Sharding

Thank You!
Moshe Kaplan
moshe.kaplan@brightaqua.com
054-2291978

MongoDB Best Practices for Developers
