open-source, high-performance, document-oriented database
Schema Design Basics roger@10Gen.com
This talk
This talk ‣Intro -Terms / Definitions
This talk ‣Intro -Terms / Definitions
This talk ‣Intro -Terms / Definitions ‣Getting a flavor -Creating a Schema -Indexes -Evolving the Schema
This talk ‣Intro -Terms / Definitions ‣Getting a flavor -Creating a Schema -Indexes -Evolving the Schema ‣Data modeling -DBRef -Single Table Inheritance -Many - Many -Trees -Lists / Queues / Stacks
Document Oriented Basic unit of data: JSON Documents Not Relational, Key Value Not OODB - Associations implied by Document Structure - but your database schema != your program schema
Terms Table -> Collection Row(s) -> JSON Document Index -> Index Join -> Embedding and Linking across documents Partition -> Shard Partition Key -> Shard Key
Considerations What are the requirements ? - Functionality to be supported - Access Patterns ? - Data Life Cycle (insert, update, deletes) - Expected Performance / Workload ? Capabilities of the database ?
DB Considerations How can we manipulate this data ? Dynamic Queries Secondary Indexes Atomic Updates Map Reduce Access Patterns ? Read / Write Ratio Types of updates Types of queries Considerations No Joins Single Document Transactions only
Design Session Use Rich Design Documents post = {author: “kyle”, date: new Date(), text: “my blog post...”, tags: [“mongodb”, “intro”]} >db.post.save(post)
>db.posts.find() { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "My first blog", tags : [ "mongodb", "intro" ] } Notes: - ID is unique, but can be anything you’d like
Secondary index for “author” // 1 means ascending, -1 means descending >db.posts.ensureIndex({author: 1}) >db.posts.find({author: 'kyle'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", ... }
Verifying indexes exist >db.system.indexes.find() // Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } }
Verifying indexes exist >db.system.indexes.find() // Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } } // Index on author { _id : ObjectId("4c4ba6c5672c685e5e8aabf4"), ns : "test.posts", key : { "author" : 1 }, name : "author_1" }
Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}})
Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}}) Regular expressions: // posts where author starts with k >db.posts.find({author: /^k*/i })
Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}}) Regular expressions: // posts where author starts with k >db.posts.find({author: /^k*/i }) Counting: // posts written by mike >db.posts.find({author: “mike”}).count()
Extending the Schema comment = {author: “fred”, date: new Date(), text: “super duper”} update = { ‘$push’: {comments: comment}, ‘$inc’: {comments_count: 1}} >db.posts.update({_id: “...” }, update)
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "My first blog", tags : [ "mongodb", "intro" ], comments_count: 1, comments : [ { author : "Fred", date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)", text : "Super Duper" } ]}
// create index on nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”Fred”})
// create index on nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”kyle”}) // find last 5 posts: >db.posts.find().sort({date:-1}).limit(5)
// create index on nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”kyle”}) // find last 5 posts: >db.posts.find().sort({date:-1}).limit(5) // most commented post: >db.posts.find().sort({comments_count:-1}).limit(1) When sorting, check if you need an index
Map Reduce Aggregation and batch manipulation Collection in, Collection out Parallel in sharded environments
Map reduce mapFunc = function () { this.tags.forEach(function (z) {emit(z, {count:1});}); } reduceFunc = function (k, v) { var total = 0; for (var i = 0; i < v.length; i++) { total += v[i].count; } return {count:total}; } res = db.posts.mapReduce(mapFunc, reduceFunc) >db[res.result].find() { _id : "intro", value : { count : 1 } } { _id : "mongodb", value : { count : 1 } }
Review So Far: - Started out with a simple schema - Queried Data - Evolved the schema - Queried / Updated the data some more
Wordnik 9B records, 100M queries / week, 1,2TB { entry : { header: { id: 0, headword: "m", sourceDictionary: "GCide", textProns : [ {text: "(em)", seq:0} ], syllables: [ {id: 0, text: "m"} ], sourceDictionary: "1913 Webster", headWord: "m", id: 1, definitions: : [ {text: "M, the thirteenth letter..."}, {text: "As a numeral, M stands for 1000"}] } } }
Review So Far: - Started out with a simple schema - Queried Data - Evolved the schema - Queried / Updated the data some more Observations: - Using Rich Documents works well - Simplify relations by embedding them - Iterative development is easy with MongoDB
Single Table Inheritance >db.shapes.find() { _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1} { _id: ObjectId("..."), type: "square", area: 4, d: 2} { _id: ObjectId("..."), type: "rect", area: 10, length: 5, width: 2} // find shapes where radius > 0 >db.shapes.find({radius: {$gt: 0}}) // create index >db.shapes.ensureIndex({radius: 1})
One to Many - Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents
One to Many - Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents - Embedded tree - Single document - Natural - Hard to query
One to Many - Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents - Embedded tree - Single document - Natural - Hard to query - Normalized (2 collections) - most flexible - more queries
Many - Many Example: - Product can be in many categories - Category can have many products Products id | product_id | category_id Category
products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]}
products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]}
products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]} //All categories for a given product >db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]} //All categories for a given product >db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")}) //All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
Alternative products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"}
Alternative products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"} // All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
Alternative products: { _id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"} // All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")}) // All categories for a given product product = db.products.find(_id : some_id) >db.categories.find({_id : {$in : product.category_ids}})
Trees Full Tree in Document { comments: [ { author: “rpb”, text: “...”, replies: [ {author: “Fred”, text: “...”, replies: []} ]} ]} Pros: Single Document, Performance, Intuitive Cons: Hard to search, Partial Results, 4MB limit
Trees Parent Links - Each node is stored as a document - Contains the id of the parent Child Links - Each node contains the id’s of the children - Can support graphs (multiple parents / child)
Array of Ancestors - Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }
Array of Ancestors - Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" } //find all descendants of b: >db.tree2.find({ancestors: ‘b’})
Array of Ancestors - Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" } //find all descendants of b: >db.tree2.find({ancestors: ‘b’}) //find all ancestors of f: >ancestors = db.tree2.findOne({_id:’f’}).ancestors >db.tree2.find({_id: { $in : ancestors})
findAndModify Queue example //Example: grab highest priority job and mark job = db.jobs.findAndModify({ query: {inprogress: false}, sort: {priority: -1), update: {$set: {inprogress: true, started: new Date()}}, new: true})
More Cool Stuff • Aggregation • Capped collections • GridFS • Geo
Learn More Kyle’s presentation + video: http://www.slideshare.net/kbanker/mongodb-schema-design http://www.blip.tv/file/3704083 Dwight’s presentation http://www.slideshare.net/mongosf/schema-design-with-mongodb-dwight- merriman Documentation Trees: http://www.mongodb.org/display/DOCS/Trees+in+MongoDB Queues: http://www.mongodb.org/display/DOCS/findandmodify+Command Aggregration: http://www.mongodb.org/display/DOCS/Aggregation Capped Col. : http://www.mongodb.org/display/DOCS/Capped+Collections Geo: http://www.mongodb.org/display/DOCS/Geospatial+Indexing GridFS: http://www.mongodb.org/display/DOCS/GridFS+Specification
Thank You :-)
Download MongoDB and let us know what you think @mongodb http://www.mongodb.org
DBRef DBRef {$ref: collection, $id: id_value} - Think URL - YDSMV: your driver support may vary Sample Schema: nr = {note_refs: [{"$ref" : "notes", "$id" : 5}, ... ]} Dereferencing: nr.forEach(function(r) { printjson(db[r.$ref].findOne({_id: r.$id})); }
BSON Mongodb stores data in BSON internally Lightweight, Traversable, Efficient encoding Typed boolean, integer, float, date, string, binary, array...

Schema Design with MongoDB

  • 1.
    open-source, high-performance, document-oriented database
  • 2.
    Schema Design Basics roger@10Gen.com
  • 3.
  • 4.
    This talk ‣Intro -Terms / Definitions
  • 5.
    This talk ‣Intro -Terms / Definitions
  • 6.
    This talk ‣Intro -Terms / Definitions ‣Getting a flavor -Creating a Schema -Indexes -Evolving the Schema
  • 7.
    This talk ‣Intro -Terms / Definitions ‣Getting a flavor -Creating a Schema -Indexes -Evolving the Schema ‣Data modeling -DBRef -Single Table Inheritance -Many - Many -Trees -Lists / Queues / Stacks
  • 8.
    Document Oriented Basic unitof data: JSON Documents Not Relational, Key Value Not OODB - Associations implied by Document Structure - but your database schema != your program schema
  • 9.
    Terms Table -> Collection Row(s) -> JSON Document Index -> Index Join -> Embedding and Linking across documents Partition -> Shard Partition Key -> Shard Key
  • 10.
    Considerations What are therequirements ? - Functionality to be supported - Access Patterns ? - Data Life Cycle (insert, update, deletes) - Expected Performance / Workload ? Capabilities of the database ?
  • 11.
    DB Considerations How canwe manipulate this data ? Dynamic Queries Secondary Indexes Atomic Updates Map Reduce Access Patterns ? Read / Write Ratio Types of updates Types of queries Considerations No Joins Single Document Transactions only
  • 12.
    Design Session Use RichDesign Documents post = {author: “kyle”, date: new Date(), text: “my blog post...”, tags: [“mongodb”, “intro”]} >db.post.save(post)
  • 13.
    >db.posts.find() { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "My first blog", tags : [ "mongodb", "intro" ] } Notes: - ID is unique, but can be anything you’d like
  • 14.
    Secondary index for“author” // 1 means ascending, -1 means descending >db.posts.ensureIndex({author: 1}) >db.posts.find({author: 'kyle'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", ... }
  • 15.
    Verifying indexes exist >db.system.indexes.find() // Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } }
  • 16.
    Verifying indexes exist >db.system.indexes.find() // Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } } // Index on author { _id : ObjectId("4c4ba6c5672c685e5e8aabf4"), ns : "test.posts", key : { "author" : 1 }, name : "author_1" }
  • 17.
    Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}})
  • 18.
    Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}}) Regular expressions: // posts where author starts with k >db.posts.find({author: /^k*/i })
  • 19.
    Query operators Conditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne, // find posts with any tags >db.posts.find({tags: {$exists: true}}) Regular expressions: // posts where author starts with k >db.posts.find({author: /^k*/i }) Counting: // posts written by mike >db.posts.find({author: “mike”}).count()
  • 20.
    Extending the Schema comment = {author: “fred”, date: new Date(), text: “super duper”} update = { ‘$push’: {comments: comment}, ‘$inc’: {comments_count: 1}} >db.posts.update({_id: “...” }, update)
  • 21.
    { _id :ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "kyle", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "My first blog", tags : [ "mongodb", "intro" ], comments_count: 1, comments : [ { author : "Fred", date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)", text : "Super Duper" } ]}
  • 22.
    // create indexon nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”Fred”})
  • 23.
    // create indexon nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”kyle”}) // find last 5 posts: >db.posts.find().sort({date:-1}).limit(5)
  • 24.
    // create indexon nested documents: >db.posts.ensureIndex({"comments.author": 1}) >db.posts.find({comments.author:”kyle”}) // find last 5 posts: >db.posts.find().sort({date:-1}).limit(5) // most commented post: >db.posts.find().sort({comments_count:-1}).limit(1) When sorting, check if you need an index
  • 25.
    Map Reduce Aggregation andbatch manipulation Collection in, Collection out Parallel in sharded environments
  • 26.
    Map reduce mapFunc =function () { this.tags.forEach(function (z) {emit(z, {count:1});}); } reduceFunc = function (k, v) { var total = 0; for (var i = 0; i < v.length; i++) { total += v[i].count; } return {count:total}; } res = db.posts.mapReduce(mapFunc, reduceFunc) >db[res.result].find() { _id : "intro", value : { count : 1 } } { _id : "mongodb", value : { count : 1 } }
  • 27.
    Review So Far: - Startedout with a simple schema - Queried Data - Evolved the schema - Queried / Updated the data some more
  • 28.
    Wordnik 9B records, 100M queries / week, 1,2TB { entry : { header: { id: 0, headword: "m", sourceDictionary: "GCide", textProns : [ {text: "(em)", seq:0} ], syllables: [ {id: 0, text: "m"} ], sourceDictionary: "1913 Webster", headWord: "m", id: 1, definitions: : [ {text: "M, the thirteenth letter..."}, {text: "As a numeral, M stands for 1000"}] } } }
  • 29.
    Review So Far: - Startedout with a simple schema - Queried Data - Evolved the schema - Queried / Updated the data some more Observations: - Using Rich Documents works well - Simplify relations by embedding them - Iterative development is easy with MongoDB
  • 31.
    Single Table Inheritance >db.shapes.find() { _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1} { _id: ObjectId("..."), type: "square", area: 4, d: 2} { _id: ObjectId("..."), type: "rect", area: 10, length: 5, width: 2} // find shapes where radius > 0 >db.shapes.find({radius: {$gt: 0}}) // create index >db.shapes.ensureIndex({radius: 1})
  • 32.
    One to Many -Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents
  • 33.
    One to Many -Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents - Embedded tree - Single document - Natural - Hard to query
  • 34.
    One to Many -Embedded Array / Array Keys - slice operator to return subset of array - hard to find latest comments across all documents - Embedded tree - Single document - Natural - Hard to query - Normalized (2 collections) - most flexible - more queries
  • 35.
    Many - Many Example: -Product can be in many categories - Category can have many products Products id | product_id | category_id Category
  • 36.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]}
  • 37.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]}
  • 38.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]} //All categories for a given product >db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})
  • 39.
    products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia", product_ids: [ ObjectId("4c4ca23933fb5941681b912e"), ObjectId("4c4ca30433fb5941681b9130"), ObjectId("4c4ca30433fb5941681b913a"]} //All categories for a given product >db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")}) //All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
  • 40.
    Alternative products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"}
  • 41.
    Alternative products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"} // All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})
  • 42.
    Alternative products: {_id: ObjectId("4c4ca23933fb5941681b912e"), name: "Sumatra Dark Roast", category_ids: [ ObjectId("4c4ca25433fb5941681b912f"), ObjectId("4c4ca25433fb5941681b92af”]} categories: { _id: ObjectId("4c4ca25433fb5941681b912f"), name: "Indonesia"} // All products for a given category >db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")}) // All categories for a given product product = db.products.find(_id : some_id) >db.categories.find({_id : {$in : product.category_ids}})
  • 43.
    Trees Full Tree inDocument { comments: [ { author: “rpb”, text: “...”, replies: [ {author: “Fred”, text: “...”, replies: []} ]} ]} Pros: Single Document, Performance, Intuitive Cons: Hard to search, Partial Results, 4MB limit
  • 44.
    Trees Parent Links - Eachnode is stored as a document - Contains the id of the parent Child Links - Each node contains the id’s of the children - Can support graphs (multiple parents / child)
  • 45.
    Array of Ancestors -Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }
  • 46.
    Array of Ancestors -Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" } //find all descendants of b: >db.tree2.find({ancestors: ‘b’})
  • 47.
    Array of Ancestors -Store Ancestors of a node { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" } //find all descendants of b: >db.tree2.find({ancestors: ‘b’}) //find all ancestors of f: >ancestors = db.tree2.findOne({_id:’f’}).ancestors >db.tree2.find({_id: { $in : ancestors})
  • 48.
    findAndModify Queue example //Example: grabhighest priority job and mark job = db.jobs.findAndModify({ query: {inprogress: false}, sort: {priority: -1), update: {$set: {inprogress: true, started: new Date()}}, new: true})
  • 49.
    More Cool Stuff •Aggregation • Capped collections • GridFS • Geo
  • 50.
    Learn More Kyle’spresentation + video: http://www.slideshare.net/kbanker/mongodb-schema-design http://www.blip.tv/file/3704083 Dwight’s presentation http://www.slideshare.net/mongosf/schema-design-with-mongodb-dwight- merriman Documentation Trees: http://www.mongodb.org/display/DOCS/Trees+in+MongoDB Queues: http://www.mongodb.org/display/DOCS/findandmodify+Command Aggregration: http://www.mongodb.org/display/DOCS/Aggregation Capped Col. : http://www.mongodb.org/display/DOCS/Capped+Collections Geo: http://www.mongodb.org/display/DOCS/Geospatial+Indexing GridFS: http://www.mongodb.org/display/DOCS/GridFS+Specification
  • 51.
  • 52.
    Download MongoDB and let us know what you think @mongodb http://www.mongodb.org
  • 53.
    DBRef DBRef {$ref: collection,$id: id_value} - Think URL - YDSMV: your driver support may vary Sample Schema: nr = {note_refs: [{"$ref" : "notes", "$id" : 5}, ... ]} Dereferencing: nr.forEach(function(r) { printjson(db[r.$ref].findOne({_id: r.$id})); }
  • 54.
    BSON Mongodb stores datain BSON internally Lightweight, Traversable, Efficient encoding Typed boolean, integer, float, date, string, binary, array...

Editor's Notes