Modeling JSON data for document databases Ryan CrawCour Program Manager, Microsoft @ryancrawcour David Makogon Cloud Architect, Microsoft @dmakogon
Today’s talk • What are document databases? • What is Azure DocumentDB? • Modeling data for a document database Loud applause and lots of great tweets about #DocumentDB @ #CloudDevelop !
Kinds of databases • Relational • Column • Key Value • Graph • Document
What are document databases?
Document Databases • Part of NoSQL family • Built for simplicity • Built for scale and performance • Non-relational • No enforced schema { "name": "SmugMug", "permalink": "smugmug", "homepage_url": "http://www.smugmug.com", "blog_url": "http://blogs.smugmug.com/", "category_code": "photo_video", "products": [ { "name": "SmugMug", "permalink": "smugmug" } ], "offices": [ { "description": "", "address1": "67 E. Even Ave, Suite 200", "address2": "", "zip_code": "94041", "city": "Mountain View", "state_code": "CA", "country_code": "USA", "latitude": 37.390056, "longitude": -122.067692 } ] }
Document Databases { "id": "itemdata2344", "data": "TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vd cyByZWFzb24sIGJ1dCBieSB0aGlzHNpbmd1bG nJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyYg dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlsaW dodCBpbiB0aGUgY29udGludWVkIGFuZCBpbGdl bmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZ9y dCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hS4= cyByZWFzb24sIGJ1dCBieSB0aGlzHNpbmd1bGFyIZ nJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBh2Yg dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlGVsaW dodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZGdl bmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWG9y dCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbS4= cyByZWFzb24sIGJ1dCBieSB0aGlzHNpbmd1bGF4gZ nJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpg dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmVsaW dodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlIGdl bmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZzaG9y dCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwZS4=" } • Part of NoSQL family • Built for simplicity • Built for scale and performance • Non-relational • No enforced schema
Document Databases • Part of NoSQL family • Built for simplicity • Built for scale and performance • Non-relational • No enforced schema
Azure DocumentDB: Lightning Round Edition { name:"Azure DocumentDB", deployedAs: "Service", dbType: "Document", connectVia: [ "rest", "sdk" ], deployVia: [ "portal", "rest", "cli", "sdk" ], scaleVia: [ "portal", "rest", "cli", "sdk" ], differsVia: [ "js", "indexing", "consistency" ] }
Modeling JSON data in this brave "new" world
Modeling data, the relational way
How do approaches differ? • Data normalization • Come as you are
To embed, or to reference, that is the question embed reference
To embed, or to reference, that is the question • Data from entities are queried together
To embed, or to reference, that is the question • Data from entities are queried together { id: "book1", covers: [ {type: "front", artworkUrl: "http://..."}, {type: "back", artworkUrl: "http://..."} ], index: "", chapters: [ {id: 1, synopsis: "", quote: "", pageCount: 24, wordCount: 456}, {id: 2, synopsis: "", quote: "", pageCount: 24, wordCount: 456} ] }
To embed, or to reference, that is the question • Data from entities are queried together • The child is a dependent, e.g. Order Line depends on Order { id: "order1", customer: "customer1", orderDate: "2014-09-15T23:14:25.7251173Z", lines: [ {product: "13inch screen", price: 200.00, qty: 50}, {product: "Keyboard", price: 23.67, qty: 4}, {product: "CPU", price: 87.89, qty: 1} ] }
To embed, or to reference, that is the question • Data from entities are queried together • The child is a dependent, e.g. Order Line depends on Order • 1:1 relationship { id: "person1", name: "Mickey", creditCard: { number: "**** **** **** 4794", expiry: "06/2019", cvv: "868", type: "Mastercard" } }
To embed, or to reference, that is the question • Data from entities are queried together • The child is a dependent e.g. Order Line depends on Order • 1:1 relationship • Similar volatility { id: "person1", name: "Mickey", contactInfo: [ {email: "mickey@disney.com"}, {mobile: "+1 555-5555"}, {twitter: "@MickeyMouse"} ] }
To embed, or to reference, that is the question • Data from entities are queried together • The child is a dependent e.g. Order Line depends on Order • 1:1 relationship • Similar volatility • The set of values or sub-documents is bounded (1:few) { id: "task1", desc: "deliver an awesome presentation @ #CloudDevelop", categories: ["conference", "talk", "workshop", "business"] }
To embed, or to reference, that is the question • Data from entities are queried together • The child is a dependent e.g. Order Line depends on Order • 1:1 relationship • Similar volatility • The set of values or sub-documents is bounded (1:few) Typically denormalized data models provide better read performance
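As a rough illustration of the read-side benefit of embedding, the sketch below keeps order lines inside the order document, so a single lookup returns everything needed to total the order. The in-memory `orders` dictionary and the `order_total` helper are hypothetical stand-ins for a real collection and query, not DocumentDB API calls.

```python
# Hypothetical in-memory "collection": document id -> document.
# The order document mirrors the embedded example from the deck.
orders = {
    "order1": {
        "id": "order1",
        "customer": "customer1",
        "orderDate": "2014-09-15T23:14:25.7251173Z",
        "lines": [
            {"product": "13inch screen", "price": 200.00, "qty": 50},
            {"product": "Keyboard", "price": 23.67, "qty": 4},
            {"product": "CPU", "price": 87.89, "qty": 1},
        ],
    }
}


def order_total(order_id):
    """Total an order using one lookup; the lines travel with the order."""
    order = orders[order_id]  # single read, no second query for lines
    total = sum(line["price"] * line["qty"] for line in order["lines"])
    return round(total, 2)


print(order_total("order1"))
```

Because the lines are dependents of the order and are read together with it, nothing else has to be fetched to answer this question.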
To embed, or to reference, that is the question • one-to-many relationships (unbounded) { id: "post1", author: "Mickey Mouse", tags: [ "fun", "cloud", "develop"] } {id: "c1", postId: "post1", comment: "Coolest blog post"} {id: "c2", postId: "post1", comment: "Loved this post, awesome"} {id: "c3", postId: "post1", comment: "This is rad!"} … {id: "c10000", postId: "post1", comment: "You are the coolest cartoon character"} … {id: "c2000000", postId: "post1", comment: "Are we still commenting on this blog?"}
To embed, or to reference, that is the question • one-to-many relationships (unbounded) • many-to-many relationships { id: "book1", name: "100 Secrets of Disneyland" } { id: "book2", name: "The best places to eat @ Disney" } { author-id: "author1", book-id: "book1" } { author-id: "author2", book-id: "book1" } { id: "author1", name: "Mickey Mouse" } { id: "author2", name: "Donald Duck" } Look familiar? It should …. It's the "relational" way
To embed, or to reference, that is the question • one-to-many relationships (unbounded) • many-to-many relationships { id: "book1", name: "100 Secrets of Disneyland", authors: ["author1", "author2"] } { id: "book2", name: "The best places to eat @ Disney", authors: ["author1"] } { id: "author1", name: "Mickey Mouse", books: ["book1", "book2"] } { id: "author2", name: "Donald Duck", books: ["book1"] }
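A minimal sketch of the two-way reference model above, assuming in-memory dictionaries in place of real collections. The helper names `books_by` and `link` are illustrative only. The point is that each side can be resolved without a separate joining document, at the cost of updating both sides when a relationship changes.

```python
# Hypothetical in-memory collections keyed by document id.
books = {
    "book1": {"id": "book1", "name": "100 Secrets of Disneyland",
              "authors": ["author1", "author2"]},
    "book2": {"id": "book2", "name": "The best places to eat @ Disney",
              "authors": ["author1"]},
}
authors = {
    "author1": {"id": "author1", "name": "Mickey Mouse",
                "books": ["book1", "book2"]},
    "author2": {"id": "author2", "name": "Donald Duck",
                "books": ["book1"]},
}


def books_by(author_id):
    """Resolve an author's book references into book names."""
    return [books[book_id]["name"] for book_id in authors[author_id]["books"]]


def link(author_id, book_id):
    """Record a relationship on BOTH documents so the two sides stay in sync."""
    if book_id not in authors[author_id]["books"]:
        authors[author_id]["books"].append(book_id)
    if author_id not in books[book_id]["authors"]:
        books[book_id]["authors"].append(author_id)


print(books_by("author1"))
link("author2", "book2")
print(books["book2"]["authors"])
```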
To embed, or to reference, that is the question • one-to-many relationships (unbounded) • many-to-many relationships • Related data changes frequently • The referenced entity is a key entity used by many others { id: "person1", author: "Mickey Mouse", stocks: [ "dis", "msft", "nflx"] } { id: "dis", opening: "52.09", numberOfTrades: 10000, trades: [{time: "083745", qty: 57, price: 53.97}, {time: "083746", qty: 5, price: 54.01}] }
To embed, or to reference, that is the question • one-to-many relationships (unbounded) • many-to-many relationships • Related data changes frequently • The referenced entity is a key entity used by many others Normalized data models can require more round trips to the server. Typically normalizing provides better write performance.
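The trade-off can be sketched with the blog post and comments example, assuming hypothetical in-memory collections and a `round_trips` counter that stands in for calls to the server. Reading a post with its comments takes two lookups, but adding a comment never touches the post document.

```python
# Hypothetical in-memory collections; a real store would be queried remotely.
posts = {
    "post1": {"id": "post1", "author": "Mickey Mouse",
              "tags": ["fun", "cloud", "develop"]},
}
comments = [
    {"id": "c1", "postId": "post1", "comment": "Coolest blog post"},
    {"id": "c2", "postId": "post1", "comment": "Loved this post, awesome"},
    {"id": "c3", "postId": "post1", "comment": "This is rad!"},
]


def load_post_with_comments(post_id, limit=2):
    """Read a post plus one page of comments; count the simulated round trips."""
    round_trips = 0
    post = posts[post_id]
    round_trips += 1  # first trip: the post itself
    page = [c for c in comments if c["postId"] == post_id][:limit]
    round_trips += 1  # second trip: the referenced comments
    return post, page, round_trips


def add_comment(comment_id, post_id, text):
    """Write a small new document; the post document is not rewritten."""
    comments.append({"id": comment_id, "postId": post_id, "comment": text})


post, page, trips = load_post_with_comments("post1")
print(trips, [c["id"] for c in page])
```

With millions of comments, the post stays small and each write stays cheap, which is the write-side benefit the slide refers to.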
Where do you put the reference? Publisher & Book … does publisher refer to book? Publisher document: { id: "mspress", name: "Microsoft Press", books: [ 1, 2, 3, ..., 100, ..., 1000] } Book documents: {id: 1, name: "DocumentDB 101" } {id: 2, name: "DocumentDB for RDBMS Users" } {id: 3, name: "Taking over the world one JSON doc at a time" }
Where do you put the reference? Publisher & Book … or does book refer to publisher? Publisher document: { id: "mspress", name: "Microsoft Press", books: [ 1, 2, 3, ..., 100, ..., 1000] } Book documents: {id: 1, name: "DocumentDB 101", pub-id: "mspress"} {id: 2, name: "DocumentDB for RDBMS Users", pub-id: "mspress"} {id: 3, name: "Taking over the world one JSON doc at a time", pub-id: "mspress"}
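One way to compare the two placements is to look at what grows. In the sketch below, which assumes an in-memory list of book documents and a hypothetical `books_for_publisher` helper, each book carries its own `pub-id`, so the publisher document stays the same size however many books are added.

```python
# Hypothetical in-memory documents; the reference lives on the book.
publisher = {"id": "mspress", "name": "Microsoft Press"}
books = [
    {"id": 1, "name": "DocumentDB 101", "pub-id": "mspress"},
    {"id": 2, "name": "DocumentDB for RDBMS Users", "pub-id": "mspress"},
    {"id": 3, "name": "Taking over the world one JSON doc at a time",
     "pub-id": "mspress"},
]


def books_for_publisher(pub_id):
    """Find a publisher's books by filtering on the reference held by each book."""
    return [book["name"] for book in books if book["pub-id"] == pub_id]


# Adding a book writes one new document and leaves the publisher untouched.
books.append({"id": 4, "name": "A hypothetical fourth title", "pub-id": "mspress"})
print(len(books_for_publisher("mspress")), publisher)
```

The fourth title is invented for the example. If the publisher held a `books` array instead, every new book would also mean rewriting an ever-growing publisher document.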
Is it always black or white?
Is it always black or white? { id: 1, firstName: "Mickey", lastName: "Mouse", books: [1, 2, 3], images: [ {"thumbnail": "http://....png"}, {"profile": "http://....png"} ], bio: "Mickey Mouse is a funny animal cartoon character and the official mascot of The Walt Disney Company. An anthropomorphic mouse who typically wears red shorts, large yellow shoes, and white gloves, Mickey has become one of the most recognizable cartoon characters." } { id: 1, name: "DocumentDB 101", authors: [ { id: 1, name: "Mickey Mouse", bio: "Mickey Mouse is a funny animal cartoon character and the official mascot of The Walt Disney Company…", thumbnailUrl: "http://....png" } ] }
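The cost of the hybrid model shows up when denormalized data changes. The sketch below assumes in-memory dictionaries and a hypothetical `rename_author` helper: the author document is the source of truth, and the copy of the name embedded in each referenced book has to be updated as well. The thumbnail copy is deliberately left alone, since a book may want to keep the image from the time it was published.

```python
# Hypothetical in-memory collections. The author references books by id,
# and each book embeds a small copy of the author's details.
authors = {
    1: {"id": 1, "firstName": "Mickey", "lastName": "Mouse", "books": [1, 2]},
}
books = {
    1: {"id": 1, "name": "DocumentDB 101",
        "authors": [{"id": 1, "name": "Mickey Mouse",
                     "thumbnailUrl": "http://....png"}]},
    2: {"id": 2, "name": "DocumentDB for RDBMS Users",
        "authors": [{"id": 1, "name": "Mickey Mouse",
                     "thumbnailUrl": "http://....png"}]},
}


def rename_author(author_id, first_name, last_name):
    """Update the author, then every denormalized copy of the name."""
    author = authors[author_id]
    author["firstName"], author["lastName"] = first_name, last_name
    updated = 0
    for book_id in author["books"]:
        for embedded in books[book_id]["authors"]:
            if embedded["id"] == author_id:
                embedded["name"] = f"{first_name} {last_name}"
                updated += 1
    return updated  # number of embedded copies rewritten


print(rename_author(1, "Michael", "Mouse"))
```

In a real store these writes would ideally happen in one transaction so that readers never see a mix of old and new names.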
How to model hierarchical trees? (Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.) [ { id: "Jill" }, { id: "Ben", manager: "Jill" }, { id: "Susan", manager: "Jill" }, { id: "Andrew", manager: "Ben" }, { id: "Sven", manager: "Susan" }, { id: "Thomas", manager: "Sven" } ] Getting the manager of any employee is trivial: SELECT org.manager FROM org WHERE org.id = "Susan"
How to model hierarchical trees? (Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.) [ { id: "Jill" }, { id: "Ben", manager: "Jill" }, { id: "Susan", manager: "Jill" }, { id: "Andrew", manager: "Ben" }, { id: "Sven", manager: "Susan" }, { id: "Thomas", manager: "Sven" } ] Getting all employees managed by Jill is also easy: SELECT * FROM org WHERE org.manager = "Jill"
How to model hierarchical trees? (Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.) [ { id: "Jill", directs: ["Ben", "Susan"] }, { id: "Ben", directs: ["Andrew"] }, { id: "Susan", directs: ["Sven"] }, { id: "Andrew" }, { id: "Sven", directs: ["Thomas"] }, { id: "Thomas" } ] Getting all direct reports for Jill is easy: SELECT * FROM org WHERE org.id = "Jill"
How to model hierarchical trees? (Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.) [ { id: "Jill", directs: ["Ben", "Susan"] }, { id: "Ben", directs: ["Andrew"] }, { id: "Susan", directs: ["Sven"] }, { id: "Andrew" }, { id: "Sven", directs: ["Thomas"] }, { id: "Thomas" } ] Finding the manager of an employee is possible: SELECT * FROM emp WHERE ARRAY_CONTAINS(emp.directs, "Ben")
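The parent-reference variant of the tree can be exercised with a few lines of Python. This is a sketch over an in-memory list, with hypothetical helper names, that mirrors the two queries from the slides and adds one the slides do not show: walking up the tree to collect the full management chain, which takes one lookup per level.

```python
# Parent-reference model: every employee points at their manager.
org = [
    {"id": "Jill"},
    {"id": "Ben", "manager": "Jill"},
    {"id": "Susan", "manager": "Jill"},
    {"id": "Andrew", "manager": "Ben"},
    {"id": "Sven", "manager": "Susan"},
    {"id": "Thomas", "manager": "Sven"},
]
by_id = {employee["id"]: employee for employee in org}


def manager_of(employee_id):
    """Equivalent of selecting the manager property for one id."""
    return by_id[employee_id].get("manager")


def direct_reports(manager_id):
    """Equivalent of filtering all employees on manager = <id>."""
    return [e["id"] for e in org if e.get("manager") == manager_id]


def management_chain(employee_id):
    """Follow manager references upward until the root is reached."""
    chain = []
    manager = manager_of(employee_id)
    while manager is not None:
        chain.append(manager)
        manager = manager_of(manager)
    return chain


print(manager_of("Susan"), direct_reports("Jill"), management_chain("Thomas"))
```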
How to support keyword search? { id: "CDC101", title: "Fundamentals of database design", credits: 10 }
How to support keyword search? { id: "CDC101", title: "The Fundamentals of Database Design", titleWords: [ "fundamentals", "database", "design", "database design" ], credits: 10 } Consider using a RegEx to transform words to lowercase and remove any punctuation. Strip out stop words like "to", "the", "of", etc. Denormalize keywords into key phrases.
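The three steps above can be sketched as a small function. The stop-word list and the rule for key phrases (pairs of adjacent words with no stop word between them) are assumptions made for this example; the deck does not specify either.

```python
import re

# Assumed stop-word list; extend it to suit the data set.
STOP_WORDS = {"to", "the", "of", "a", "an", "and"}


def title_words(title):
    """Build a titleWords array: keywords first, then two-word key phrases."""
    # Lowercase, then drop anything that is not a word character or whitespace.
    words = re.sub(r"[^\w\s]", "", title.lower()).split()
    keywords = [w for w in words if w not in STOP_WORDS]
    # Key phrases: adjacent word pairs where neither word is a stop word.
    phrases = [
        f"{first} {second}"
        for first, second in zip(words, words[1:])
        if first not in STOP_WORDS and second not in STOP_WORDS
    ]
    return keywords + phrases


course = {"id": "CDC101", "title": "The Fundamentals of Database Design", "credits": 10}
course["titleWords"] = title_words(course["title"])
print(course["titleWords"])
# ['fundamentals', 'database', 'design', 'database design']
```

A search for a keyword or phrase then becomes a membership test on `titleWords`, along the lines of the `ARRAY_CONTAINS` query used for the org chart.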
Summary
Summary { options: ["Embed", "Reference"], rules: "There are no rules, merely guidelines", embed: [ "1:1", "Child is a dependent", "Similar volatility", "favor read speed" ], reference: [ "related data changes frequently", "many:many", "favor writes" ], remember: [ "Don't be scared to experiment and mix & match", "Models change & evolve", "Hybrid models" ] }
Azure DocumentDB SDKs and Tooling SDKs aka.ms/docdbsdks Azure Portal portal.azure.com Studio aka.ms/docdbstudio
Get Started Today explore playground select * from playground p where p.name = "DocumentDB" aka.ms/docdbplayground build an app aka.ms/docdbstarter move some data aka.ms/docdbimport
http://aka.ms/CloudDevelop • Dell Venue Pro 8 • Enter by filling out survey • Announced at the end of the day. • Must be present to win.
Wrapping up • documentdb.com • @DocumentDB • @dmakogon • @ryancrawcour

Modeling JSON data for NoSQL document databases


Editor's Notes

  • #5, #9, #10 Jump to portal to show features
  • #13 Instead of taking the business subject / domain entity and breaking it up into multiple relational structures, store the business subject in the minimal number of documents.
  • #14 Add diagram showing the differences
  • #23 to #27 e.g. an application for efficient data entry of product orders. Order contains over 100 fields, Order Line contains over 50 fields, and Product contains over 150 fields. An order contains 20 lines on average. - Would you embed order lines in the order? Yes: Order Line is dependent on Order, and most of the time Order and Order Line would be read together. - Would you embed product in the order line? Probably not in full; perhaps just the product info we need. - What about the requirement to support efficient data entry? Would keeping Order Line separate be more efficient from a data entry point of view? - What about the number of Order Lines? 20 is not unbounded.
  • #30 to #32 Denormalize some data and reference other data for a hybrid model. Remember, model as your application is going to use it. The danger is that if the author changes their name or bio, you need to update every book they have authored. Luckily, DocumentDB allows multi-document transactions, so this can be done in a single atomic transaction. Similarly, if the author changed their thumbnail picture you would need to update every book, or would you? You might want to keep the image of what the author looked like when a particular book was published. Here denormalizing is actually useful, because you get a snapshot in time, unlike pure referencing.
  • #41 DocumentDB offers SDKs and tooling to help you develop against and manage data in the service. All APIs are accessible as REST over HTTP; we also provide .NET, Node, Java and Python SDKs. I have already shown provisioning through the portal; the Azure Preview portal offers a variety of development, monitoring and management capabilities. DocumentDB Studio is an open source app that allows you to manage and interact with the service from a GUI tool. The Data Migration tool allows you to import existing data into DocumentDB.