#mongodbdays Aggregation Framework Emily Stolfo Ruby Engineer/Evangelist, 10gen @EmStolfo Tuesday, January 29, 13
Agenda
• State of Aggregation
• Pipeline
• Usage and Limitations
• Optimization
• Sharding
• (Expressions)
• Looking Ahead
State of Aggregation
• We're storing our data in MongoDB
• We need to do ad-hoc reporting, grouping, common aggregations, etc.
• What are we using for this?
Data Warehousing
• SQL for reporting and analytics
• Infrastructure complications
  – Additional maintenance
  – Data duplication
  – ETL processes
  – Real time?
MapReduce
• Extremely versatile, powerful
• Intended for complex data analysis
• Overkill for simple aggregation tasks, such as
  – Averages
  – Summation
  – Grouping
MapReduce in MongoDB
• Implemented with JavaScript
  – Single-threaded
  – Difficult to debug
• Concurrency
  – Appearance of parallelism
  – Write locks
Aggregation Framework
• Declared in JSON, executes in C++
• Flexible, functional, and simple
  – Operation pipeline
  – Computational expressions
• Works well with sharding
Enabling Developers
• Doing more within MongoDB, faster
• Refactoring MapReduce and groupings
  – Replace pages of JavaScript
  – Longer aggregation pipelines
• Quick aggregations from the shell
Pipeline
• Process a stream of documents
  – Original input is a collection
  – Final output is a result document
• Series of operators
  – Filter or transform data
  – Input/output chain
• Compare a Unix pipe: ps ax | grep mongod | head -n 1
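The input/output chain can be sketched in plain JavaScript. This is only an illustration of the idea, not MongoDB code: `runPipeline`, the stage functions, and the `processes` data are hypothetical names made up for the example.

```javascript
// Each stage is a function from an array of documents to a new array.
// runPipeline feeds the output of one stage into the next.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Made-up sample data standing in for the output of `ps ax`.
const processes = [
  { pid: 101, command: "mongod --dbpath /data/db" },
  { pid: 102, command: "bash" },
  { pid: 103, command: "mongod --port 27018" },
];

// Rough equivalent of: ps ax | grep mongod | head -n 1
const firstMongod = runPipeline(processes, [
  (docs) => docs.filter((d) => d.command.includes("mongod")), // grep mongod
  (docs) => docs.slice(0, 1),                                 // head -n 1
]);

console.log(firstMongod);
```

Each stage sees only what the previous stage emitted, which is why stage order matters in a pipeline.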
Pipeline Operators
• $match
• $project
• $group
• $unwind
• $sort
• $limit
• $skip
Example book data
  {
    _id: 375,
    title: "The Great Gatsby",
    ISBN: "9781857150193",
    available: true,
    pages: 218,
    chapters: 9,
    subjects: [ "Long Island", "New York", "1920s" ],
    language: "English"
  }
$match
• Filter documents
• Uses existing query syntax
• (No geospatial operations or $where)
Matching Field Values
Input documents:
  { title: "The Great Gatsby", pages: 218, language: "English" }
  { title: "War and Peace", pages: 1440, language: "Russian" }
  { title: "Atlas Shrugged", pages: 1088, language: "English" }
Stage:
  { $match: { language: "Russian" }}
Output:
  { title: "War and Peace", pages: 1440, language: "Russian" }
Matching with Query Operators
Input documents:
  { title: "The Great Gatsby", pages: 218, language: "English" }
  { title: "War and Peace", pages: 1440, language: "Russian" }
  { title: "Atlas Shrugged", pages: 1088, language: "English" }
Stage:
  { $match: { pages: { $gt: 1000 } }}
Output:
  { title: "War and Peace", pages: 1440, language: "Russian" }
  { title: "Atlas Shrugged", pages: 1088, language: "English" }
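The two $match slides can be emulated in plain JavaScript to show what the filter does. This is a sketch under assumptions, not MongoDB's query engine: the `match` helper is hypothetical and supports only field equality and `$gt`.

```javascript
const books = [
  { title: "The Great Gatsby", pages: 218, language: "English" },
  { title: "War and Peace", pages: 1440, language: "Russian" },
  { title: "Atlas Shrugged", pages: 1088, language: "English" },
];

// Hypothetical helper: keeps documents for which every query field matches.
// Only plain equality and { $gt: value } are handled in this sketch.
function match(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, cond]) => {
      if (cond !== null && typeof cond === "object" && "$gt" in cond) {
        return doc[field] > cond.$gt;
      }
      return doc[field] === cond;
    })
  );
}

const russian = match(books, { language: "Russian" });
const longBooks = match(books, { pages: { $gt: 1000 } });

console.log(russian.map((b) => b.title));
console.log(longBooks.map((b) => b.title));
```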
$project
• Reshape documents
• Include, exclude or rename fields
• Inject computed fields
• Create sub-document fields
Including and Excluding Fields
Input document:
  {
    _id: 375,
    title: "Great Gatsby",
    ISBN: "9781857150193",
    available: true,
    pages: 218,
    subjects: [ "Long Island", "New York", "1920s" ],
    language: "English"
  }
Stage:
  { $project: { _id: 0, title: 1, language: 1 }}
Output:
  { title: "Great Gatsby", language: "English" }
Renaming and Computing Fields
Input document:
  {
    _id: 375,
    title: "Great Gatsby",
    ISBN: "9781857150193",
    available: true,
    pages: 218,
    chapters: 9,
    subjects: [ "Long Island", "New York", "1920s" ],
    language: "English"
  }
Stage:
  { $project: {
    avgChapterLength: { $divide: ["$pages", "$chapters"] },
    lang: "$language"
  }}
Output:
  { _id: 375, avgChapterLength: 24.2222, lang: "English" }
Creating Sub-Document Fields
Input document:
  {
    _id: 375,
    title: "Great Gatsby",
    ISBN: "9781857150193",
    available: true,
    pages: 218,
    subjects: [ "Long Island", "New York", "1920s" ],
    language: "English"
  }
Stage:
  { $project: {
    title: 1,
    stats: { pages: "$pages", language: "$language" }
  }}
Output:
  {
    _id: 375,
    title: "Great Gatsby",
    stats: { pages: 218, language: "English" }
  }
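The reshaping done by $project amounts to mapping each document to a new one. The sketch below mirrors the renaming, computing, and sub-document examples in plain JavaScript; `projectBook` is a hypothetical function written for this illustration, not part of any MongoDB API.

```javascript
const book = {
  _id: 375,
  title: "Great Gatsby",
  pages: 218,
  chapters: 9,
  language: "English",
};

// Mirrors a $project with a computed field, a renamed field,
// and a sub-document built from existing fields.
function projectBook(doc) {
  return {
    _id: doc._id, // kept by default; the slides exclude it with _id: 0
    avgChapterLength: doc.pages / doc.chapters, // { $divide: ["$pages", "$chapters"] }
    lang: doc.language,                         // lang: "$language"
    stats: { pages: doc.pages, language: doc.language },
  };
}

const projected = projectBook(book);
console.log(projected);
```

Fields that are not listed, such as `title` here, do not appear in the output.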
$group
• Group documents by an ID
  – Field reference, object, constant
• Other output fields are computed
  – $max, $min, $avg, $sum
  – $addToSet, $push
  – $first, $last
• Processes all data in memory
Calculating an Average
Input documents:
  { title: "The Great Gatsby", pages: 218, language: "English" }
  { title: "War and Peace", pages: 1440, language: "Russian" }
  { title: "Atlas Shrugged", pages: 1088, language: "English" }
Stage:
  { $group: { _id: "$language", avgPages: { $avg: "$pages" } }}
Output:
  { _id: "Russian", avgPages: 1440 }
  { _id: "English", avgPages: 653 }
Summing Fields and Counting
Input documents:
  { title: "The Great Gatsby", pages: 218, language: "English" }
  { title: "War and Peace", pages: 1440, language: "Russian" }
  { title: "Atlas Shrugged", pages: 1088, language: "English" }
Stage:
  { $group: {
    _id: "$language",
    numTitles: { $sum: 1 },
    sumPages: { $sum: "$pages" }
  }}
Output:
  { _id: "Russian", numTitles: 1, sumPages: 1440 }
  { _id: "English", numTitles: 2, sumPages: 1306 }
Collecting Distinct Values
Input documents:
  { title: "The Great Gatsby", pages: 218, language: "English" }
  { title: "War and Peace", pages: 1440, language: "Russian" }
  { title: "Atlas Shrugged", pages: 1088, language: "English" }
Stage:
  { $group: { _id: "$language", titles: { $addToSet: "$title" } }}
Output:
  { _id: "Russian", titles: [ "War and Peace" ] }
  { _id: "English", titles: [ "Atlas Shrugged", "The Great Gatsby" ] }
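The three $group examples (average, sum and count, distinct values) can be reproduced with one in-memory pass in plain JavaScript. This is a sketch of the accumulators' effect, not how the server implements them; `groupByLanguage` is a hypothetical helper.

```javascript
const books = [
  { title: "The Great Gatsby", pages: 218, language: "English" },
  { title: "War and Peace", pages: 1440, language: "Russian" },
  { title: "Atlas Shrugged", pages: 1088, language: "English" },
];

// One accumulator record per distinct language, like _id: "$language".
function groupByLanguage(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.language)) {
      groups.set(doc.language, { numTitles: 0, sumPages: 0, titles: new Set() });
    }
    const g = groups.get(doc.language);
    g.numTitles += 1;       // { $sum: 1 }
    g.sumPages += doc.pages; // { $sum: "$pages" }
    g.titles.add(doc.title); // { $addToSet: "$title" }
  }
  return [...groups.entries()].map(([language, g]) => ({
    _id: language,
    numTitles: g.numTitles,
    sumPages: g.sumPages,
    avgPages: g.sumPages / g.numTitles, // { $avg: "$pages" }
    titles: [...g.titles],
  }));
}

const grouped = groupByLanguage(books);
console.log(grouped);
```

All groups stay in memory until the input is exhausted, which matches the slide's note that $group processes all data in memory.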
$unwind
• Applied to an array field
• Yield new documents for each array element
  – Array replaced by element value
  – Missing/empty fields → no output
  – Non-array fields → error
• Pipe to $group to aggregate array values
Yielding Multiple Documents from One
Input document:
  {
    title: "The Great Gatsby",
    ISBN: "9781857150193",
    subjects: [ "Long Island", "New York", "1920s" ]
  }
Stage:
  { $unwind: "$subjects" }
Output:
  { title: "The Great Gatsby", ISBN: "9781857150193", subjects: "Long Island" }
  { title: "The Great Gatsby", ISBN: "9781857150193", subjects: "New York" }
  { title: "The Great Gatsby", ISBN: "9781857150193", subjects: "1920s" }
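The $unwind rules, and the idea of piping its output into a grouping step, can be sketched in plain JavaScript. The `unwind` helper and the second and third sample books are hypothetical, added only so the count has something to aggregate.

```javascript
// Emits one copy of the document per array element.
// Missing field or empty array: no output. Non-array value: error.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue;
    if (!Array.isArray(value)) throw new Error(`${field} is not an array`);
    for (const element of value) out.push({ ...doc, [field]: element });
  }
  return out;
}

const books = [
  { title: "The Great Gatsby", subjects: ["Long Island", "New York", "1920s"] },
  { title: "Hypothetical Second Book", subjects: ["New York"] },
  { title: "Hypothetical Book Without Subjects" },
];

const unwound = unwind(books, "subjects");

// Grouping step: count how many books carry each subject.
const subjectCounts = {};
for (const doc of unwound) {
  subjectCounts[doc.subjects] = (subjectCounts[doc.subjects] || 0) + 1;
}

console.log(unwound.length, subjectCounts);
```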
$sort, $limit, $skip
• Sort documents by one or more fields
  – Same order syntax as cursors
  – Waits for earlier pipeline operator to return
  – In-memory unless early and indexed
• Limit and skip follow cursor behavior
Sort All the Documents in the Pipeline
Input documents:
  { title: "The Great Gatsby" }
  { title: "Brave New World" }
  { title: "Grapes of Wrath" }
  { title: "Animal Farm" }
  { title: "Lord of the Flies" }
  { title: "Fathers and Sons" }
  { title: "Invisible Man" }
  { title: "Fahrenheit 451" }
Stage:
  { $sort: { title: 1 }}
Output:
  { title: "Animal Farm" }
  { title: "Brave New World" }
  { title: "Fahrenheit 451" }
  { title: "Fathers and Sons" }
  { title: "Grapes of Wrath" }
  { title: "Invisible Man" }
  { title: "Lord of the Flies" }
  { title: "The Great Gatsby" }
Limit Documents Through the Pipeline
Input documents:
  { title: "The Great Gatsby" }
  { title: "Brave New World" }
  { title: "Grapes of Wrath" }
  { title: "Animal Farm" }
  { title: "Lord of the Flies" }
  { title: "Fathers and Sons" }
  { title: "Invisible Man" }
  { title: "Fahrenheit 451" }
Stage:
  { $limit: 5 }
Output:
  { title: "The Great Gatsby" }
  { title: "Brave New World" }
  { title: "Grapes of Wrath" }
  { title: "Animal Farm" }
  { title: "Lord of the Flies" }
Skip Over Documents in the Pipeline
Input documents:
  { title: "The Great Gatsby" }
  { title: "Brave New World" }
  { title: "Grapes of Wrath" }
  { title: "Animal Farm" }
  { title: "Lord of the Flies" }
  { title: "Fathers and Sons" }
  { title: "Invisible Man" }
  { title: "Fahrenheit 451" }
Stage:
  { $skip: 5 }
Output:
  { title: "Fathers and Sons" }
  { title: "Invisible Man" }
  { title: "Fahrenheit 451" }
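The three slides above correspond to ordinary array operations. The following plain-JavaScript sketch reproduces their outputs on the same eight titles; it illustrates the semantics only and says nothing about how the server executes these stages.

```javascript
const docs = [
  "The Great Gatsby", "Brave New World", "Grapes of Wrath", "Animal Farm",
  "Lord of the Flies", "Fathers and Sons", "Invisible Man", "Fahrenheit 451",
].map((title) => ({ title }));

// { $sort: { title: 1 } }: ascending by title; copy first so docs is untouched.
const sorted = [...docs].sort((a, b) =>
  a.title < b.title ? -1 : a.title > b.title ? 1 : 0
);

// { $limit: 5 }: keep the first five documents.
const limited = docs.slice(0, 5);

// { $skip: 5 }: drop the first five documents.
const skipped = docs.slice(5);

console.log(sorted.map((d) => d.title));
console.log(limited.map((d) => d.title));
console.log(skipped.map((d) => d.title));
```

Sorting needs the whole input before it can emit anything, while limit and skip can work on documents as they arrive.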
Usage and Limitations
Usage
• collection.aggregate() method
  – Mongo shell
  – Most drivers
• aggregate database command
Collection
  db.books.aggregate([
    { $project: { language: 1 }},
    { $group: { _id: "$language", numTitles: { $sum: 1 }}}
  ])
Result:
  {
    result: [
      { _id: "Russian", numTitles: 1 },
      { _id: "English", numTitles: 2 }
    ],
    ok: 1
  }
Database Command
  db.runCommand({
    aggregate: "books",
    pipeline: [
      { $project: { language: 1 }},
      { $group: { _id: "$language", numTitles: { $sum: 1 }}}
    ]
  })
Result:
  {
    result: [
      { _id: "Russian", numTitles: 1 },
      { _id: "English", numTitles: 2 }
    ],
    ok: 1
  }
Limitations
• Result limited by BSON document size
  – Final command result
  – Intermediate shard results
• Pipeline operator memory limits
• Some BSON types unsupported
  – Binary, Code, deprecated types
Sharding
• Split the pipeline at first $group or $sort
  – Shards execute pipeline up to that point
  – mongos merges results and continues
• Early $match may exclude shards
• CPU and memory implications for mongos
Sharding
  [
    { $match: { /* filter by shard key */ }},
    { $project: { /* select fields */ }},
    { $group: { /* group by some field */ }},
    { $sort: { /* sort by some field */ }},
    { $project: { /* reshape result */ }}
  ]
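The split rule can be sketched as a small function over the pipeline array. This is a simplified model of the slide's description, not the server's actual planner: `splitPipeline` is a hypothetical helper, and placing the first $group or $sort stage entirely on the mongos side is an assumption of this sketch.

```javascript
// Splits a pipeline at the first $group or $sort stage.
// Assumption: stages before the split run on the shards,
// the split stage and everything after it run on mongos.
function splitPipeline(pipeline) {
  const splitAt = pipeline.findIndex(
    (stage) => "$group" in stage || "$sort" in stage
  );
  if (splitAt === -1) return { shardPart: pipeline, mongosPart: [] };
  return {
    shardPart: pipeline.slice(0, splitAt),
    mongosPart: pipeline.slice(splitAt),
  };
}

// Same shape as the slide's pipeline; the field names are placeholders.
const pipeline = [
  { $match: { shardKeyField: "some value" } },
  { $project: { language: 1 } },
  { $group: { _id: "$language", numTitles: { $sum: 1 } } },
  { $sort: { numTitles: -1 } },
  { $project: { numTitles: 1 } },
];

const { shardPart, mongosPart } = splitPipeline(pipeline);
console.log(shardPart.length, mongosPart.length);
```

Putting the $match and $project first keeps the per-shard results small before they are merged.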
Aggregation in a sharded cluster
Expressions
• Return computed values
• Used with $project and $group
• Reference fields using $ (e.g. "$x")
• Expressions may be nested
Boolean Operators
• Input array of one or more values
  – $and, $or
  – Short-circuit logic
• Invert values with $not
• Evaluation of non-boolean types
  – null, undefined, zero ▶ false
  – Non-zero, strings, dates, objects ▶ true
Examples:
  { $and: [true, false] } ▶ false
  { $or: ["foo", 0] } ▶ true
  { $not: null } ▶ true
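The coercion rules on this slide differ slightly from JavaScript's own truthiness, so a sketch helps. The helpers below (`toBool`, `and`, `or`, `not`) are hypothetical and only encode the rules as stated on the slide.

```javascript
// Per the slide: null, undefined, and zero are false;
// non-zero numbers, strings, dates, and objects are true.
function toBool(value) {
  return !(value === null || value === undefined || value === 0 || value === false);
}

// every() and some() stop at the first deciding element (short-circuit).
const and = (values) => values.every(toBool);
const or = (values) => values.some(toBool);
const not = (value) => !toBool(value);

console.log(and([true, false])); // false
console.log(or(["foo", 0]));     // true
console.log(not(null));          // true
```

Under these rules every string counts as true, so `toBool("")` is true here even though an empty string is falsy in JavaScript itself.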
Comparison Operators
• Compare numbers, strings, and dates
• Input array with two operands
  – $cmp, $eq, $ne
  – $gt, $gte, $lt, $lte
Examples:
  { $cmp: [3, 4] } ▶ -1
  { $eq: ["foo", "bar"] } ▶ false
  { $ne: ["foo", "bar"] } ▶ true
  { $gt: [9, 7] } ▶ true
Arithmetic Operators
• Input array of one or more numbers
  – $add, $multiply
• Input array of two operands
  – $subtract, $divide, $mod
Examples:
  { $add: [1, 2, 3] } ▶ 6
  { $multiply: [2, 2, 2] } ▶ 8
  { $subtract: [10, 7] } ▶ 3
  { $divide: [10, 2] } ▶ 5
  { $mod: [8, 3] } ▶ 2
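Nested expressions and "$field" references can be illustrated with a tiny recursive evaluator in plain JavaScript. This is a sketch covering only the five arithmetic operators above; `evaluate` is a hypothetical function and is not how MongoDB evaluates expressions internally.

```javascript
// Evaluates a small subset of expression syntax against one document.
function evaluate(expr, doc) {
  // "$name" refers to a field of the current document.
  if (typeof expr === "string" && expr.startsWith("$")) return doc[expr.slice(1)];
  // Anything that is not an operator object is a literal.
  if (expr === null || typeof expr !== "object" || Array.isArray(expr)) return expr;

  const [op] = Object.keys(expr);
  const args = expr[op].map((arg) => evaluate(arg, doc)); // recurse for nesting
  switch (op) {
    case "$add": return args.reduce((a, b) => a + b, 0);
    case "$multiply": return args.reduce((a, b) => a * b, 1);
    case "$subtract": return args[0] - args[1];
    case "$divide": return args[0] / args[1];
    case "$mod": return args[0] % args[1];
    default: throw new Error(`unsupported operator: ${op}`);
  }
}

const book = { pages: 218, chapters: 9 };
const sum = evaluate({ $add: [1, 2, 3] }, book);
const nested = evaluate({ $add: [1, { $multiply: ["$chapters", 2] }] }, book);
const avgChapterLength = evaluate({ $divide: ["$pages", "$chapters"] }, book);

console.log(sum, nested, avgChapterLength);
```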
String Operators
• $strcasecmp case-insensitive comparison
  – $cmp is case-sensitive
• $toLower and $toUpper case change
• $substr for sub-string extraction
• Not encoding aware (assumes ASCII alphabet)
Examples:
  { $strcasecmp: ["foo", "bar"] } ▶ 1
  { $substr: ["foo", 1, 2] } ▶ "oo"
  { $toUpper: "foo" } ▶ "FOO"
  { $toLower: "BAR" } ▶ "bar"
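For ASCII input, the string operators behave like the plain-JavaScript functions sketched below. The helper names are hypothetical; the sketch assumes ASCII strings, in line with the slide's note that the operators are not encoding aware.

```javascript
// Case-insensitive comparison: returns -1, 0, or 1 like $strcasecmp.
function strcasecmp(a, b) {
  const x = a.toUpperCase();
  const y = b.toUpperCase();
  return x < y ? -1 : x > y ? 1 : 0;
}

// Sub-string by start offset and length, like $substr.
function substr(s, start, length) {
  return s.slice(start, start + length);
}

console.log(strcasecmp("foo", "bar")); // 1
console.log(substr("foo", 1, 2));      // "oo"
console.log("foo".toUpperCase());      // "FOO"
console.log("BAR".toLowerCase());      // "bar"
```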
Date Operators
• Extract values from date objects
  – $dayOfYear, $dayOfMonth, $dayOfWeek
  – $year, $month, $week
  – $hour, $minute, $second
Examples:
  { $year: ISODate("2012-10-24T00:00:00.000Z") } ▶ 2012
  { $month: ISODate("2012-10-24T00:00:00.000Z") } ▶ 10
  { $dayOfMonth: ISODate("2012-10-24T00:00:00.000Z") } ▶ 24
  { $dayOfWeek: ISODate("2012-10-24T00:00:00.000Z") } ▶ 4
  { $dayOfYear: ISODate("2012-10-24T00:00:00.000Z") } ▶ 298
  { $week: ISODate("2012-10-24T00:00:00.000Z") } ▶ 43
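The date fields can be cross-checked with JavaScript's own `Date` in UTC. This sketch assumes $dayOfWeek counts from 1 for Sunday and that $week counts Sunday-based weeks with days before the year's first Sunday in week 0; both conventions are stated here as assumptions.

```javascript
const date = new Date("2012-10-24T00:00:00.000Z");
const MS_PER_DAY = 24 * 60 * 60 * 1000;

const year = date.getUTCFullYear();
const month = date.getUTCMonth() + 1;   // JavaScript months start at 0
const dayOfMonth = date.getUTCDate();
const dayOfWeek = date.getUTCDay() + 1; // assumption: 1 = Sunday ... 7 = Saturday

// Days elapsed since January 1 of the same year, counted from 1.
const startOfYear = Date.UTC(year, 0, 1);
const dayOfYear = Math.floor((date.getTime() - startOfYear) / MS_PER_DAY) + 1;

// Assumption: Sunday-based week number, week 0 before the first Sunday.
const week = Math.floor((dayOfYear - 1 + 7 - date.getUTCDay()) / 7);

console.log({ year, month, dayOfMonth, dayOfWeek, dayOfYear, week });
```

Since 2012 is a leap year, October 24 is the 298th day of the year.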
Conditional Operators
• $cond ternary operator
• $ifNull
Examples:
  { $cond: [{ $eq: [1, 2] }, "same", "different"] } ▶ "different"
  { $ifNull: ["foo", "bar"] } ▶ "foo"
  { $ifNull: [null, "bar"] } ▶ "bar"
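Both conditional operators map onto one-line JavaScript functions. The `cond` and `ifNull` helpers below are hypothetical names for this sketch; treating `undefined` like null in `ifNull` is an assumption standing in for a missing field.

```javascript
// $cond: [test, valueIfTrue, valueIfFalse]
const cond = (test, thenValue, elseValue) => (test ? thenValue : elseValue);

// $ifNull: [value, replacement]
// Assumption: a missing (undefined) value is replaced just like null.
const ifNull = (value, replacement) =>
  value === null || value === undefined ? replacement : value;

console.log(cond(1 === 2, "same", "different")); // "different"
console.log(ifNull("foo", "bar"));               // "foo"
console.log(ifNull(null, "bar"));                // "bar"
```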
Looking Ahead
Framework Use Cases
• Basic aggregation queries
• Ad-hoc reporting
• Real-time analytics
• Visualizing time series data
Extending the Framework
• Adding new pipeline operators, expressions
• $out and $tee for output control
  – https://jira.mongodb.org/browse/SERVER-3253
Future Enhancements
• Automatically move $match earlier if possible
• Pipeline explain facility
• Memory usage improvements
  – Grouping input sorted by _id
  – Sorting with limited output
#mongodbdays Thank You Emily Stolfo Ruby Engineer/Evangelist, 10gen @EmStolfo
