Storing tree structures with MongoDB
Introduction
In real life, almost any project deals with tree structures. Different kinds of taxonomies, site structures, etc. require modeling of hierarchical relations.
Typical approaches used
● Model Tree Structures with Child References
● Model Tree Structures with Parent References
● Model Tree Structures with an Array of Ancestors
● Model Tree Structures with Materialized Paths
● Model Tree Structures with Nested Sets
Demo dataset used
Challenges to address
In a typical site scenario, we should be able to
● Operate on the tree (insert a new node under a specific parent, update/remove an existing node, move a node across the tree)
● Get the path to a node (for example, to build the breadcrumb section)
● Get all descendants of a node (for example, to select goods from a more general category such as 'Cell Phones and Accessories', which should include goods from all of its subcategories)
Scope of the demo
In each of the examples below we:
● Add a new node called 'LG' under the 'Electronics' node
● Move the 'LG' node under the 'Cell_Phones_and_Smartphones' node
● Remove the 'LG' node from the tree
● Get the child nodes of the 'Electronics' node
● Get the path to the 'Nokia' node
● Get all descendants of the 'Cell_Phones_and_Accessories' node
Let's start...
Tree structure with parent reference
This is the most commonly used approach. For each node we store (ID, ParentReference, Order).
Operating with tree
Pretty simple, but changing the position of a node among its siblings requires additional calculations. You might want to use large, widely spaced order values, such as item position * 10^6, so that a node placed between two siblings can take the order trunc((lower sibling order + higher sibling order) / 2). This leaves room for many such operations before you need to traverse the whole tree and reset the order values to widely spaced defaults again.
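The gap-based ordering idea can be sketched in plain JavaScript, without a database. This is only an illustration under stated assumptions: the names GAP, orderBetween and renumber are hypothetical and do not come from the demo code.

```javascript
// Hypothetical spacing between sibling order values (the 10^6 from the text).
const GAP = 1000000;

// Order value for a node placed between two existing siblings.
// Returns null when no free integer is left between them, which signals
// that the siblings have to be renumbered first.
function orderBetween(lowerOrder, higherOrder) {
  const mid = Math.trunc((lowerOrder + higherOrder) / 2);
  return mid > lowerOrder && mid < higherOrder ? mid : null;
}

// Reset sibling orders to evenly spaced defaults: GAP, 2*GAP, 3*GAP, ...
// Returns new objects and leaves the input array untouched.
function renumber(siblings) {
  return siblings
    .slice()
    .sort((a, b) => a.order - b.order)
    .map((node, index) => ({ ...node, order: (index + 1) * GAP }));
}

const siblings = [
  { _id: "Cameras_and_Photography", order: 1 * GAP },
  { _id: "Shop_Top_Products", order: 2 * GAP },
];
console.log(orderBetween(siblings[0].order, siblings[1].order)); // 1500000
```

When orderBetween returns null, the gap between the two siblings is used up, and renumber can restore the spacing for that parent before retrying.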
Adding new node
Good points: requires only one insert operation to introduce the node.
    var existingelemscount = db.categoriesPCO.find({parent:'Electronics'}).count();
    var neworder = (existingelemscount+1)*10;
    db.categoriesPCO.insert({_id:'LG', parent:'Electronics', someadditionalattr:'test', order:neworder})
    //{ "_id" : "LG", "parent" : "Electronics", "someadditionalattr" : "test", "order" : 40 }
Updating / moving the node
Good points: as with insert, requires only one update operation to amend the node.
    existingelemscount = db.categoriesPCO.find({parent:'Cell_Phones_and_Smartphones'}).count();
    neworder = (existingelemscount+1)*10;
    db.categoriesPCO.update({_id:'LG'},{$set:{parent:'Cell_Phones_and_Smartphones', order:neworder}});
    //{ "_id" : "LG", "order" : 60, "parent" : "Cell_Phones_and_Smartphones", "someadditionalattr" : "test" }
Node removal
Good points: requires a single operation to remove the node from the tree.
    db.categoriesPCO.remove({_id:'LG'});
Getting node children, ordered
Good points: all children can be retrieved from the database and ordered using a single call.
    db.categoriesPCO.find({parent:'Electronics'}).sort({order:1})
    //{ "_id" : "Cameras_and_Photography", "parent" : "Electronics", "order" : 10 }
    //{ "_id" : "Shop_Top_Products", "parent" : "Electronics", "order" : 20 }
    //{ "_id" : "Cell_Phones_and_Accessories", "parent" : "Electronics", "order" : 30 }
Getting all node descendants
Bad points: unfortunately, requires repeated calls to the database, one per visited node.
    var descendants=[]
    var stack=[];
    var item = db.categoriesPCO.findOne({_id:"Cell_Phones_and_Accessories"});
    stack.push(item);
    while (stack.length>0){
        var currentnode = stack.pop();
        var children = db.categoriesPCO.find({parent:currentnode._id});
        while(true === children.hasNext()) {
            var child = children.next();
            descendants.push(child._id);
            stack.push(child);
        }
    }
    descendants.join(",")
    //Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
Getting path to node
Bad points: unfortunately, also requires repeated calls to the database to get the path.
    var path=[]
    var item = db.categoriesPCO.findOne({_id:"Nokia"})
    //!= null also stops the loop when the root document has no parent field at all
    while (item.parent != null) {
        item=db.categoriesPCO.findOne({_id:item.parent});
        path.push(item._id);
    }
    path.reverse().join(' / ');
    //Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
Indexes
The recommended index is on the fields parent and order.
    db.categoriesPCO.ensureIndex( { parent: 1, order:1 } )
Tree structure with child references
For each node we store (ID, ChildReferences).
Note
In this case we do not need an order field, because the childs array already provides this information. Most languages preserve array order. If yours does not, you might need additional code to preserve the order, which makes things more complicated.
Adding new node
Note: requires one insert operation and one update operation to insert the node.
    db.categoriesCRO.insert({_id:'LG', childs:[]});
    db.categoriesCRO.update({_id:'Electronics'},{ $addToSet:{childs:'LG'}});
    //{ "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "Shop_Top_Products", "Cell_Phones_and_Accessories", "LG" ] }
Updating/moving the node
Requires a single update operation to change the node order within the same parent, and two update operations if the node is moved under another parent.
Rearranging order under the same parent
    db.categoriesCRO.update({_id:'Electronics'},{$set:{"childs.1":'LG',"childs.3":'Shop_Top_Products'}});
    //{ "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "LG", "Cell_Phones_and_Accessories", "Shop_Top_Products" ] }
Moving the node
    db.categoriesCRO.update({_id:'Cell_Phones_and_Smartphones'},{ $addToSet:{childs:'LG'}});
    db.categoriesCRO.update({_id:'Electronics'},{$pull:{childs:'LG'}});
    //{ "_id" : "Cell_Phones_and_Smartphones", "childs" : [ "Nokia", "Samsung", "Apple", "HTC", "Vyacheslav", "LG" ] }
Node removal
Node removal also requires two operations: one update and one remove.
    db.categoriesCRO.update({_id:'Cell_Phones_and_Smartphones'},{$pull:{childs:'LG'}})
    db.categoriesCRO.remove({_id:'LG'});
Getting node children, ordered
Bad points: requires additional client-side sorting by the parent's array sequence. Depending on the size of the result set, this may affect the speed of your code.
    var parent = db.categoriesCRO.findOne({_id:'Electronics'})
    db.categoriesCRO.find({_id:{$in:parent.childs}})
Getting node children, ordered
Result set
    { "_id" : "Cameras_and_Photography", "childs" : [ "Digital_Cameras", "Camcorders", "Lenses_and_Filters", "Tripods_and_supports", "Lighting_and_studio" ] }
    { "_id" : "Cell_Phones_and_Accessories", "childs" : [ "Cell_Phones_and_Smartphones", "Headsets", "Batteries", "Cables_And_Adapters" ] }
    { "_id" : "Shop_Top_Products", "childs" : [ "IPad", "IPhone", "IPod", "Blackberry" ] }
    //parent:
    { "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "Cell_Phones_and_Accessories", "Shop_Top_Products" ] }
As you can see, we have the ordered array childs, which can be used to sort the result set on the client.
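The client-side sorting step can be sketched as follows. This is a minimal plain JavaScript illustration: sortByParentOrder is a hypothetical helper, and the sample documents are made up to show a fetched order that differs from the parent's array.

```javascript
// Sort fetched child documents so they follow the order of the parent's
// childs array. Returns a new array and leaves the input untouched.
function sortByParentOrder(parent, children) {
  // Map each child id to its position in the parent's array.
  const position = new Map(parent.childs.map((id, index) => [id, index]));
  return children
    .slice()
    .sort((a, b) => position.get(a._id) - position.get(b._id));
}

// Illustrative data: the parent lists Shop_Top_Products second,
// while the query happened to return the documents in a different order.
const parent = {
  _id: "Electronics",
  childs: ["Cameras_and_Photography", "Shop_Top_Products", "Cell_Phones_and_Accessories"],
};
const fetched = [
  { _id: "Cameras_and_Photography" },
  { _id: "Cell_Phones_and_Accessories" },
  { _id: "Shop_Top_Products" },
];

const ordered = sortByParentOrder(parent, fetched).map((doc) => doc._id);
console.log(ordered.join(","));
// Cameras_and_Photography,Shop_Top_Products,Cell_Phones_and_Accessories
```

Building the position map once keeps the sort at one lookup per comparison, instead of calling indexOf on the array for every pair.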
Getting all node descendants
Note: this also needs repeated calls, but fewer queries to the database than the previous approach, because all children of a node are fetched at once and leaf nodes are not queried.
    var descendants=[]
    var stack=[];
    var item = db.categoriesCRO.findOne({_id:"Cell_Phones_and_Accessories"});
    stack.push(item);
    while (stack.length>0){
        var currentnode = stack.pop();
        var children = db.categoriesCRO.find({_id:{$in:currentnode.childs}});
        while(true === children.hasNext()) {
            var child = children.next();
            descendants.push(child._id);
            if(child.childs.length>0){
                stack.push(child);
            }
        }
    }
    descendants.join(",")
    //Batteries,Cables_And_Adapters,Cell_Phones_and_Smartphones,Headsets,Apple,HTC,Nokia,Samsung
Getting path to node
The path is calculated step by step, so we need to issue a number of sequential calls to the database.
    var path=[]
    var item = db.categoriesCRO.findOne({_id:"Nokia"})
    while ((item=db.categoriesCRO.findOne({childs:item._id}))) {
        path.push(item._id);
    }
    path.reverse().join(' / ');
    //Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
Indexes
The recommended index is on childs:
    db.categoriesCRO.ensureIndex( { childs: 1 } )
Tree structure using an Array of Ancestors For each node we store (ID, ParentReference, AncestorReferences)
Adding new node
You need one insert operation to introduce a new node; however, you need to run a select first in order to prepare the data for the insert.
    var ancestorpath = db.categoriesAAO.findOne({_id:'Electronics'}).ancestors;
    ancestorpath.push('Electronics')
    db.categoriesAAO.insert({_id:'LG', parent:'Electronics', ancestors:ancestorpath});
    //{ "_id" : "LG", "parent" : "Electronics", "ancestors" : [ "Electronics" ] }
Updating/moving the node
Moving the node requires one select and one update operation.
    ancestorpath = db.categoriesAAO.findOne({_id:'Cell_Phones_and_Smartphones'}).ancestors;
    ancestorpath.push('Cell_Phones_and_Smartphones')
    db.categoriesAAO.update({_id:'LG'},{$set:{parent:'Cell_Phones_and_Smartphones', ancestors:ancestorpath}});
    //{ "_id" : "LG", "ancestors" : [ "Electronics", "Cell_Phones_and_Accessories", "Cell_Phones_and_Smartphones" ], "parent" : "Cell_Phones_and_Smartphones" }
Node removal
Done with a single operation.
    db.categoriesAAO.remove({_id:'LG'});
Getting node children, unordered
Note: unless you introduce an order field, it is impossible to get an ordered list of node children. You should consider another approach if you need order.
    db.categoriesAAO.find({parent:'Electronics'})
Getting all node descendants
There are two options to get all node descendants. The first is a plain query on the ancestors array, with no recursion needed:
    var descendants=[]
    var ancestors = db.categoriesAAO.find({ancestors:"Cell_Phones_and_Accessories"},{_id:1});
    while(true === ancestors.hasNext()) {
        var elem = ancestors.next();
        descendants.push(elem._id);
    }
    descendants.join(",")
    //Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
Getting all node descendants
The second option uses the aggregation framework introduced in MongoDB 2.2:
    var aggrancestors = db.categoriesAAO.aggregate([
        {$match:{ancestors:"Cell_Phones_and_Accessories"}},
        {$project:{_id:1}},
        {$group:{_id:{},ancestors:{$addToSet:"$_id"}}}
    ])
    //shells older than 2.6 return a document with a result array, as used here;
    //from 2.6 onward aggregate returns a cursor, so use aggrancestors.toArray()[0].ancestors instead
    descendants = aggrancestors.result[0].ancestors
    descendants.join(",")
    //Vyacheslav,HTC,Samsung,Cables_And_Adapters,Batteries,Headsets,Apple,Nokia,Cell_Phones_and_Smartphones
Getting path to node
This operation is done with a single call to the database, which is the advantage of this approach.
    var path=[]
    var item = db.categoriesAAO.findOne({_id:"Nokia"})
    path=item.ancestors;
    path.join(' / ');
    //Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
Indexes
The recommended index is on ancestors:
    db.categoriesAAO.ensureIndex( { ancestors: 1 } )
Tree structure using Materialized Path For each node we store (ID, PathToNode)
Intro
The approach looks similar to storing an array of ancestors, but we store the path as a string instead. In the example above I intentionally use a comma (,) as the path element divider, in order to keep the regular expression simpler.
Adding new node
New node insertion is done with one select and one insert operation.
    var ancestorpath = db.categoriesMP.findOne({_id:'Electronics'}).path;
    ancestorpath += 'Electronics,'
    db.categoriesMP.insert({_id:'LG', path:ancestorpath});
    //{ "_id" : "LG", "path" : "Electronics," }
Updating/moving the node
The node can be moved using one select and one update operation.
    ancestorpath = db.categoriesMP.findOne({_id:'Cell_Phones_and_Smartphones'}).path;
    ancestorpath += 'Cell_Phones_and_Smartphones,'
    db.categoriesMP.update({_id:'LG'},{$set:{path:ancestorpath}});
    //{ "_id" : "LG", "path" : "Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones," }
Node removal
The node can be removed using a single database query.
    db.categoriesMP.remove({_id:'LG'});
Getting node children, unordered
Note: unless you introduce an order field, it is impossible to get an ordered list of node children. You should consider another approach if you need order.
    db.categoriesMP.find({path:'Electronics,'})
    //{ "_id" : "Cameras_and_Photography", "path" : "Electronics," }
    //{ "_id" : "Shop_Top_Products", "path" : "Electronics," }
    //{ "_id" : "Cell_Phones_and_Accessories", "path" : "Electronics," }
Getting all node descendants
A single select is enough. The regexp starts with ^, which allows the index to be used for matching. The match is kept case-sensitive, because a case-insensitive regexp cannot use the index efficiently.
    var descendants=[]
    var item = db.categoriesMP.findOne({_id:"Cell_Phones_and_Accessories"});
    var criteria = '^'+item.path+item._id+',';
    var children = db.categoriesMP.find({path:{ $regex: criteria }});
    while(true === children.hasNext()) {
        var child = children.next();
        descendants.push(child._id);
    }
    descendants.join(",")
    //Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
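The prefix match can be tried out in plain JavaScript on an in-memory array. This is a sketch under assumptions: escapeRegExp and descendantsPattern are hypothetical helpers, the root node is assumed to store an empty path, and the escaping step is an extra precaution for ids that contain regexp metacharacters.

```javascript
// Escape regular-expression metacharacters so an id is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Anchored, case-sensitive prefix pattern matching the path of every
// descendant of the given node.
function descendantsPattern(node) {
  return new RegExp("^" + escapeRegExp(node.path + node._id + ","));
}

// Illustrative subset of the demo tree; the root path is assumed to be "".
const nodes = [
  { _id: "Electronics", path: "" },
  { _id: "Cameras_and_Photography", path: "Electronics," },
  { _id: "Cell_Phones_and_Accessories", path: "Electronics," },
  { _id: "Cell_Phones_and_Smartphones", path: "Electronics,Cell_Phones_and_Accessories," },
  { _id: "Nokia", path: "Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones," },
];

const pattern = descendantsPattern(nodes[2]);
const found = nodes.filter((n) => pattern.test(n.path)).map((n) => n._id);
console.log(found.join(","));
// Cell_Phones_and_Smartphones,Nokia
```

The trailing comma in the pattern matters: without it, a sibling whose id merely starts with the same characters would also match.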
Getting path to node
We can obtain the path directly from the node without issuing additional selects.
    var path=[]
    var item = db.categoriesMP.findOne({_id:"Nokia"})
    print (item.path)
    //Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones,
Indexes
The recommended index is on path:
    db.categoriesMP.ensureIndex( { path: 1 } )
Tree structure using Nested Sets For each node we store (ID, left, right).
Adding new node
Please refer to the image above. Assume we want to insert the LG node after Shop_Top_Products (14,23). The new node would have a left value of 24, shifting all subsequent left values according to the traversal rules, and a right value of 25, shifting all subsequent right values, including the root's.
Adding new node
Take the next node in the traversal order: the new node takes over the following sibling's left value, and its right value is that left value incremented by one.
Now we have to make room for the new node. The update affects the right values of all ancestor nodes and also all nodes that remain to be traversed.
Only after making room can the new node be inserted.
Adding new node
    var followingsibling = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
    var newnode = {_id:'LG', left:followingsibling.left, right:followingsibling.left+1}
    //make room: shift every right and left value at or beyond the new node's left value by 2
    //(this also covers nodes that come after the following sibling's subtree)
    db.categoriesNSO.update({right:{$gte:newnode.left}},{$inc:{right:2}}, false, true)
    db.categoriesNSO.update({left:{$gte:newnode.left}},{$inc:{left:2}}, false, true)
    db.categoriesNSO.insert(newnode)
Check the result
+-Electronics (1,46)
+---Cameras_and_Photography (2,13)
+-----Digital_Cameras (3,4)
+-----Camcorders (5,6)
+-----Lenses_and_Filters (7,8)
+-----Tripods_and_supports (9,10)
+-----Lighting_and_studio (11,12)
+---Shop_Top_Products (14,23)
+-----IPad (15,16)
+-----IPhone (17,18)
+-----IPod (19,20)
+-----Blackberry (21,22)
+---LG (24,25)
+---Cell_Phones_and_Accessories (26,45)
+-----Cell_Phones_and_Smartphones (27,38)
+-------Nokia (28,29)
+-------Samsung (30,31)
+-------Apple (32,33)
+-------HTC (34,35)
+-------Vyacheslav (36,37)
+-----Headsets (39,40)
+-----Batteries (41,42)
+-----Cables_And_Adapters (43,44)
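The "make room, then insert" step can be replayed on a small in-memory tree. This is a plain JavaScript sketch, not the demo code: insertBefore is a hypothetical helper, and it applies the general rule that every left or right value at or beyond the new node's left value moves by two.

```javascript
// Insert a leaf that takes over the left value of the following sibling.
// Mutates and returns the nodes array.
function insertBefore(nodes, followingSiblingId, newId) {
  const following = nodes.find((n) => n._id === followingSiblingId);
  const newLeft = following.left; // captured before any value is shifted
  for (const node of nodes) {
    if (node.right >= newLeft) node.right += 2;
    if (node.left >= newLeft) node.left += 2;
  }
  nodes.push({ _id: newId, left: newLeft, right: newLeft + 1 });
  return nodes;
}

// Small illustrative tree; the numbers differ from the full demo dataset.
const tree = [
  { _id: "Electronics", left: 1, right: 10 },
  { _id: "Shop_Top_Products", left: 2, right: 3 },
  { _id: "Cell_Phones_and_Accessories", left: 4, right: 7 },
  { _id: "Headsets", left: 5, right: 6 },
  { _id: "Cameras_and_Photography", left: 8, right: 9 },
];

insertBefore(tree, "Cell_Phones_and_Accessories", "LG");
tree
  .sort((a, b) => a.left - b.left)
  .forEach((n) => console.log(n._id + " (" + n.left + "," + n.right + ")"));
// Electronics (1,12)
// Shop_Top_Products (2,3)
// LG (4,5)
// Cell_Phones_and_Accessories (6,9)
// Headsets (7,8)
// Cameras_and_Photography (10,11)
```

Cameras_and_Photography is placed after the following sibling on purpose, to show that nodes later in the traversal have both values shifted.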
Node removal
While rearranging node order within the same parent can be done by exchanging the nodes' left and right values, the formal way of moving a node is to first remove it from the tree and then insert it at the new location.
Note: removing a node without removing its children is out of scope for this article. For now, we assume that the node to remove has no children, i.e. right-left=1.
The steps mirror adding a node: we close the gap by decreasing the affected left/right values, and remove the original node.
Node removal
    var nodetoremove = db.categoriesNSO.findOne({_id:"LG"});
    //a leaf node has right-left=1
    if((nodetoremove.right-nodetoremove.left)>1) {
        print("Only node without childs can be removed")
        quit()
    }
    //update all remaining nodes
    db.categoriesNSO.update({right:{$gt:nodetoremove.right}},{$inc:{right:-2}}, false, true)
    db.categoriesNSO.update({left:{$gt:nodetoremove.right}},{$inc:{left:-2}}, false, true)
    db.categoriesNSO.remove({_id:"LG"});
Updating/moving the single node
A node can be moved within the same parent or to another parent. If the parent stays the same and the nodes have no children, then you only need to exchange the nodes' (left,right) pairs.
The formal way is to remove the node and insert it at the new destination, so the same restriction applies: only a node without children can be moved. If you need to move a subtree, consider creating a mirror of the existing parent at the new location and moving the nodes under the new parent one by one. Once all nodes are moved, remove the obsolete old parent.
As an example, let's move the LG node from the insertion example under the Cell_Phones_and_Smartphones node, as the last sibling (i.e. there is no following sibling node as in the insertion example).
Updating/moving the single node
Steps
1. Remove the LG node from the tree using the node removal procedure described above.
2. Take the right value of the new parent. The new node gets the parent's right value as its left value, and that value plus one as its right value. Now we have to make room for the new node: the update affects the right values of all nodes further along the traversal path.
    var newparent = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Smartphones"});
    var nodetomove = {_id:'LG', left:newparent.right, right:newparent.right+1}
    //3rd and 4th parameters: false stands for upsert=false and true stands for multi=true
    db.categoriesNSO.update({right:{$gte:newparent.right}},{$inc:{right:2}}, false, true)
    db.categoriesNSO.update({left:{$gte:newparent.right}},{$inc:{left:2}}, false, true)
    db.categoriesNSO.insert(nodetomove)
Check the result
+-Electronics (1,46)
+---Cameras_and_Photography (2,13)
+-----Digital_Cameras (3,4)
+-----Camcorders (5,6)
+-----Lenses_and_Filters (7,8)
+-----Tripods_and_supports (9,10)
+-----Lighting_and_studio (11,12)
+---Shop_Top_Products (14,23)
+-----IPad (15,16)
+-----IPhone (17,18)
+-----IPod (19,20)
+-----Blackberry (21,22)
+---Cell_Phones_and_Accessories (24,45)
+-----Cell_Phones_and_Smartphones (25,38)
+-------Nokia (26,27)
+-------Samsung (28,29)
+-------Apple (30,31)
+-------HTC (32,33)
+-------Vyacheslav (34,35)
+-------LG (36,37)
+-----Headsets (39,40)
+-----Batteries (41,42)
+-----Cables_And_Adapters (43,44)
Getting all node descendants
This is the core strength of this approach: all descendants are retrieved using one select to the DB. Moreover, by sorting by the node's left value, the dataset is ready for traversal in the correct order.
    var descendants=[]
    var item = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
    print ('('+item.left+','+item.right+')')
    var children = db.categoriesNSO.find({left:{$gt:item.left}, right:{$lt:item.right}}).sort({left:1});
    while(true === children.hasNext()) {
        var child = children.next();
        descendants.push(child._id);
    }
    descendants.join(",")
    //Cell_Phones_and_Smartphones,Nokia,Samsung,Apple,HTC,Vyacheslav,Headsets,Batteries,Cables_And_Adapters
Getting path to node
Retrieving the path to a node is also elegant and can be done using a single query to the database:
    var path=[]
    var item = db.categoriesNSO.findOne({_id:"Nokia"})
    var ancestors = db.categoriesNSO.find({left:{$lt:item.left}, right:{$gt:item.right}}).sort({left:1})
    while(true === ancestors.hasNext()) {
        var child = ancestors.next();
        path.push(child._id);
    }
    path.join('/')
    // Electronics/Cell_Phones_and_Accessories/Cell_Phones_and_Smartphones
Indexes
The recommended index is on the left and right values:
    db.categoriesNSO.ensureIndex( { left: 1, right:1 } )
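Both nested set range conditions, descendants inside the node's interval and ancestors around it, can be checked on an in-memory array. This is a plain JavaScript sketch with hypothetical helper names (descendantsOf, pathTo) and a reduced tree whose numbers differ from the demo dataset.

```javascript
// All nodes whose interval lies strictly inside the interval of the given node,
// in traversal (left) order.
function descendantsOf(nodes, id) {
  const item = nodes.find((n) => n._id === id);
  return nodes
    .filter((n) => n.left > item.left && n.right < item.right)
    .sort((a, b) => a.left - b.left)
    .map((n) => n._id);
}

// All nodes whose interval strictly contains the interval of the given node,
// ordered from the root downwards.
function pathTo(nodes, id) {
  const item = nodes.find((n) => n._id === id);
  return nodes
    .filter((n) => n.left < item.left && n.right > item.right)
    .sort((a, b) => a.left - b.left)
    .map((n) => n._id);
}

// Reduced illustrative tree, stored in no particular order.
const nestedSet = [
  { _id: "Headsets", left: 9, right: 10 },
  { _id: "Nokia", left: 4, right: 5 },
  { _id: "Electronics", left: 1, right: 12 },
  { _id: "Samsung", left: 6, right: 7 },
  { _id: "Cell_Phones_and_Smartphones", left: 3, right: 8 },
  { _id: "Cell_Phones_and_Accessories", left: 2, right: 11 },
];

console.log(descendantsOf(nestedSet, "Cell_Phones_and_Accessories").join(","));
// Cell_Phones_and_Smartphones,Nokia,Samsung,Headsets
console.log(pathTo(nestedSet, "Nokia").join("/"));
// Electronics/Cell_Phones_and_Accessories/Cell_Phones_and_Smartphones
```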
Combination of Nested Sets and classic Parent reference with order approach
For each node we store (ID, Parent, Order, left, right).
Intro
The left field can also be treated as an order field, so we could omit the order field. On the other hand, we can keep it, so that the Parent Reference with order data can be used to reconstruct the left/right values in case of accidental corruption or, for example, during the initial import.
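Reconstructing left/right values from the (Parent, Order) pairs can be sketched as a depth-first numbering. This is a plain JavaScript illustration on an in-memory array, not part of the demo code: rebuildNestedSets is a hypothetical helper, and root nodes are assumed to carry parent: null or no parent field.

```javascript
// Recompute left/right for every node from its parent and order fields.
// Mutates and returns the nodes array.
function rebuildNestedSets(nodes) {
  // Group nodes by parent id; roots are collected under the key null.
  const childrenOf = new Map();
  for (const node of nodes) {
    const key = node.parent === undefined ? null : node.parent;
    if (!childrenOf.has(key)) childrenOf.set(key, []);
    childrenOf.get(key).push(node);
  }
  for (const list of childrenOf.values()) list.sort((a, b) => a.order - b.order);

  // Depth-first walk: left is assigned on entry, right on exit.
  let counter = 0;
  function visit(node) {
    node.left = ++counter;
    for (const child of childrenOf.get(node._id) || []) visit(child);
    node.right = ++counter;
  }
  for (const root of childrenOf.get(null) || []) visit(root);
  return nodes;
}

// Illustrative data without left/right values.
const flat = [
  { _id: "Electronics", parent: null, order: 10 },
  { _id: "Shop_Top_Products", parent: "Electronics", order: 20 },
  { _id: "Cameras_and_Photography", parent: "Electronics", order: 10 },
  { _id: "LG", parent: "Electronics", order: 25 },
  { _id: "Cell_Phones_and_Accessories", parent: "Electronics", order: 30 },
  { _id: "Headsets", parent: "Cell_Phones_and_Accessories", order: 20 },
];

rebuildNestedSets(flat)
  .sort((a, b) => a.left - b.left)
  .forEach((n) => console.log(n._id + " (" + n.left + "," + n.right + ")"));
// Electronics (1,12)
// Cameras_and_Photography (2,3)
// Shop_Top_Products (4,5)
// LG (6,7)
// Cell_Phones_and_Accessories (8,11)
// Headsets (9,10)
```

The recursion depth equals the depth of the tree, which is fine for category trees; a very deep hierarchy would call for an explicit stack instead.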
Adding new node
Adding a new node can be adapted from Nested Sets in this manner:
    var followingsibling = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
    var previoussibling = db.categoriesNSO.findOne({_id:"Shop_Top_Products"});
    var neworder = parseInt((followingsibling.order + previoussibling.order)/2);
    var newnode = {_id:'LG', left:followingsibling.left, right:followingsibling.left+1, parent:followingsibling.parent, order:neworder};
    //make room: shift every right and left value at or beyond the new node's left value by 2
    db.categoriesNSO.update({right:{$gte:newnode.left}},{$inc:{right:2}}, false, true)
    db.categoriesNSO.update({left:{$gte:newnode.left}},{$inc:{left:2}}, false, true)
    db.categoriesNSO.insert(newnode)
Check the result
Before insertion
+---Cameras_and_Photography (2,13) ord.[10]
+---Shop_Top_Products (14,23) ord.[20]
+---Cell_Phones_and_Accessories (24,43) ord.[30]
After insertion
+-Electronics (1,46)
+---Cameras_and_Photography (2,13) ord.[10]
+-----Digital_Cameras (3,4) ord.[10]
+-----Camcorders (5,6) ord.[20]
+-----Lenses_and_Filters (7,8) ord.[30]
+-----Tripods_and_supports (9,10) ord.[40]
+-----Lighting_and_studio (11,12) ord.[50]
+---Shop_Top_Products (14,23) ord.[20]
+-----IPad (15,16) ord.[10]
+-----IPhone (17,18) ord.[20]
+-----IPod (19,20) ord.[30]
+-----Blackberry (21,22) ord.[40]
+---LG (24,25) ord.[25]
+---Cell_Phones_and_Accessories (26,45) ord.[30]
+-----Cell_Phones_and_Smartphones (27,38) ord.[10]
+-------Nokia (28,29) ord.[10]
+-------Samsung (30,31) ord.[20]
+-------Apple (32,33) ord.[30]
+-------HTC (34,35) ord.[40]
+-------Vyacheslav (36,37) ord.[50]
+-----Headsets (39,40) ord.[20]
+-----Batteries (41,42) ord.[30]
+-----Cables_And_Adapters (43,44) ord.[40]
Updating/moving the single node Identical to insertion approach
Node removal Approach from Nested Sets is used.
Getting node children, ordered
Now possible by using the (Parent, Order) pair.
    db.categoriesNSO.find({parent:"Electronics"}).sort({order:1});
    /*
    { "_id" : "Cameras_and_Photography", "parent" : "Electronics", "order" : 10, "left" : 2, "right" : 13 }
    { "_id" : "Shop_Top_Products", "parent" : "Electronics", "order" : 20, "left" : 14, "right" : 23 }
    { "_id" : "LG", "left" : 24, "right" : 25, "parent" : "Electronics", "order" : 25 }
    { "_id" : "Cell_Phones_and_Accessories", "parent" : "Electronics", "order" : 30, "left" : 26, "right" : 45 }
    */
Getting all node descendants Approach from Nested Sets is used.
Getting path to node Approach from nested sets is used
Code in action https://github.com/Voronenko/ https://github.com/Voronenko/Storing_TreeView_Structures_WithMongoDB
Notes on using code
All files are packaged according to the following naming convention:
● MODELReference.js - initialization file with tree data for the MODEL approach
● MODELReference_operating.js - add/update/move/remove/get children examples
● MODELReference_pathtonode.js - code illustrating how to obtain the path to a node
● MODELReference_nodedescendants.js - code illustrating how to retrieve all the descendants of a node
All files are ready to use in the mongo shell. You can run the examples by invoking mongo < file_to_execute, or, if you want, interactively in the shell or with the RockMongo web shell.
Thanks! Vyacheslav Voronenko
Check the result
+-Electronics (1,44)
+---Cameras_and_Photography (2,13)
+-----Digital_Cameras (3,4)
+-----Camcorders (5,6)
+-----Lenses_and_Filters (7,8)
+-----Tripods_and_supports (9,10)
+-----Lighting_and_studio (11,12)
+---Shop_Top_Products (14,23)
+-----IPad (15,16)
+-----IPhone (17,18)
+-----IPod (19,20)
+-----Blackberry (21,22)
+---Cell_Phones_and_Accessories (24,43)
+-----Cell_Phones_and_Smartphones (25,36)
+-------Nokia (26,27)
+-------Samsung (28,29)
+-------Apple (30,31)
+-------HTC (32,33)
+-------Vyacheslav (34,35)
+-----Headsets (37,38)
+-----Batteries (39,40)
+-----Cables_And_Adapters (41,42)

Storing tree structures with MongoDB

  • 1.
  • 2.
    Introduction In a reallife almost any project deals with the tree structures. Different kinds of taxonomies, site structures etc require modeling of hierarchy relations. Typical approaches used ● Model Tree Structures with Child References ● Model Tree Structures with Parent References ● Model Tree Structures with an Array of Ancestors ● Model Tree Structures with Materialized Paths ● Model Tree Structures with Nested Sets
  • 3.
  • 4.
    Challenges to address Ina typical site scenario, we should be able to ● Operate with tree (insert new node under specific parent, update/remove existing node, move node across the tree) ● Get path to node (for example, in order to be build the breadcrumb section) ● Get all node descendants (in order to be able, for example, to select goods from more general category, like 'Cell Phones and Accessories' which should include goods from all subcategories.
  • 5.
    Scope of thedemo On each of the examples below we: ● Add new node called 'LG' under electronics ● Move 'LG' node under Cell_Phones_And_Smartphones node ● Remove 'LG' node from the tree ● Get child nodes of Electronics node ● Get path to 'Nokia' node ● Get all descendants of the 'Cell_Phones_and_Accessories' node
  • 6.
  • 7.
    Tree structure with parentreference This is most commonly used approach. For each node we store (ID, ParentReference, Order)
  • 8.
    Operating with tree Prettysimple, but changing the position of the node within siblings will require additional calculations. You might want to set high numbers like item position * 10^6 for sorting in order to be able to set new node order as trunc (lower sibling order - higher sibling order)/2 - this will give you enough operations, until you will need to traverse whole the tree and set the order defaults to big numbers again
  • 9.
    Adding new node Goodpoints: requires only one insert operation to introduce the node. var existingelemscount = db.categoriesPCO.find ({parent:'Electronics'}).count(); var neworder = (existingelemscount+1)*10; db.categoriesPCO.insert({_id:'LG', parent:'Electronics', someadditionalattr:'test', order:neworder}) //{ "_id" : "LG", "parent" : "Electronics", // "someadditionalattr" : "test", "order" : 40 }
  • 10.
    Updating / movingthe node Good points: as during insert - requires only one update operation to amend the node existingelemscount = db.categoriesPCO.find ({parent:'Cell_Phones_and_Smartphones'}).count(); neworder = (existingelemscount+1)*10; db.categoriesPCO.update({_id:'LG'},{$set: {parent:'Cell_Phones_and_Smartphones', order:neworder}}); //{ "_id" : "LG", "order" : 60, "parent" : // "Cell_Phones_and_Smartphones", "someadditionalattr" : "test" }
  • 11.
    Node removal Good points:requires single operation to remove the node from tree db.categoriesPCO.remove({_id:'LG'});
  • 12.
    Getting node children,ordered Good points: all childs can be retrieved from database and ordered using single call. db.categoriesPCO.find({$query:{parent:'Electronics'}, $orderby:{order:1}}) //{ "_id" : "Cameras_and_Photography", "parent" : "Electronics", "order" : 10 } //{ "_id" : "Shop_Top_Products", "parent" : "Electronics", "order" : 20 } //{ "_id" : "Cell_Phones_and_Accessories", "parent" : "Electronics", "order" : 30 }
  • 13.
    Getting all nodedescendants Bad points: unfortunately, requires recursive calls to database. var descendants=[] var stack=[]; var item = db.categoriesPCO.findOne({_id:"Cell_Phones_and_Accessories"}); stack.push(item); while (stack.length>0){ var currentnode = stack.pop(); var children = db.categoriesPCO.find({parent:currentnode._id}); while(true === children.hasNext()) { var child = children.next(); descendants.push(child._id); stack.push(child); } } descendants.join(",") //Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia, Samsung,Apple,HTC,Vyacheslav
  • 14.
    Getting path tonode Bad points: unfortunately also require recursive operations to get the path. var path=[] var item = db.categoriesPCO.findOne({_id:"Nokia"}) while (item.parent !== null) { item=db.categoriesPCO.findOne({_id:item.parent}); path.push(item._id); } path.reverse().join(' / '); //Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
  • 15.
    Indexes Recommended index ison fields parent and order db.categoriesPCO.ensureIndex( { parent: 1, order:1 } )
  • 16.
    Tree structure withchilds reference For each node we store (ID, ChildReferences).
  • 17.
    Note Please note, thatin this case we do not need order field, because Childs collection already provides this information. Most of languages respect the array order. If this is not in case for your language, you might consider additional coding to preserve order, however this will make things more complicated
  • 18.
    Adding new node Note:requires one insert operation and one update operation to insert the node. db.categoriesCRO.insert({_id:'LG', childs:[]}); db.categoriesCRO.update({_id:'Electronics'},{ $addToSet: {childs:'LG'}}); //{ "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "Shop_Top_Products", "Cell_Phones_and_Accessories", "LG" ] }
  • 19.
    Updating/moving the node Requiressingle update operation to change node order within same parent, requires two update operations, if node is moved under another parent. Rearranging order under the same parent db.categoriesCRO.update({_id:'Electronics'},{$set:{"childs.1":'LG'," childs.3":'Shop_Top_Products'}}); //{ "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "LG", "Cell_Phones_and_Accessories", "Shop_Top_Products" ] } Moving the node db.categoriesCRO.update({_id:'Cell_Phones_and_Smartphones'},{ $addToSet: {childs:'LG'}}); db.categoriesCRO.update({_id:'Electronics'},{$pull:{childs:'LG'}}); //{ "_id" : "Cell_Phones_and_Smartphones", "childs" : [ "Nokia", "Samsung", "Apple", "HTC", "Vyacheslav", "LG" ] }
  • 20.
    Node removal Node removalalso requires two operations: one update and one remove. db.categoriesCRO.update ({_id:'Cell_Phones_and_Smartphones'},{$pull: {childs:'LG'}}) db.categoriesCRO.remove({_id:'LG'});
  • 21.
    Getting node children,ordered Bad points: requires additional client side sorting by parent array sequence. Depending on result set, it may affect speed of your code. var parent = db.categoriesCRO.findOne({_id:'Electronics'}) db.categoriesCRO.find({_id:{$in:parent.childs}})
  • 22.
    Getting node children,ordered Result set { "_id" : "Cameras_and_Photography", "childs" : [ "Digital_Cameras", "Camcorders", "Lenses_and_Filters", "Tripods_and_supports", "Lighting_and_studio" ] } { "_id" : "Cell_Phones_and_Accessories", "childs" : [ "Cell_Phones_and_Smartphones", "Headsets", "Batteries", "Cables_And_Adapters" ] } { "_id" : "Shop_Top_Products", "childs" : [ "IPad", "IPhone", "IPod", "Blackberry" ] } //parent: { "_id" : "Electronics", "childs" : [ "Cameras_and_Photography", "Cell_Phones_and_Accessories", "Shop_Top_Products" ] } As you see, we have ordered array childs, which can be used to sort the result set on a client
  • 23.
    Getting all nodedescendants Note: also recursive operations, but we need less selects to databases comparing to previous approach var descendants=[] var stack=[]; var item = db.categoriesCRO.findOne({_id:"Cell_Phones_and_Accessories"}); stack.push(item); while (stack.length>0){ var currentnode = stack.pop(); var children = db.categoriesCRO.find({_id:{$in:currentnode.childs}}); while(true === children.hasNext()) { var child = children.next(); descendants.push(child._id); if(child.childs.length>0){ stack.push(child); } } } //Batteries,Cables_And_Adapters,Cell_Phones_and_Smartphones,Headsets,Apple,HTC,Nokia, Samsung descendants.join(",")
  • 24.
    Getting path tonode Path is calculated recursively, so we need to issue number of sequential calls to database. var path=[] var item = db.categoriesCRO.findOne({_id:"Nokia"}) while ((item=db.categoriesCRO.findOne({childs:item._id}))) { path.push(item._id); } path.reverse().join(' / '); //Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
  • 25.
    Indexes Recommended action isputting index on childs: db.categoriesCRO.ensureIndex( { childs: 1 } )
  • 26.
    Tree structure usingan Array of Ancestors For each node we store (ID, ParentReference, AncestorReferences)
  • 27.
    Adding new node
    You need one insert operation to introduce the new node, but you first need a select to prepare the data for the insert.
        var ancestorpath = db.categoriesAAO.findOne({_id:'Electronics'}).ancestors;
        ancestorpath.push('Electronics')
        db.categoriesAAO.insert({_id:'LG', parent:'Electronics', ancestors:ancestorpath});
        //{ "_id" : "LG", "parent" : "Electronics", "ancestors" : [ "Electronics" ] }
  • 28.
    Updating/moving the node
    Moving a node without children requires one select and one update operation.
        ancestorpath = db.categoriesAAO.findOne({_id:'Cell_Phones_and_Smartphones'}).ancestors;
        ancestorpath.push('Cell_Phones_and_Smartphones')
        db.categoriesAAO.update({_id:'LG'},{$set:{parent:'Cell_Phones_and_Smartphones', ancestors:ancestorpath}});
        //{ "_id" : "LG", "ancestors" : [ "Electronics", "Cell_Phones_and_Accessories", "Cell_Phones_and_Smartphones" ], "parent" : "Cell_Phones_and_Smartphones" }
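    The update above only touches the moved node. If the node had children of its own, their ancestors arrays would also need rewriting. The following is a minimal in-memory sketch of that idea in plain JavaScript; moveSubtree and the LG_Optimus node are hypothetical and used for illustration only.

```javascript
// Hypothetical helper: moves a node and fixes the `ancestors`
// arrays of everything below it. Works on plain objects in memory.
function moveSubtree(nodes, nodeId, newParentId) {
  const byId = new Map(nodes.map(n => [n._id, n]));
  const node = byId.get(nodeId);
  const newParent = byId.get(newParentId);
  // Number of ancestors the moved node had before the move.
  const oldPrefixLength = node.ancestors.length;
  // New ancestor chain of the moved node itself.
  const newAncestors = newParent.ancestors.concat(newParent._id);
  for (const n of nodes) {
    if (n.ancestors.includes(nodeId)) {
      // Descendant: swap the old chain above the moved node for the
      // new one, keeping everything from the moved node downwards.
      n.ancestors = newAncestors.concat(n.ancestors.slice(oldPrefixLength));
    }
  }
  node.parent = newParentId;
  node.ancestors = newAncestors;
}

const nodes = [
  { _id: "Electronics", parent: null, ancestors: [] },
  { _id: "Cell_Phones_and_Accessories", parent: "Electronics", ancestors: ["Electronics"] },
  { _id: "Cell_Phones_and_Smartphones", parent: "Cell_Phones_and_Accessories",
    ancestors: ["Electronics", "Cell_Phones_and_Accessories"] },
  { _id: "LG", parent: "Electronics", ancestors: ["Electronics"] },
  // Hypothetical child of LG, added only to show the descendant fix-up.
  { _id: "LG_Optimus", parent: "LG", ancestors: ["Electronics", "LG"] }
];

moveSubtree(nodes, "LG", "Cell_Phones_and_Smartphones");

const movedNode = nodes.find(n => n._id === "LG");
const movedChild = nodes.find(n => n._id === "LG_Optimus");
console.log(movedNode.ancestors.join(" / "));
console.log(movedChild.ancestors.join(" / "));
```

    Against a real collection the same fix-up would mean one extra update per descendant, or a multi-update, which is the price of moving a subtree in this model.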
  • 29.
    Node removal
    Node removal is done with a single operation.
        db.categoriesAAO.remove({_id:'LG'});
  • 30.
    Getting node children, unordered
    Note: unless you introduce an order field, it is impossible to get an ordered list of node children. Consider another approach if you need ordering.
        db.categoriesAAO.find({parent:'Electronics'})
  • 31.
    Getting all node descendants
    There are two options to get all node descendants. The first is a plain query on the ancestors field:
        var descendants=[]
        var ancestors = db.categoriesAAO.find({ancestors:"Cell_Phones_and_Accessories"},{_id:1});
        while(true === ancestors.hasNext()) {
            var elem = ancestors.next();
            descendants.push(elem._id);
        }
        descendants.join(",")
        //Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
  • 32.
    Getting all node descendants
    The second option uses the aggregation framework introduced in MongoDB 2.2. The result field is how the shell of that era exposes the output; newer shells return a cursor from aggregate() instead.
        var aggrancestors = db.categoriesAAO.aggregate([
            {$match:{ancestors:"Cell_Phones_and_Accessories"}},
            {$project:{_id:1}},
            {$group:{_id:{},ancestors:{$addToSet:"$_id"}}}
        ])
        descendants = aggrancestors.result[0].ancestors
        descendants.join(",")
        //Vyacheslav,HTC,Samsung,Cables_And_Adapters,Batteries,Headsets,Apple,Nokia,Cell_Phones_and_Smartphones
  • 33.
    Getting path to node
    This operation is done with a single call to the database, which is the main advantage of this approach.
        var path=[]
        var item = db.categoriesAAO.findOne({_id:"Nokia"})
        path=item.ancestors;
        path.join(' / ');
        //Electronics / Cell_Phones_and_Accessories / Cell_Phones_and_Smartphones
  • 34.
    Indexes
    The recommended index is on ancestors:
        db.categoriesAAO.ensureIndex( { ancestors: 1 } )
  • 35.
    Tree structure using Materialized Path
    For each node we store (ID, PathToNode)
  • 36.
    Intro
    The approach looks similar to storing an array of ancestors, but here we store the path as a string instead. In the example above I intentionally use a comma (,) as the path element divider, in order to keep the regular expressions simpler.
  • 37.
    Adding new node
    New node insertion is done with one select and one insert operation.
        var ancestorpath = db.categoriesMP.findOne({_id:'Electronics'}).path;
        ancestorpath += 'Electronics,'
        db.categoriesMP.insert({_id:'LG', path:ancestorpath});
        //{ "_id" : "LG", "path" : "Electronics," }
  • 38.
    Updating/moving the node
    A node can be moved using one select and one update operation.
        ancestorpath = db.categoriesMP.findOne({_id:'Cell_Phones_and_Smartphones'}).path;
        ancestorpath += 'Cell_Phones_and_Smartphones,'
        db.categoriesMP.update({_id:'LG'},{$set:{path:ancestorpath}});
        //{ "_id" : "LG", "path" : "Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones," }
  • 39.
    Node removal
    A node can be removed using a single database query.
        db.categoriesMP.remove({_id:'LG'});
  • 40.
    Getting node children, unordered
    Note: unless you introduce an order field, it is impossible to get an ordered list of node children. Consider another approach if you need ordering.
        db.categoriesMP.find({path:'Electronics,'})
        //{ "_id" : "Cameras_and_Photography", "path" : "Electronics," }
        //{ "_id" : "Shop_Top_Products", "path" : "Electronics," }
        //{ "_id" : "Cell_Phones_and_Accessories", "path" : "Electronics," }
  • 41.
    Getting all node descendants
    Single select. The regexp is anchored with ^ and kept case-sensitive, which allows the index on path to be used for matching (a case-insensitive regexp generally cannot use the index effectively).
        var descendants=[]
        var item = db.categoriesMP.findOne({_id:"Cell_Phones_and_Accessories"});
        var criteria = '^'+item.path+item._id+',';
        var children = db.categoriesMP.find({path: { $regex: criteria }});
        while(true === children.hasNext()) {
            var child = children.next();
            descendants.push(child._id);
        }
        descendants.join(",")
        //Cell_Phones_and_Smartphones,Headsets,Batteries,Cables_And_Adapters,Nokia,Samsung,Apple,HTC,Vyacheslav
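    The demo ids contain only letters and underscores, so they are safe to paste into a regexp. If ids could contain characters such as a dot or parentheses, the prefix would need escaping first. Below is a small sketch in plain JavaScript over in-memory documents; escapeRegex, descendantsPattern and the Cell_Phones_and_Accessories_Old node are hypothetical names used for illustration.

```javascript
// Hypothetical helper: escapes characters that have a special
// meaning inside a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds the anchored, case-sensitive prefix pattern for all
// descendants of `node` (path + own id + trailing comma).
function descendantsPattern(node) {
  return new RegExp("^" + escapeRegex(node.path + node._id + ","));
}

const docs = [
  { _id: "Cell_Phones_and_Accessories", path: "Electronics," },
  { _id: "Cell_Phones_and_Smartphones", path: "Electronics,Cell_Phones_and_Accessories," },
  { _id: "Nokia", path: "Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones," },
  { _id: "Cameras_and_Photography", path: "Electronics," },
  // Hypothetical sibling whose id shares a prefix with the queried node.
  { _id: "Cell_Phones_and_Accessories_Old", path: "Electronics," },
  { _id: "Archive", path: "Electronics,Cell_Phones_and_Accessories_Old," }
];

const pattern = descendantsPattern(docs[0]);
const matched = docs.filter(d => pattern.test(d.path)).map(d => d._id);
console.log(matched.join(","));
```

    The trailing comma in the pattern is what keeps Archive out of the result: without it, the prefix Electronics,Cell_Phones_and_Accessories would also match the path under the similarly named sibling.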
  • 42.
    Getting path to node
    We can obtain the path directly from the node without issuing additional selects.
        var item = db.categoriesMP.findOne({_id:"Nokia"})
        print (item.path)
        //Electronics,Cell_Phones_and_Accessories,Cell_Phones_and_Smartphones,
  • 43.
    Indexes
    The recommended index is on path:
        db.categoriesMP.ensureIndex( { path: 1 } )
  • 44.
    Tree structure using Nested Sets
    For each node we store (ID, left, right).
  • 46.
    Adding new node
    Please refer to the image above. Assume we want to insert the LG node after Shop_Top_Products (14,23). The new node gets a left value of 24, which shifts all subsequent left values according to the traversal rules, and a right value of 25, which shifts all subsequent right values, including the root's.
  • 47.
    Adding new node
    1. Take the next node in the traversal tree (the following sibling).
    2. The new node takes the following sibling's left value as its own left value, and that value incremented by one as its right value.
    3. Create room for the new node. The update affects the right values of all ancestor nodes and also all nodes that remain in the traversal.
    4. Only after room has been created can the new node be inserted.
  • 48.
    Adding new node
        var followingsibling = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
        var newnode = {_id:'LG', left:followingsibling.left, right:followingsibling.left+1}
        // shift right values of ancestors and of everything after the following sibling
        db.categoriesNSO.update({right:{$gt:followingsibling.right}},{$inc:{right:2}}, false, true)
        // shift the following sibling together with its subtree
        db.categoriesNSO.update({left:{$gte:followingsibling.left}, right:{$lte:followingsibling.right}},{$inc:{left:2, right:2}}, false, true)
        db.categoriesNSO.insert(newnode)
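    The two updates above fit the demo tree, where the following sibling is the last child of its parent. A more general variant shifts every right value greater than the insertion slot and every left value at or after it. Here is a minimal in-memory sketch of that variant in plain JavaScript; insertBefore and the TV_and_Video node are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: inserts a leaf node directly before
// `followingSiblingId` in a nested-set tree held in memory.
function insertBefore(nodes, newId, followingSiblingId) {
  const sibling = nodes.find(n => n._id === followingSiblingId);
  // The new node takes over the sibling's current left value.
  const slot = sibling.left;
  for (const n of nodes) {
    // Ancestors and everything at or after the slot move by two.
    if (n.right > slot) n.right += 2;
    if (n.left >= slot) n.left += 2;
  }
  nodes.push({ _id: newId, left: slot, right: slot + 1 });
}

const tree = [
  { _id: "Electronics", left: 1, right: 10 },
  { _id: "Cameras_and_Photography", left: 2, right: 3 },
  { _id: "Cell_Phones_and_Accessories", left: 4, right: 7 },
  { _id: "Headsets", left: 5, right: 6 },
  // Hypothetical extra sibling that comes after the following sibling.
  { _id: "TV_and_Video", left: 8, right: 9 }
];

insertBefore(tree, "LG", "Cell_Phones_and_Accessories");

const insertSummary = tree
  .slice()
  .sort((a, b) => a.left - b.left)
  .map(n => n._id + "(" + n.left + "," + n.right + ")")
  .join(" ");
console.log(insertSummary);
```

    In the mongo shell the same two conditions would translate to one multi-update on right with $gt and one on left with $gte, both compared against the following sibling's original left value.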
  • 49.
    Check the result
    +-Electronics (1,46)
    +---Cameras_and_Photography (2,13)
    +-----Digital_Cameras (3,4)
    +-----Camcorders (5,6)
    +-----Lenses_and_Filters (7,8)
    +-----Tripods_and_supports (9,10)
    +-----Lighting_and_studio (11,12)
    +---Shop_Top_Products (14,23)
    +-----IPad (15,16)
    +-----IPhone (17,18)
    +-----IPod (19,20)
    +-----Blackberry (21,22)
    +---LG (24,25)
    +---Cell_Phones_and_Accessories (26,45)
    +-----Cell_Phones_and_Smartphones (27,38)
    +-------Nokia (28,29)
    +-------Samsung (30,31)
    +-------Apple (32,33)
    +-------HTC (34,35)
    +-------Vyacheslav (36,37)
    +-----Headsets (39,40)
    +-----Batteries (41,42)
    +-----Cables_And_Adapters (43,44)
  • 50.
    Node removal
    While rearranging node order within the same parent can potentially be done by exchanging the nodes' left and right values, the formal way of moving a node is to first remove it from the tree and then insert it at the new location.
    Note: removing a node without removing its children is out of scope for this article. For now, we assume that the node to remove has no children, i.e. right-left=1.
    The steps mirror those for adding a node: we close the gap by decreasing the affected left/right values and remove the original node.
  • 51.
    Node removal
        var nodetoremove = db.categoriesNSO.findOne({_id:"LG"});
        // a leaf node always has right = left + 1
        if((nodetoremove.right-nodetoremove.left)>1) {
            print("Only node without childs can be removed")
            quit()
        }
        //update all remaining nodes
        db.categoriesNSO.update({right:{$gt:nodetoremove.right}},{$inc:{right:-2}}, false, true)
        db.categoriesNSO.update({left:{$gt:nodetoremove.right}},{$inc:{left:-2}}, false, true)
        db.categoriesNSO.remove({_id:"LG"});
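    The shell code above is limited to leaf nodes. Removing a node together with its whole subtree follows the same pattern, except that the gap to close is right-left+1 wide instead of 2. The sketch below shows this in plain JavaScript on in-memory documents; removeSubtree and the TV_and_Video node are hypothetical, and the demo files themselves do not cover this case.

```javascript
// Hypothetical helper: removes a node and all of its descendants
// from an in-memory nested-set tree and closes the resulting gap.
function removeSubtree(nodes, nodeId) {
  const node = nodes.find(n => n._id === nodeId);
  const left = node.left;
  const right = node.right;
  // Number of left/right values occupied by the removed subtree.
  const width = right - left + 1;
  // Keep only nodes that lie outside the removed interval.
  const remaining = nodes.filter(n => n.left < left || n.left > right);
  for (const n of remaining) {
    if (n.right > right) n.right -= width;
    if (n.left > right) n.left -= width;
  }
  return remaining;
}

const before = [
  { _id: "Electronics", left: 1, right: 12 },
  { _id: "Cameras_and_Photography", left: 2, right: 3 },
  { _id: "LG", left: 4, right: 5 },
  { _id: "Cell_Phones_and_Accessories", left: 6, right: 9 },
  { _id: "Headsets", left: 7, right: 8 },
  // Hypothetical node placed after the removed subtree.
  { _id: "TV_and_Video", left: 10, right: 11 }
];

const after = removeSubtree(before, "Cell_Phones_and_Accessories");

const removalSummary = after
  .map(n => n._id + "(" + n.left + "," + n.right + ")")
  .join(" ");
console.log(removalSummary);
```

    For a leaf the width is 2, so the sketch reduces to the same adjustments as the shell version.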
  • 52.
    Updating/moving the single node
    A node can be moved within the same parent or to another parent. Within the same parent, if the nodes have no children, you only need to exchange the nodes' (left,right) pairs.
    The formal way is to remove the node and insert it at the new destination, so the same restriction applies: only a node without children can be moved. If you need to move a subtree, consider creating a mirror of the existing parent under the new location and moving the nodes under the new parent one by one. Once all nodes are moved, remove the obsolete old parent.
    As an example, let's move the LG node from the insertion example under the Cell_Phones_and_Smartphones node, as the last sibling (i.e. there is no following sibling node as in the insertion example).
  • 53.
    Updating/moving the single node
    Steps
    1. Remove the LG node from the tree using the node removal procedure described above.
    2. Take the right value of the new parent. The new node gets the parent's right value as its left value, and the parent's right value incremented by one as its right value. Now we have to create room for the new node: the update affects the right values of all nodes further along the traversal path.
        var newparent = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Smartphones"});
        var nodetomove = {_id:'LG', left:newparent.right, right:newparent.right+1}
        //3rd and 4th parameters: false stands for upsert=false and true stands for multi=true
        db.categoriesNSO.update({right:{$gte:newparent.right}},{$inc:{right:2}}, false, true)
        db.categoriesNSO.update({left:{$gte:newparent.right}},{$inc:{left:2}}, false, true)
        db.categoriesNSO.insert(nodetomove)
  • 54.
    Check the result
    +-Electronics (1,46)
    +---Cameras_and_Photography (2,13)
    +-----Digital_Cameras (3,4)
    +-----Camcorders (5,6)
    +-----Lenses_and_Filters (7,8)
    +-----Tripods_and_supports (9,10)
    +-----Lighting_and_studio (11,12)
    +---Shop_Top_Products (14,23)
    +-----IPad (15,16)
    +-----IPhone (17,18)
    +-----IPod (19,20)
    +-----Blackberry (21,22)
    +---Cell_Phones_and_Accessories (24,45)
    +-----Cell_Phones_and_Smartphones (25,38)
    +-------Nokia (26,27)
    +-------Samsung (28,29)
    +-------Apple (30,31)
    +-------HTC (32,33)
    +-------Vyacheslav (34,35)
    +-------LG (36,37)
    +-----Headsets (39,40)
    +-----Batteries (41,42)
    +-----Cables_And_Adapters (43,44)
  • 55.
    Getting all node descendants
    This is the core strength of this approach: all descendants are retrieved using one select to the database. Moreover, by sorting by node left, the dataset is ready for traversal in the correct order.
        var descendants=[]
        var item = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
        print ('('+item.left+','+item.right+')')
        var children = db.categoriesNSO.find({left:{$gt:item.left}, right:{$lt:item.right}}).sort({left:1});
        while(true === children.hasNext()) {
            var child = children.next();
            descendants.push(child._id);
        }
        descendants.join(",")
        //Cell_Phones_and_Smartphones,Nokia,Samsung,Apple,HTC,Vyacheslav,Headsets,Batteries,Cables_And_Adapters
  • 56.
    Getting path to node
    Retrieving the path to a node is also elegant and can be done using a single query to the database:
        var path=[]
        var item = db.categoriesNSO.findOne({_id:"Nokia"})
        var ancestors = db.categoriesNSO.find({left:{$lt:item.left}, right:{$gt:item.right}}).sort({left:1})
        while(true === ancestors.hasNext()) {
            var child = ancestors.next();
            path.push(child._id);
        }
        path.join('/')
        // Electronics/Cell_Phones_and_Accessories/Cell_Phones_and_Smartphones
  • 57.
    Indexes
    The recommended index is on the left and right values:
        db.categoriesNSO.ensureIndex( { left: 1, right:1 } )
  • 58.
    Combination of Nested Sets and classic Parent reference with order approach
    For each node we store (ID, Parent, Order, left, right).
  • 59.
    Intro
    The left field can also be treated as an order field, so we could omit the order field. On the other hand, we can keep it and use the Parent Reference with order data to reconstruct the left/right values in case of accidental corruption or, for example, during initial import.
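    Reconstructing left/right values from (Parent, Order) data amounts to a depth-first walk that hands out consecutive numbers on the way down and on the way back up. The following is a minimal in-memory sketch in plain JavaScript; rebuildNestedSets is a hypothetical helper name, and the sketch assumes a single root and no cycles in the parent references.

```javascript
// Hypothetical helper: recomputes left/right for every node from
// its `parent` and `order` fields. Assumes one root and no cycles.
function rebuildNestedSets(nodes, rootId) {
  // Group children by parent id.
  const childrenOf = new Map();
  for (const n of nodes) {
    if (n.parent === null) continue;
    if (!childrenOf.has(n.parent)) childrenOf.set(n.parent, []);
    childrenOf.get(n.parent).push(n);
  }
  // Siblings are visited in ascending `order`.
  for (const list of childrenOf.values()) {
    list.sort((a, b) => a.order - b.order);
  }
  let counter = 1;
  function visit(node) {
    node.left = counter++;          // number assigned on the way down
    for (const child of childrenOf.get(node._id) || []) {
      visit(child);
    }
    node.right = counter++;         // number assigned on the way back up
  }
  visit(nodes.find(n => n._id === rootId));
}

// Deliberately listed out of order and without left/right values.
const categories = [
  { _id: "Headsets", parent: "Cell_Phones_and_Accessories", order: 20 },
  { _id: "Cell_Phones_and_Accessories", parent: "Electronics", order: 30 },
  { _id: "LG", parent: "Electronics", order: 25 },
  { _id: "Electronics", parent: null, order: 10 },
  { _id: "Shop_Top_Products", parent: "Electronics", order: 20 },
  { _id: "Cell_Phones_and_Smartphones", parent: "Cell_Phones_and_Accessories", order: 10 },
  { _id: "Cameras_and_Photography", parent: "Electronics", order: 10 }
];

rebuildNestedSets(categories, "Electronics");

const rebuilt = categories
  .slice()
  .sort((a, b) => a.left - b.left)
  .map(n => n._id + "(" + n.left + "," + n.right + ")")
  .join(" ");
console.log(rebuilt);
```

    Writing the recomputed values back would take one update per node, so this is better suited to repair or import jobs than to regular request handling.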
  • 60.
    Adding new node
    Adding a new node can be adapted from Nested Sets in this manner:
        var followingsibling = db.categoriesNSO.findOne({_id:"Cell_Phones_and_Accessories"});
        var previoussibling = db.categoriesNSO.findOne({_id:"Shop_Top_Products"});
        // place the new node halfway between its neighbours
        var neworder = parseInt((followingsibling.order + previoussibling.order)/2);
        var newnode = {_id:'LG', left:followingsibling.left, right:followingsibling.left+1, parent:followingsibling.parent, order:neworder};
        db.categoriesNSO.update({right:{$gt:followingsibling.right}},{$inc:{right:2}}, false, true)
        db.categoriesNSO.update({left:{$gte:followingsibling.left}, right:{$lte:followingsibling.right}},{$inc:{left:2, right:2}}, false, true)
        db.categoriesNSO.insert(newnode)
  • 61.
    Check the result
    Before insertion
    +---Cameras_and_Photography (2,13) ord.[10]
    +---Shop_Top_Products (14,23) ord.[20]
    +---Cell_Phones_and_Accessories (26,45) ord.[30]
    After insertion
    +-Electronics (1,46)
    +---Cameras_and_Photography (2,13) ord.[10]
    +-----Digital_Cameras (3,4) ord.[10]
    +-----Camcorders (5,6) ord.[20]
    +-----Lenses_and_Filters (7,8) ord.[30]
    +-----Tripods_and_supports (9,10) ord.[40]
    +-----Lighting_and_studio (11,12) ord.[50]
    +---Shop_Top_Products (14,23) ord.[20]
    +-----IPad (15,16) ord.[10]
    +-----IPhone (17,18) ord.[20]
    +-----IPod (19,20) ord.[30]
    +-----Blackberry (21,22) ord.[40]
    +---LG (24,25) ord.[25]
    +---Cell_Phones_and_Accessories (26,45) ord.[30]
    +-----Cell_Phones_and_Smartphones (27,38) ord.[10]
    +-------Nokia (28,29) ord.[10]
    +-------Samsung (30,31) ord.[20]
    +-------Apple (32,33) ord.[30]
    +-------HTC (34,35) ord.[40]
    +-------Vyacheslav (36,37) ord.[50]
    +-----Headsets (39,40) ord.[20]
    +-----Batteries (41,42) ord.[30]
    +-----Cables_And_Adapters (43,44) ord.[40]
  • 62.
    Updating/moving the single node
    Identical to the insertion approach.
  • 63.
    Node removal
    The approach from Nested Sets is used.
  • 64.
    Getting node children, ordered
    This is now possible by using the (Parent, Order) pair.
        db.categoriesNSO.find({parent:"Electronics"}).sort({order:1});
        /*
        { "_id" : "Cameras_and_Photography", "parent" : "Electronics", "order" : 10, "left" : 2, "right" : 13 }
        { "_id" : "Shop_Top_Products", "parent" : "Electronics", "order" : 20, "left" : 14, "right" : 23 }
        { "_id" : "LG", "left" : 24, "right" : 25, "parent" : "Electronics", "order" : 25 }
        { "_id" : "Cell_Phones_and_Accessories", "parent" : "Electronics", "order" : 30, "left" : 26, "right" : 45 }
        */
  • 65.
    Getting all node descendants
    The approach from Nested Sets is used.
  • 66.
    Getting path to node
    The approach from Nested Sets is used.
  • 67.
  • 68.
    Notes on using code
    All files are packaged according to the following naming convention:
    ● MODELReference.js - initialization file with tree data for the MODEL approach
    ● MODELReference_operating.js - add/update/move/remove/get children examples
    ● MODELReference_pathtonode.js - code illustrating how to obtain the path to a node
    ● MODELReference_nodedescendants.js - code illustrating how to retrieve all the descendants of a node
    All files are ready to use in the mongo shell. You can run the examples by invoking mongo < file_to_execute or, if you prefer, interactively in the shell or with the RockMongo web shell.
  • 69.
    Thanks! Vyacheslav Voronenko
  • 70.
    Check the result
    +-Electronics (1,44)
    +---Cameras_and_Photography (2,13)
    +-----Digital_Cameras (3,4)
    +-----Camcorders (5,6)
    +-----Lenses_and_Filters (7,8)
    +-----Tripods_and_supports (9,10)
    +-----Lighting_and_studio (11,12)
    +---Shop_Top_Products (14,23)
    +-----IPad (15,16)
    +-----IPhone (17,18)
    +-----IPod (19,20)
    +-----Blackberry (21,22)
    +---Cell_Phones_and_Accessories (24,43)
    +-----Cell_Phones_and_Smartphones (25,36)
    +-------Nokia (26,27)
    +-------Samsung (28,29)
    +-------Apple (30,31)
    +-------HTC (32,33)
    +-------Vyacheslav (34,35)
    +-----Headsets (37,38)
    +-----Batteries (39,40)
    +-----Cables_And_Adapters (41,42)