Posted on May 29 • Edited on May 30

$graphLookup (Connect By / Recursive Query)

In this series, I present various access patterns for a specific document model. These patterns are supported by a limited set of secondary indexes designed to make queries efficient, without modifying the document schema.

This article explores recursive searches through graph-like relationships between documents (imported in the forst post of this series), with each video in this collection showcasing related content with an array of related videos:

[ { _id: '---U8lzusKE', category: 'Entertainment', relatedVideos: [ 'x9LRHlMdZmA', '5P5nxdJAFdE', 'jdg8Sp1HUKM', 'xdxVBiJe8Co', 'qLSA0gQ9z28', 'WHZPEkZCqwA', 'y3VMhFCLxRc', 'hHjGtBnSv50', '_vx1OVLX5Rc', 'V4LnorVVxfw', 'l56K8eAtCig', 'dHpCoFyMCHU', 'XO5BYR39te8', 'yWy0cuxNWDw', '4SiXdhL7wxU', '5EaZTxQeQMQ', 'mOvmBNLQIi4', 'fa2CvFa2xY8', 'CpbYBZKdi3s', 'lBxzoqTSILc', 'RBumgq5yVrA', 'EoN8RKubbO0', 'zIHQPgz_Iwg', '7PCkvCPvDXk', 't1NVJlm5THo' ], ...

With this structure, I can easily navigate from one video to its related ones, and from there to further related content, effectively building a graph of interconnected videos. There’s no need for an additional index since each video references the "_id" of its related videos, which is always indexed.

Access Patterns: forward traversal of related documents

The following query identifies a video and explores down to three levels of related videos, constructing a graph of connections based on the associated video array. It filters these connections by daily views and restructures the output for improved readability:

db.youstats.aggregate([ { $match: { _id: 'YoB8t0B4jx4' } }, { $graphLookup: { from: "youstats", startWith: "$relatedVideos", connectFromField: "relatedVideos", connectToField: "_id", as: "allRelatedVideos", maxDepth: 3, restrictSearchWithMatch: { "views.daily.data": { $gt: 1e6 } }, depthField: "level" } }, { $project: { _id: 1, title: 1, author: 1, allRelatedVideos: { $map: { input: "$allRelatedVideos", as: "video", in: { number: { $add: [ { $indexOfArray: ["$allRelatedVideos", "$$video"] }, 1 ] }, _id: "$$video._id", title: "$$video.title", author: "$$video.author", level: "$$video.level" } } } } } ])

The execution plan shows the IXSCAN only during the $match stage, but the subsequent iterations utilize the same method.

 stage: 'EXPRESS_IXSCAN', keyPattern: '{ _id: 1 }',

With $graphLookup, you need to have an index on connectToField.

Access Patterns: backward traversal of related documents

To navigate the graph in the opposite direction and find the parent with the _id in its related videos array, an index on that field is essential for quick access. In MongoDB, indexes are created similarly for both scalar fields and arrays:

db.youstats.createIndex({ relatedVideos: 1, _id: 1 });

The following query, where the connectToField is the related videos array, is fast:

db.youstats.aggregate([ { $match: { _id: 'x9LRHlMdZmA' } }, { $graphLookup: { from: "youstats", startWith: "$_id", connectFromField: "_id", connectToField: "relatedVideos", as: "parentVideos", maxDepth: 3, depthField: "level", restrictSearchWithMatch: { "views.daily.data": { $gt: 1e6 } } } } ]);

Using $graphLookup in an aggregation pipeline effectively retrieves a limited number of documents, as long as the work area remains within the 100MB memory limit and results do not exceed the BSON limit of 16MB. For utilizing MongoDB as a document database, consider PuppyGraph (Querying MongoDB Atlas Data as a Graph). The same indexes allow fast recursive search and can be on a scalar identifier or an array or of child, depending if you implemented the one-to-many in the one-side or many-side.

DEV Community

$graphLookup (Connect By / Recursive Query)

Access Patterns: forward traversal of related documents

Access Patterns: backward traversal of related documents

Top comments (0)