Closure_tree lets your ActiveRecord models act as nodes in a tree data structure
Common applications include modeling hierarchical data, like tags, page graphs in CMSes, and tracking user referrals.
inetdavid/closure_tree
Substantially more efficient than ancestry and acts_as_tree, and even more awesome than awesome_nested_set, closure_tree has some great features:
- Best-in-class select performance:
- Fetch your whole ancestor lineage in 1 SELECT.
- Grab all your descendants in 1 SELECT.
- Get all your siblings in 1 SELECT.
- Fetch all descendants as a nested hash in 1 SELECT.
- Find a node by ancestry path in 1 SELECT.
- Best-in-class mutation performance:
- 2 SQL INSERTs on node creation
- 3 SQL INSERT/UPDATEs on node reparenting
- Support for Rails 3.0, 3.1, 3.2, 4.0, and 4.1
- Support for reparenting children (and all their descendants)
- Support for concurrency (using with_advisory_lock)
- Support for polymorphism STI within the hierarchy
find_or_create_by_pathfor building out hierarchies quickly and conveniently- Support for deterministic ordering of children
- Support for preordered traversal of descendants
- Support for rendering trees in DOT format, using Graphviz
- Excellent test coverage in a variety of environments
See Bill Karwin's excellent Models for hierarchical data presentation for a description of different tree storage algorithms.
- Installation
- Usage
- Accessing Data
- Polymorphic hierarchies with STI
- Deterministic ordering
- Concurrency
- FAQ
- Testing
- Change log
Note that closure_tree only supports Rails 3.0 and later, and has test coverage for MySQL, PostgreSQL, and SQLite.
-
Add this to your Gemfile:
gem 'closure_tree' -
Run
bundle install -
Add
acts_as_treeto your hierarchical model(s). Make sure you addacts_as_tree*after anyattr_accessibleandself.table_name =lines in your model. Please review the available options you can provide. -
Add a migration to add a
parent_idcolumn to the model you want to act_as_tree. You may want to also add a column for deterministic ordering of children, but that's optional.class AddParentIdToTag < ActiveRecord::Migration def change add_column :tag, :parent_id, :integer end end
Note that if the column is null, the tag will be considered a root node.
-
Add a database migration to store the hierarchy for your model. By default the table name will be the model's table name, followed by "_hierarchies". Note that by calling
acts_as_tree, a "virtual model" (in this case,TagHierarchy) will be added automatically, so you don't need to create it.class CreateTagHierarchies < ActiveRecord::Migration def change create_table :tag_hierarchies, :id => false do |t| t.integer :ancestor_id, :null => false # ID of the parent/grandparent/great-grandparent/... tag t.integer :descendant_id, :null => false # ID of the target tag t.integer :generations, :null => false # Number of generations between the ancestor and the descendant. Parent/child = 1, for example. end # For "all progeny of…" and leaf selects: add_index :tag_hierarchies, [:ancestor_id, :descendant_id, :generations], :unique => true, :name => "tag_anc_desc_udx" # For "all ancestors of…" selects, add_index :tag_hierarchies, [:descendant_id], :name => "tag_desc_idx" end end
-
Run
rake db:migrate -
If you're migrating from another system where your model already has a
parent_idcolumn, runTag.rebuild!and yourtag_hierarchiestable will be truncated and rebuilt.If you're starting from scratch you don't need to call
rebuild!.
Create a root node:
grandparent = Tag.create(:name => 'Grandparent')Child nodes are created by appending to the children collection:
parent = grandparent.children.create(:name => 'Parent')Or by appending to the children collection:
child2 = Tag.new(:name => 'Second Child') parent.children << child2Or by calling the "add_child" method:
child3 = Tag.new(:name => 'Third Child') parent.add_child child3Then:
grandparent.self_and_descendants.collect(&:name) => ["Grandparent", "Parent", "First Child", "Second Child", "Third Child"] child1.ancestry_path => ["Grandparent", "Parent", "First Child"]We can do all the node creation and add_child calls with one method call:
child = Tag.find_or_create_by_path(["grandparent", "parent", "child"])You can find as well as find_or_create by "ancestry paths". Ancestry paths may be built using any column in your model. The default column is name, which can be changed with the :name_column option provided to acts_as_tree.
Note that any other AR fields can be set with the second, optional attributes argument, and as of version 4.2.0, these attributes are added to the where clause as selection criteria.
child = Tag.find_or_create_by_path(%w{home chuck Photos"}, {:tag_type => "File"})This will pass the attribute hash of {:name => "home", :tag_type => "File"} to Tag.find_or_create_by_name if the root directory doesn't exist (and {:name => "chuck", :tag_type => "File"} if the second-level tag doesn't exist, and so on).
Nodes can be moved around to other parents, and closure_tree moves the node's descendancy to the new parent for you:
d = Tag.find_or_create_by_path %w(a b c d) h = Tag.find_or_create_by_path %w(e f g h) e = h.root d.add_child(e) # "d.children << e" would work too, of course h.ancestry_path => ["a", "b", "c", "d", "e", "f", "g", "h"]hash_tree provides a method for rendering a subtree as an ordered nested hash:
b = Tag.find_or_create_by_path %w(a b) a = b.parent b2 = Tag.find_or_create_by_path %w(a b2) d1 = b.find_or_create_by_path %w(c1 d1) c1 = d1.parent d2 = b.find_or_create_by_path %w(c2 d2) c2 = d2.parent Tag.hash_tree => {a => {b => {c1 => {d1 => {}}, c2 => {d2 => {}}}, b2 => {}}} Tag.hash_tree(:limit_depth => 2) => {a => {b => {}, b2 => {}}} b.hash_tree => {b => {c1 => {d1 => {}}, c2 => {d2 => {}}}} b.hash_tree(:limit_depth => 2) => {b => {c1 => {}, c2 => {}}}If your tree is large (or might become so), use :limit_depth.
Without this option, hash_tree will load the entire contents of that table into RAM. Your server may not be happy trying to do this.
to_dot_digraph is suitable for passing into Graphviz.
For example, for the above tree, write out the DOT file with ruby:
File.open("example.dot", "w") { |f| f.write(Tag.root.to_dot_digraph) }Then, in a shell, dot -Tpng example.dot > example.png, which produces:
If you want to customize the label value, override the #to_digraph_label instance method in your model.
Just for kicks, this is the test tree I used for proving that preordered tree traversal was correct:
When you include acts_as_tree in your model, you can provide a hash to override the following defaults:
:parent_column_nameto override the column name of the parent foreign key in the model's table. This defaults to "parent_id".:hierarchy_table_nameto override the hierarchy class name. This defaults to the singular name of the model + "Hierarchy", likeTagHierarchy.:hierarchy_table_nameto override the hierarchy table name. This defaults to the singular name of the model + "_hierarchies", liketag_hierarchies.:dependentdetermines what happens when a node is destroyed. Defaults tonullify.:nullifywill simply set the parent column to null. Each child node will be considered a "root" node. This is the default.:delete_allwill delete all descendant nodes (which circumvents the destroy hooks):destroywill destroy all descendant nodes (which runs the destroy hooks on each child node)
:name_columnused by #find_or_create_by_path, #find_by_path, andancestry_pathinstance methods. This is primarily useful if the model only has one required field (like a "tag").:orderused to set up deterministic ordering
Tag.rootreturns an arbitrary root nodeTag.rootsreturns all root nodesTag.leavesreturns all leaf nodesTag.hash_treereturns an ordered, nested hash that can be depth-limited.Tag.find_by_path(path, attributes)returns the node whose name path ispath. See (#find_or_create_by_path).Tag.find_or_create_by_path(path, attributes)returns the node whose name path ispath, and will create the node if it doesn't exist already.See (#find_or_create_by_path).Tag.find_all_by_generation(generation_level)returns the descendant nodes who aregeneration_levelaway from a root.Tag.find_all_by_generation(0)is equivalent toTag.roots.Tag.with_ancestor(ancestors)scopes to all descendants whose ancestor is in the given list.
tag.rootreturns the root for this nodetag.root?returns true if this is a root nodetag.child?returns true if this is a child node. It has a parent.tag.leaf?returns true if this is a leaf node. It has no children.tag.leavesis scoped to all leaf nodes in self_and_descendants.tag.depthreturns the depth, or "generation", for this node in the tree. A root node will have a value of 0.tag.parentreturns the node's immediate parent. Root nodes will return nil.tag.childrenis ahas_manyof immediate children (just those nodes whose parent is the current node).tag.child_idsis an array of the IDs of the children.tag.ancestorsis a ordered scope of [ parent, grandparent, great grandparent, … ]. Note that the size of this array will always equaltag.depth.tag.ancestor_idsis an array of the IDs of the ancestors.tag.self_and_ancestorsreturns a scope containing self, parent, grandparent, great grandparent, etc.tag.siblingsreturns a scope containing all nodes with the same parent astag, excluding self.tag.sibling_idsreturns an array of the IDs of the siblings.tag.self_and_siblingsreturns a scope containing all nodes with the same parent astag, including self.tag.descendantsreturns a scope of all children, childrens' children, etc., excluding self ordered by depth.tag.descendant_idsreturns an array of the IDs of the descendants.tag.self_and_descendantsreturns a scope of all children, childrens' children, etc., including self, ordered by depth.tag.hash_treereturns an ordered, nested hash that can be depth-limited.tag.find_by_path(path)returns the node whose name path fromtagispath. See (#find_or_create_by_path).tag.find_or_create_by_path(path)returns the node whose name path fromtagispath, and will create the node if it doesn't exist already.See (#find_or_create_by_path).tag.find_all_by_generation(generation_level)returns the descendant nodes who aregeneration_levelaway fromtag.tag.find_all_by_generation(0).to_a==[tag]tag.find_all_by_generation(1)==tag.childrentag.find_all_by_generation(2)will return the tag's grandchildren, and so on.
tag.destroywill destroy a node and do something to its children, which is determined by the:dependentoption passed toacts_as_tree.
Polymorphic models using single table inheritance (STI) are supported:
- Create a db migration that adds a String
typecolumn to your model - Subclass the model class. You only need to add
acts_as_treeto your base class:
class Tag < ActiveRecord::Base acts_as_tree end class WhenTag < Tag ; end class WhereTag < Tag ; end class WhatTag < Tag ; endPlease note that Rails (<= 3.2) doesn't handle polymorphic associations correctly if you use the :type attribute, so this doesn't work:
# BAD: ActiveRecord ignores the :type attribute: root.children.create(:name => "child", :type => "WhenTag")Instead, use either .add_child or children <<:
# GOOD! a = Tag.create!(:name => "a") b = WhenTag.new(:name => "b") a.children << b c = WhatTag.new(:name => "c") b.add_child(c)See issue 43 for more information.
By default, children will be ordered by your database engine, which may not be what you want.
If you want to order children alphabetically, and your model has a name column, you'd do this:
class Tag < ActiveRecord::Base acts_as_tree :order => 'name' endIf you want a specific order, add a new integer column to your model in a migration:
t.integer :sort_orderand in your model:
class OrderedTag < ActiveRecord::Base acts_as_tree :order => 'sort_order' endWhen you enable order, you'll also have the following new methods injected into your model:
tag.siblings_beforeis a scope containing all nodes with the same parent astag, whose sort order column is less thanself. These will be ordered properly, so thelastelement in scope will be the sibling immediately beforeselftag.siblings_afteris a scope containing all nodes with the same parent astag, whose sort order column is more thanself. These will be ordered properly, so thefirstelement in scope will be the sibling immediately "after"self
If your order column is an integer attribute, you'll also have these:
-
The class method
#roots_and_descendants_preordered, which returns all nodes in your tree, pre-ordered. -
node1.self_and_descendants_preorderedwhich will return descendants, pre-ordered. -
node1.prepend_sibling(node2)which will- set
node2to the same parent asnode1, - set
node2's order column to 1 less thannode1's value, and - decrement the order_column of all children of node1's parents whose order_column is <>>= node2's new value by 1.
- set
-
node1.append_sibling(node2)which will- set
node2to the same parent asnode1, - set
node2's order column to 1 more thannode1's value, and - increment the order_column of all children of node1's parents whose order_column is >= node2's new value by 1.
- set
root = OrderedTag.create(:name => "root") a = OrderedTag.create(:name => "a", :parent => root) b = OrderedTag.create(:name => "b") c = OrderedTag.create(:name => "c") # We have to call 'root.reload.children' because root won't be in sync with the database otherwise: a.append_sibling(b) root.reload.children.collect(&:name) => ["a", "b"] a.prepend_sibling(b) root.reload.children.collect(&:name) => ["b", "a"] a.append_sibling(c) root.reload.children.collect(&:name) => ["b", "a", "c"] b.append_sibling(c) root.reload.children.collect(&:name) => ["b", "c", "a"]Several methods, especially #rebuild and #find_or_create_by_path, cannot run concurrently correctly. #find_or_create_by_path, for example, may create duplicate nodes.
Database row-level locks work correctly with PostgreSQL, but MySQL's row-level locking is broken, and erroneously reports deadlocks where there are none. To work around this, and have a consistent implementation for both MySQL and PostgreSQL, with_advisory_lock is used automatically to ensure correctness.
If you are already managing concurrency elsewhere in your application, and want to disable the use of with_advisory_lock, pass :with_advisory_lock => false in the options hash:
class Tag acts_as_tree :with_advisory_lock => false endNote that you will eventually have data corruption if you disable advisory locks, write to your database with multiple threads, and don't provide an alternative mutex.
No. This gem's API is based on the assumption that each node has either 0 or 1 parent.
The underlying closure tree structure will support multiple parents, but there would be many breaking-API changes to support it. I'm open to suggestions and pull requests.
Test fixtures aren't going to be running your after_save hooks after inserting all your fixture data, so you need to call .rebuild! before your test runs. There's an example in the spec tag_spec.rb:
describe "Tag with fixtures" do fixtures :tags before :each do Tag.rebuild! # <- required if you use fixtures endHowever, if you're just starting with Rails, may I humbly suggest you adopt a factory library, rather than using fixtures? Lots of people have written about this already.
This is expected if you aren't using MySQL or Postgresql for your tests.
SQLite doesn't have advisory locks, so we resort to file locking, which will only work if the FLOCK_DIR is set consistently for all ruby processes.
In your spec_helper.rb or minitest_helper.rb, add a before and after block:
before do ENV['FLOCK_DIR'] = Dir.mktmpdir end after do FileUtils.remove_entry_secure ENV['FLOCK_DIR'] endClosure tree is tested under every valid combination of
- Ruby 1.9.3 and Ruby 2.0.0
- The latest Rails 3.0, 3.1, 3.2, and 4.0 branches, and
- MySQL and PostgreSQL. SQLite works in a single-threaded environment.
Assuming you're using rbenv, you can use tests.sh to run the test matrix locally.
Parallelism is not tested with Rails 3.0.x nor 3.1.x due to this known issue.
See https://github.com/mceachen/closure_tree/blob/master/CHANGELOG.md
- https://github.com/collectiveidea/awesome_nested_set
- https://github.com/patshaughnessy/class_factory
- JetBrains, which provides an open-source license to RubyMine for the development of this project.







