DEV Community

Dmitry Daw
Dmitry Daw

Posted on

How to query Ancestry fast

Ancestry is a great library to organize models in a tree structure.

For example, if you have Location model, which may be City, State, and Country, and you want to easily organize and query between all those entities.

By default Ancestry makes requests with LIKE query, e.g.

SELECT locations.* FROM locations WHERE "locations"."ancestry" LIKE '4a5b7a5d-1147-4d09-8a5d-a1a476c5be50/88fd88b7-eecd-4fc0-b18d-e2b6a077ab66/550f9809-e875-4fab-8cac-db8e6d59e0e5/%' OR "locations"."ancestry" = '4a5b7a5d-1147-4d09-8a5d-a1a476c5be50/88fd88b7-eecd-4fc0-b18d-e2b6a077ab66/550f9809-e875-4fab-8cac-db8e6d59e0e5' 
Enter fullscreen mode Exit fullscreen mode

Sometimes you need to write a query to check all descendants, like

SELECT locations.*, count(distinct cities.id) FROM locations LEFT JOIN locations as cities ON cities.ancestry LIKE '%' || locations.id || '%' WHERE cities.location_type = 'CITY' GROUP BY locations.id 
Enter fullscreen mode Exit fullscreen mode

(in this example locations.id is uuid, so we can use simplified LIKE query with '%' || locations.id || '%'. With int id it requires a different request)

But this request is running too long.
PostgreSQL does have indexes that may speed up LIKE queries, for example GIN and GiST.

But before using index, we can change that query to this

SELECT locations.*, count(distinct cities.id) FROM locations LEFT JOIN locations as cities ON locations.id = ANY((string_to_array(cities.ancestry, '/')::uuid[])) WHERE cities.location_type = 'CITY' GROUP BY locations.id 
Enter fullscreen mode Exit fullscreen mode

And it runs MUCH faster. On our data, we get an improvement from 5 seconds to 6 ms(!).

And it's possible to use it with int ids, just change uuid[] to int[].

After we can also add GIN index to it, if we want to.

add_index :locations, "(string_to_array(ancestry, '/')::uuid[])", using: :gin 
Enter fullscreen mode Exit fullscreen mode

But it already runs much faster.

Another way to speed up Ancestry queries is to use ltree index, but it requires a diffent syntax.

Sources:

string_to_array: https://github.com/stefankroes/ancestry/issues/466
GIN and GiST index documentation: https://www.postgresql.org/docs/9.1/textsearch-indexes.html

ltree: https://github.com/stefankroes/ancestry/issues/102
ltree index documentation: https://www.postgresql.org/docs/9.1/ltree.html

Top comments (0)