Non-Relational Data in Ash

Introduction

Ash seems to be great with regular relational data. However, when I tried it out for a hobby project where I had to represent a hierarchy of arbitrary depth, I noticed there isn’t any support for graphs or other non-relational data. I propose creating data layers for some NoSQL DBs and Postgres extensions.

Now, I firmly believe that 90% of modern requirements can be met with a regular RDBMS. It takes a very significant load, and a number of optimizations elsewhere in the stack, before your database becomes a bottleneck. However, there are cases where you genuinely need more than vanilla Postgres, or even a specialized DB. As such, my approach would be to keep as little information outside of standard Postgres as possible, while also limiting the number of extra services my app depends on.

Postgres and friends

One way to enforce this principle is to stick to Postgres extensions and built-in features such as Apache AGE (for graphs), TimescaleDB (for time series) and JSONB (a built-in type, for documents). Since we are only extending the functionality of our primary database, there are two main advantages:

  • No consistency problems/data duplication between DBs (since there’s only one DB)
  • Simplified deployment and reduced footprint (since there’s only one service)

However, this raises an important question: do MySQL and SQLite become second-class citizens compared to Postgres in Ash? It’s possible that MySQL has equivalents for all these extensions, in which case we could support these features on top of both AshPostgres and AshMySQL.

NoSQL DBs

The disadvantage of Postgres extensions is that they may not be as performant as specialized solutions. Additionally, they tend to have a smaller user base. Supporting NoSQL DBs, though, is a difficult task:

  • How do we decide how much data to duplicate between DBs?
  • How do we ensure consistency?
  • When does an implementation become too opinionated and actively hinder users when they step off the beaten path?
  • Are Postgres, MySQL and SQLite even interchangeable (as they are now) when used in conjunction with a NoSQL DB?

I personally think that integrating additional databases is a problem best solved on a case-by-case basis. I don’t think it’s even the right thing for Ash to try to solve for the scale and the unique cases that demand a NoSQL database, since the vast majority of Ash’s users don’t and will not operate at that scale.

Further Steps

The main question to answer is whether we want to support the database extensions, the NoSQL DBs, or both. We can fill in the details after that.

If we were to go with just Postgres, the order I would implement these would be:

  1. jsonb (already supported) → Document DB
  2. timescaledb (since it’s essentially a regular table, but optimized for time) → Time-Series DB
  3. pgvector (since it’s just an extra datatype with a few vector specific extensions to SQL) → Vector DB
  4. age (since graphs are queried with Cypher and not SQL) → Graph DB
  5. postgis (since a lot of specialized GIS expertise is required) → Geospatial DB

I’m happy to help if this is a direction Ash Core wants to go.

If there’s demand, we could support hstore as a Redis replacement, but I’m not sure what the point would be in the context of Elixir (or whether you need it at all when you’ve got jsonb).

Not to take away from your overall point, but can you explain what you mean by this? A relational database (e.g. Postgres) should be more than capable of modeling a tree (“hierarchy”) or graph. Does the problem you ran into have to do with Ash specifically?

I wanted to model the sub-organizations in a parent organization, where each group could be split into smaller groups. In essence, this is a tree with unbounded depth. In retrospect, I could have handled this with CTEs or ltree.
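For anyone landing here with the same problem, here is a minimal sketch of the CTE approach. It is not Ash code: it uses Python's built-in sqlite3 module so it runs anywhere, and the `organizations` table, its columns, and the sample rows are all made up for illustration. The same `WITH RECURSIVE` query shape should carry over to Postgres.

```python
import sqlite3


def build_org_db():
    """Create an in-memory database with a hypothetical adjacency-list table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE organizations (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            parent_id INTEGER REFERENCES organizations(id)
        )
        """
    )
    # Sample data (invented): each row points at its parent, the root has none.
    conn.executemany(
        "INSERT INTO organizations (id, name, parent_id) VALUES (?, ?, ?)",
        [
            (1, "Acme", None),
            (2, "Engineering", 1),
            (3, "Sales", 1),
            (4, "Backend", 2),
            (5, "Databases", 4),
        ],
    )
    return conn


def descendants(conn, root_id):
    """Return (name, depth) for every group below root_id, at any depth."""
    return conn.execute(
        """
        WITH RECURSIVE subtree(id, name, depth) AS (
            -- Anchor row: the group we start from.
            SELECT id, name, 0 FROM organizations WHERE id = ?
            UNION ALL
            -- Recursive step: children of anything already in the subtree.
            SELECT o.id, o.name, s.depth + 1
            FROM organizations AS o
            JOIN subtree AS s ON o.parent_id = s.id
        )
        SELECT name, depth FROM subtree
        WHERE depth > 0
        ORDER BY depth, name
        """,
        (root_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_org_db()
    for name, depth in descendants(conn, 1):
        print("  " * depth + name)
```

The table only stores a parent pointer per row, so the depth of the tree is not limited by the schema; the recursion in the query does the walking.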

I could probably pull off ltree with Ash or Ecto fragments.
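To show the idea behind ltree without needing Postgres, here is a rough sketch of the materialized-path pattern it is built on. This is an emulation, not ltree itself: the `groups` table, the dot-separated `path` strings, and the sample rows are invented, and the prefix comparison stands in for what ltree's `<@` (descendant-of) operator does natively with index support.

```python
import sqlite3


def build_path_db():
    """Create an in-memory table where each row stores its full path from the root."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE groups (path TEXT PRIMARY KEY)")
    # Sample data (invented). Labels are joined with dots, as ltree does.
    conn.executemany(
        "INSERT INTO groups (path) VALUES (?)",
        [
            ("acme",),
            ("acme.engineering",),
            ("acme.engineering.backend",),
            ("acme.engineering_ops",),
            ("acme.sales",),
        ],
    )
    return conn


def subtree_paths(conn, root_path):
    """Return root_path and every path nested under it, in sorted order."""
    rows = conn.execute(
        """
        SELECT path FROM groups
        WHERE path = :root
           -- Compare against root + '.' so that a sibling such as
           -- 'acme.engineering_ops' is not mistaken for a child.
           OR substr(path, 1, length(:root) + 1) = :root || '.'
        ORDER BY path
        """,
        {"root": root_path},
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_path_db()
    print(subtree_paths(conn, "acme.engineering"))
```

The trade-off versus the adjacency-list approach is that subtree lookups need no recursion, but moving a group means rewriting the path of every row beneath it.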

Again, the main point is we could do with some data layers for non-tabular data. This will take time, and is a tricky problem to solve, but I think it has a lot of value.

What about this one? Guide — AshNeo4j v0.2.2

Or this one?

Or this one?

Or this one?

Or the builtin ETS/Mnesia data layers?

We have quite a few data layers for non-relational data.

If you’re referring to relationships not being able to return graph (i.e. nested) structures, I think that’s less of a limitation than you realize. Data layers also have calculations at their disposal, and we could add various things to make that more ergonomic if/when people need or want it.

I was unaware of these since they’re not under the ash-project organization. Maybe you should link to them from the main repo.

We link to the ones we or someone on our team maintains, but not necessarily to community-maintained ones. The best way to explore packages that might fit the bill is with an awesome list like:

or with hex

Hmm… I didn’t think of looking for an awesome list. I just assumed that community packages didn’t exist yet.