Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

"Where possible avoid simply using id as the primary identifier for the table."

Has anyone had trouble by using surrogate primary keys? I've found the opposite of what the author said could be more true: composite keys should be avoided instead.



I think the author was talking about (not) using `id` as the column name, e.g. `user.user_id` instead of `user.id`


ON second reading I think you are correct. I agree on this point then, as it helps when joining through a denormalized table that references multiple PK ids, ex: I think: user_id = role_id is more readable than the likely alternative: u.id = r.id


it also lets you use the join foo using(foo_id) syntax, which I quite like


Generated unique tokens are better than auto increment IDs. They can be created in a distributed multi-master setup, and they don't expose how many of a given record type you have to 3rd parties when exposing primary keys or feeds via APIs.


But also because they won't join incorrectly. That was the biggest win in explaining it to other devs.


I've had lots of problems with surrogate primary keys where people don't put enough constraints on the natural key and then duplicates (sometimes slightly different) get in.

Unless you're using innodb, primary key is not special, it's just another index. If you decide to use surrogate keys you also need to enforce the natural key too (if possible).

There are also some nice performance advantages you can get from natural keys with covering composite indices in the case where you only want the key. But in general a table can have multiple keys in all but the highest levels of normalisation.


That's a good point about manually defining constraints on a table. A composite key forces a developer to think about uniqueness whereas the surrogate key makes that thought process optional.


I disagree with this style point, and I think the author implicitly acknowledges that `id` should be the standard identifier, because further down he points out that the `_id` suffix should be used for columns that make reference to that identifier.


No he doesn't. In an example he actually uses the following:

ON s2.mentor_id = s1.staff_num

staff_num is the primary key

To me this isan anti pattern because: - a primary should (can) never change; - composite primary keys are only useful (save space) in very large datasets

It also conflicts with his pattern: uniform suffix _id


I am working with a 20 table db (not including lookups and session management tables) based on surrogate keys and it sucks ass for discoverability. It also subverts data checking based on SQL REFERENCES. I think natural / composite keys should used to design the db and establish relationships, surrogate keys added only if necessary, and then auto-generated, just like one waits to denormalize until there is a real need.

I suspect the love of surrogate keys is just more MySQL bullshit pretending that classic SQL design is wrong merely because it is not implemented in that particular broken database.


I'm not using mysql. Would you mind expanding what you mean by the struggles you've experienced during discoverability?


I disagree with this too, using table.id is easier to type and consistent when creating joins. It also encourages (forces) to reference the fields with tables prefix.


One advantage of using user_id on both the pk of users as well as the fk side is you can do:

 INNER JOIN comments USING (user_id) 
instead of:

 INNER JOIN comments ON comments.user_id = user.user_id


Which is fine if you are aware of the differences between the two. Since they act differently.


I don't think so, what differences are there? Could this be a database specific thing? In PG it seems equivalent.


In MySQL you can end up with different results.


Maybe what author means is that from the db design standpoint, composite keys are used much less often than they could be. Very often unique table row can be defined by some other column or a pair, on which you will probably put an index anyway.

But that's just playing devil's advocate. In general I think ids are fine because ORMs play nice with them and FKs are simple.


Basically this is the answer - besides that bit about acquiescing to the demands of an ORM.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact