
You are wrong: there could be a circuit to count letters, because the model can easily normalize tokens internally; we know it can transform text to base64 just fine. So there is no reason a letter-counting circuit couldn't exist.

The training is just too dumb to create such a circuit, even with all that massive data input, but it's super easy for a human to construct such a neural net by hand from those input tokens (see the sketch below). It's just the kind of problem that transformer training is exceedingly bad at picking up, so models never learn it well, even though the computation itself is very simple for them to perform.
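
To make the "easy for a human to build by hand" claim concrete, here is a minimal sketch, not any real model: the toy tokenizer pieces and dimensions are assumptions, but it shows that counting a letter across tokens only needs a per-token lookup plus a uniform attention-style sum.

```python
# Hypothetical hand-built "circuit" that counts the letter 'r' in a tokenized word.
# Each token's 1-d embedding is its own case-normalized 'r' count; a uniform
# attention pattern then pools the counts across positions.
import numpy as np

vocab = ["straw", "berry", "Straw", "##berry"]               # toy tokenizer pieces (assumed)
embed = np.array([[t.lstrip("#").lower().count("r")] for t in vocab], dtype=float)

def count_r(token_ids):
    x = embed[token_ids]                                      # (seq_len, 1) per-token counts
    n = len(token_ids)
    attn = np.full((n, n), 1.0 / n)                           # uniform attention over all positions
    pooled = attn @ x                                         # mean of the per-token counts
    return int(round(pooled[0, 0] * n))                       # rescale mean back to a sum

print(count_r([0, 1]))  # tokens "straw" + "berry" -> 3
```

The point is only that the arithmetic is trivial once the per-token letter counts are available; whether gradient descent ever finds weights like these is a separate question.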



Transformers have a computation budget tied to the size of the context, so a model can get better at math the longer the conversation is.
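
A rough back-of-the-envelope sketch of that scaling, with assumed toy dimensions rather than any specific model: self-attention alone does on the order of n^2 * d work per layer, so a longer context means more total computation is spent per generated token.

```python
# Very rough attention FLOP estimate (assumed toy dimensions, ignores MLP layers):
# QK^T and attn @ V each cost about n^2 * d multiply-adds per layer.
def attention_flops(n_ctx, d_model=4096, n_layers=32):
    return n_layers * 2 * n_ctx * n_ctx * d_model

for n in (128, 1024, 8192):
    print(f"context {n:5d} tokens -> ~{attention_flops(n):.2e} FLOPs per forward pass")
```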



