
It can't revise internally. The last step of generation produces a distribution over the next token, and sometimes the wrong answer gets sampled.

There is no "backspace" token, although it would be cool and fancy if we had that.
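
To make that concrete, here's a minimal sketch of append-only sampling. Everything in it is made up for illustration: `toy_model` and its probabilities are hypothetical stand-ins for a real model, not how any particular LLM is implemented.

```python
import random

def sample_next(probs, rng):
    """Draw one token from a {token: probability} distribution."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def generate(next_token_probs, prompt, max_new_tokens, rng):
    """Append-only decoding: a sampled token is never removed."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(out)
        out.append(sample_next(probs, rng))  # no way to take this back
    return out

def toy_model(context):
    # Hypothetical toy distribution: after "=", the right answer "4"
    # gets 90% of the mass and the wrong answer "5" gets 10%.
    if context[-1] == "=":
        return {"4": 0.9, "5": 0.1}
    return {"<eos>": 1.0}

if __name__ == "__main__":
    rng = random.Random(0)
    prompt = ["2", "+", "2", "="]
    answers = [generate(toy_model, prompt, 1, rng)[-1] for _ in range(1000)]
    print("wrong answers sampled:", answers.count("5"), "out of 1000")
```

Even with most of the probability on the right token, the wrong one still gets drawn some of the time, and once it's in the sequence the only option is to keep generating after it.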

The more interesting question is why it revises its mistakes at all. The answer is that the training data contains examples of fixing your own mistakes, plus some RL to bring out that effect more.



There have been a few attempts at training a backspace token, though.

e.g.:

https://arxiv.org/abs/2502.04404

https://arxiv.org/abs/2306.05426
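
For a rough idea of what decoding with such a token could look like, here's a toy sketch. To be clear, this is not the method from either linked paper: the `<bksp>` token name, the `toy_model` function, and its probabilities are all hypothetical, and it only shows the decoding side, not the training.

```python
import random

BACKSPACE = "<bksp>"  # hypothetical token name, chosen for illustration
EOS = "<eos>"

def decode_with_backspace(next_token_probs, prompt, max_steps, rng):
    """Decode where sampling BACKSPACE deletes the last generated token."""
    out = list(prompt)
    for _ in range(max_steps):
        probs = next_token_probs(out)
        tokens = list(probs)
        tok = rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
        if tok == BACKSPACE:
            if len(out) > len(prompt):  # never delete into the prompt
                out.pop()
        elif tok == EOS:
            break
        else:
            out.append(tok)
    return out

def toy_model(context):
    # Hypothetical toy: the model sometimes answers "5", but has learned
    # to emit BACKSPACE right after seeing its own wrong answer.
    last = context[-1]
    if last == "=":
        return {"4": 0.9, "5": 0.1}
    if last == "5":
        return {BACKSPACE: 1.0}
    return {EOS: 1.0}

if __name__ == "__main__":
    rng = random.Random(0)
    print(decode_with_backspace(toy_model, ["2", "+", "2", "="], 50, rng))
```

The wrong token still gets sampled at the same rate as before; the difference is that the sequence can drop it and try again instead of having to continue after it.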



