
More or less. Because of the above, the tokeniser gave the string its own token, but the string did not appear in the training data, so it basically had no meaning for the LLM. (I think there are some theories that the parts of the network associated with such tokens may have been repurposed for something else, which would explain why the presence of the token in the input messed the models up so much.)
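
To make the mismatch concrete, here is a minimal sketch of the idea, not how any real model or tokeniser is audited. The vocabulary, the token ids, and the helper name `find_untrained_tokens` are all made up for illustration; the point is only that a token can exist in the vocabulary while never occurring in the stream the model was trained on.

```python
from collections import Counter


def find_untrained_tokens(vocab, training_token_ids):
    """Return vocabulary entries whose ids never occur in the training stream."""
    counts = Counter(training_token_ids)
    return [tok for tok_id, tok in enumerate(vocab) if counts[tok_id] == 0]


# Hypothetical toy vocabulary: the last entry got its own token because it was
# frequent in the text used to build the tokeniser.
vocab = ["the", "cat", "sat", "rare_username"]

# Hypothetical training stream of token ids; id 3 never appears, so whatever
# the model associates with it is never updated by training on this data.
training_token_ids = [0, 1, 2, 0, 1, 0, 2]

print(find_untrained_tokens(vocab, training_token_ids))  # ['rare_username']
```

In a real setup the tokeniser and the model are usually built from different corpora, which is how a gap like this can open up.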


gpt-oss has similar bad tokens.

https://fi-le.net/oss/



