
I think that is partly why LLMs are bad at math and often fail at counting subsequences. Play with the tokenizer and you'll see that long numbers are split into groups of 2 or 3 digits.

https://huggingface.co/spaces/Xenova/the-tokenizer-playgroun...
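
To make the splitting concrete, here is a toy sketch. It assumes a fixed left-to-right grouping of three digits, which is only a rough stand-in: real tokenizers use learned vocabularies and their splits differ from model to model. `chunk_digits` is a made-up helper for illustration, not part of any tokenizer library.

```python
def chunk_digits(number: str, size: int = 3) -> list[str]:
    """Split a digit string left to right into groups of at most `size` digits.

    Hypothetical stand-in for how a tokenizer might break up a long number;
    actual token boundaries depend on the model's vocabulary.
    """
    return [number[i:i + size] for i in range(0, len(number), size)]


# Adding 1 to 999999 changes every chunk and adds a new one, so the model
# has to reason about carries across token boundaries rather than digits.
for n in ["999999", "1000000", "1234567"]:
    print(n, "->", chunk_digits(n))
```

Note that the groups are aligned from the left, so they do not line up with place value (thousands, millions), which is plausibly part of why digit-level tasks are awkward on top of tokens like these.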


