
A character, both input and output.


Not exactly: GPT-3 uses a variant of BPE [1], so one token can correspond to a single character, an entire word or more, or anything in between. The paper [2] says a token corresponds to 0.7 words on average.

[1] https://en.wikipedia.org/wiki/Byte_pair_encoding

[2] https://arxiv.org/abs/2005.14165, page 24
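To see how BPE ends up with tokens ranging from one character to a whole word, here's a toy sketch of its merge step in plain Python (not GPT-3's actual tokenizer, vocabulary, or byte-level handling — just the core idea of repeatedly merging the most frequent adjacent pair):

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge(tokens, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from individual characters; each merge grows multi-character tokens,
# so frequent substrings like "low" become single tokens while rarer
# suffixes stay split into smaller pieces.
tokens = list("low low lower lowest")
for _ in range(4):
    tokens = merge(tokens, most_frequent_pair(tokens))
print(tokens)
```

In a real tokenizer the merge rules are learned once from a large corpus and then applied to new text, which is why common words become one token while rare words split into several.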



