
The trouble is that if you use actual randomness you lose repeatability, which is an incredibly useful property of computers. Have fun debugging that!

What you want is low precision with stochastic rounding. Graphcore's IPUs have that and it's a really great feature. It lets you use really low precision number formats but effectively "dithers" the error. Same thing as dithering images or noise shaping audio.
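To make the idea concrete, here's a minimal NumPy sketch of stochastic rounding onto a fixed quantization grid (not Graphcore's actual IPU implementation, and the function names and step size are just illustrative). Each value rounds up with probability equal to its fractional position between grid points, so the rounding error is zero-mean on average rather than biased, which is the same trick as dithering an image or noise shaping audio. Using a seeded pseudo-random generator also keeps runs repeatable, which addresses the debugging concern above.

  import numpy as np

  def stochastic_round(x, step, rng):
      """Round x to the nearest multiple of `step`, up or down at random,
      with P(round up) equal to the fractional distance to the next grid point."""
      scaled = x / step
      floor = np.floor(scaled)
      frac = scaled - floor                       # in [0, 1)
      round_up = rng.random(x.shape) < frac       # Bernoulli(frac)
      return (floor + round_up) * step

  # Seeded generator -> repeatable "randomness".
  rng = np.random.default_rng(seed=0)
  x = np.full(10_000, 0.1)
  # Plain rounding to a step of 0.25 would collapse every value to 0.0;
  # stochastic rounding preserves the mean (~0.1) in expectation.
  print(stochastic_round(x, 0.25, rng).mean())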



Yeah, debugging would be a pain, but in the context of inference/training it's arguably unnecessary. There is some set of ops that genuinely requires high precision: if I L2-normalize a tensor, I really need it to be normalized. But matmul/addition? Maybe there's wiggle room.
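As a quick illustration of that point (a hedged sketch, using NumPy's float16 as a stand-in for a low-precision format): normalizing in half precision leaves the result measurably off unit norm, while float32 stays very close.

  import numpy as np

  rng = np.random.default_rng(seed=0)
  v = rng.standard_normal(4096).astype(np.float32)

  # L2-normalize in float16: the norm of the result drifts noticeably from 1
  # (relative error on the order of float16's ~1e-3 machine epsilon).
  v16 = v.astype(np.float16)
  n16 = v16 / np.linalg.norm(v16)
  print(abs(np.linalg.norm(n16.astype(np.float32)) - 1.0))  # roughly 1e-3

  # Same operation in float32 is orders of magnitude closer to unit norm.
  n32 = v / np.linalg.norm(v)
  print(abs(np.linalg.norm(n32) - 1.0))  # roughly 1e-7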

The big challenge would be whether any gains could compete with Nvidia's economies of scale.



