
The trouble is that if you use actual randomness you lose repeatability, which is an incredibly useful property of computers. Have fun debugging that!

What you want is low precision with stochastic rounding. Graphcore's IPUs have that and it's a really great feature. It lets you use really low precision number formats but effectively "dithers" the error. Same thing as dithering images or noise shaping audio.
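To make the idea concrete, here's a minimal NumPy sketch of stochastic rounding onto a fixed quantization grid (not Graphcore's actual IPU implementation, and the function names and step size are just illustrative). Each value rounds up with probability equal to its fractional position between grid points, so the rounding error is zero-mean on average rather than biased, which is the same trick as dithering an image or noise shaping audio. Using a seeded pseudo-random generator also keeps runs repeatable, which addresses the debugging concern above.

  import numpy as np

  def stochastic_round(x, step, rng):
      """Round x to the nearest multiple of `step`, up or down at random,
      with P(round up) equal to the fractional distance to the next grid point."""
      scaled = x / step
      floor = np.floor(scaled)
      frac = scaled - floor                       # in [0, 1)
      round_up = rng.random(x.shape) < frac       # Bernoulli(frac)
      return (floor + round_up) * step

  # Seeded generator -> repeatable "randomness".
  rng = np.random.default_rng(seed=0)
  x = np.full(10_000, 0.1)
  # Plain rounding to a step of 0.25 would collapse every value to 0.0;
  # stochastic rounding preserves the mean (~0.1) in expectation.
  print(stochastic_round(x, 0.25, rng).mean())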



Yeah, debugging would be a pain, but in the context of inference/training it's arguably unnecessary. There is some set of ops that genuinely requires high precision: if I L2-normalize a tensor, I really need it to be normalized. But matmul/addition? Maybe there's wiggle room.
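As a quick illustration of that point (a hedged sketch, using NumPy's float16 as a stand-in for a low-precision format): normalizing in half precision leaves the result measurably off unit norm, while float32 stays very close.

  import numpy as np

  rng = np.random.default_rng(seed=0)
  v = rng.standard_normal(4096).astype(np.float32)

  # L2-normalize in float16: the norm of the result drifts noticeably from 1
  # (relative error on the order of float16's ~1e-3 machine epsilon).
  v16 = v.astype(np.float16)
  n16 = v16 / np.linalg.norm(v16)
  print(abs(np.linalg.norm(n16.astype(np.float32)) - 1.0))  # roughly 1e-3

  # Same operation in float32 is orders of magnitude closer to unit norm.
  n32 = v / np.linalg.norm(v)
  print(abs(np.linalg.norm(n32) - 1.0))  # roughly 1e-7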

The big challenge would be whether any gains could compete with Nvidia's economies of scale.



