Hacker News

Ah, yes, I'd forgotten about Gato. Thank you for reminding me. There's so much research activity that the Gato paper feels as if it was published eons ago. There's only so much I can retain in my puny little human mind at once!

In any case, I'm not sure Gato qualifies as a "large" model at 1.2B parameters -- that's just below the threshold at which models start exhibiting emergent behaviors. Maybe we'll see a new Gato with tens or hundreds of billions of parameters operating in the physical world?



Yes. Gato was a good proof-of-concept that the Decision Transformer approach of 'just model literally everything as a sequence' scales well, doesn't exhibit some sort of catastrophic interference, and can successfully imitation-learn from all the expert datasets, with a bit of transfer. But they need to push it at least another OOM or two to show major transfer and some emergence, and ideally to do both from-scratch learning and additional learning on many new tasks. We continue to wait. :(
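To make the 'everything as a sequence' idea concrete, here's a minimal sketch of how heterogeneous data can be serialized into one flat token stream for a single autoregressive model. This is illustrative only: the vocabulary size, bin count, and function names are my assumptions, and I use plain uniform binning where Gato actually uses a mu-law-style encoding for continuous values.

```python
import numpy as np

VOCAB_TEXT = 32000  # assumed text vocabulary size (not Gato's actual value)
N_BINS = 1024       # assumed number of bins for continuous values

def tokenize_continuous(x, low=-1.0, high=1.0, offset=VOCAB_TEXT):
    """Uniformly bin continuous values into discrete tokens, placed in an
    ID range after the text vocabulary so modalities share one vocab.
    (Simplification: Gato uses mu-law companding before binning.)"""
    x = np.clip(np.asarray(x, dtype=float), low, high)
    bins = ((x - low) / (high - low) * (N_BINS - 1)).astype(int)
    return (bins + offset).tolist()

def build_sequence(text_tokens, obs_vector, action_vector):
    """Interleave modalities into one flat sequence:
    text tokens, then observation tokens, then action tokens."""
    return (list(text_tokens)
            + tokenize_continuous(obs_vector)
            + tokenize_continuous(action_vector))

# Hypothetical episode fragment: 3 text tokens, a 2-D observation, a 2-D action.
seq = build_sequence([17, 923, 4], [0.0, 0.5], [-1.0, 1.0])
print(seq)  # one flat token list a transformer can model autoregressively
```

The point is that once everything lives in one token space, the same next-token objective covers Atari frames, robot proprioception, and dialogue alike; scaling questions then reduce to whether transfer emerges across those sub-streams.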

I hope it didn't all get rolled up into Gemini and become a state secret they'll never publish on again, or lost in the shuffle in the chaos of the DeepMind/Brain merger/liquidation.


> ...lost in the shuffle in the chaos of the DeepMind/Brain merger/liquidation

That's the most likely explanation, in my view.



