> A 50 million parameter GPT trained on 5 million games of chess learns to play at ~1300 Elo in one day on 4 RTX 3090 GPUs.
And from the paper: https://arxiv.org/abs/2403.15498
> The 25M parameter model took 72 hours to train on one RTX 3090 GPU. The 50M parameter model took 38 hours to train on four RTX 3090 GPUs.
definitely inspiring :)