> A 50 million parameter GPT trained on 5 million games of chess learns to play at ~1300 Elo in one day on 4 RTX 3090 GPUs.
And from the paper: https://arxiv.org/abs/2403.15498
> The 25M parameter model took 72 hours to train on one RTX 3090 GPU. The 50M parameter model took 38 hours to train on four RTX 3090 GPUs.
definitely inspiring :)