Fair enough, but ImageNet is sort of a nightmare right now. I get that it's a crowdfunded and crowdsourced effort, but hopefully at some point some brave soul(s) will step up and archive the data as-is in a properly reproducible way. :D
Isn't this a perfect use case for torrents?
It might be too expensive for a single nonprofit to host, but collectively there might be a few dozen people willing to host it.
Not sure about the legality either. Does LAION check all the licenses, or do they skirt that by distributing only URLs?