Ah, so this is why I suddenly got a bunch of email.
Hey all, site owner here. Thanks for the visits and all the fun stories! I really miss this era of computing. Feel free to let me know if you have something that should be added to the site.
Just FYI, towel.blinkenlights.nl:23 still works for me, though I think that may be an IPv6 version; there's a note about IPv6 at the start that I was too slow to read. Maybe it should be re-listed? :)
Writing code that runs downhole or otherwise connects back to the real world would be fun. Maybe I should pick up some firmware skills. Good luck with your hiring!
For Vespa, there's a managed version hosted by the Vespa company in their cloud environment, and the open source version is easy to run locally or in any environment of your choosing. It takes some attention to detail, but it's quite flexible. I have a long-running single-node instance on an Intel NUC, but I've also run more complex cluster configurations across different cloud environments.
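To give a sense of how little the single-node case needs: once an instance is up, querying it is just HTTP against the search endpoint. Here's a rough Python sketch, assuming the default query port 8080 and a made-up schema called "doc" (adjust for your own deployment):

    import requests

    # Query a local single-node Vespa instance over its HTTP search API.
    # Port 8080 is Vespa's default query port; the "doc" schema is made up here.
    resp = requests.get(
        "http://localhost:8080/search/",
        params={
            "yql": "select * from doc where userQuery()",
            "query": "telnet bbs history",
            "hits": 5,
        },
        timeout=10,
    )
    resp.raise_for_status()

    # Hits come back under root.children, each with a relevance score and fields.
    for hit in resp.json().get("root", {}).get("children", []):
        print(hit["relevance"], hit["fields"].get("title"))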
Unrelated to the core topic, I really enjoy the aesthetic of their website. Another similar one is from Fixie.ai (also, interestingly, one of their customers).
This was my first thought too after reading through their blog. This feels like a no-frills website made by an engineer who makes things that just work.
The documentation is great, I really appreciate them putting the roadmap front and centre.
Yes, I like the turboxyz123 animation and its contrast with the minimalist website (it reminds me of a zen garden with a single rock). I think people nowadays, in their haste to add the latest and greatest React animation, forget that too much noise is a thing.
For me, it was a bit different, and it comes from a perspective that's a blend of cognitive science and computer science:
Complex systems can be created through the composition of simple processes that are individually easy to explain or model. Sometimes there are mysterious emergent properties in the overall system even when we can explain the components. Other times, through investigation, science, and engineering, we eventually manage to explain the entire system. It might lose a little of the magic or mystery as a result, but the system itself didn't change; our perspective and understanding did.
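A toy example of what I mean (my own sketch, nothing from the thread): Conway's Game of Life. Each cell follows one trivial, fully-explained local rule, yet patterns like gliders "move" across the grid, behavior that exists only at the level of the whole system:

    import itertools

    def step(live):
        # live is a set of (x, y) cells; count each candidate cell's live neighbors.
        counts = {}
        for (x, y) in live:
            for dx, dy in itertools.product((-1, 0, 1), repeat=2):
                if (dx, dy) != (0, 0):
                    n = (x + dx, y + dy)
                    counts[n] = counts.get(n, 0) + 1
        # Birth on exactly 3 neighbors; survival on 2 or 3.
        return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        glider = step(glider)
    # After 4 steps the same shape reappears shifted by (1, 1): the "motion"
    # is emergent; no rule anywhere mentions movement.
    print(sorted(glider))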
On that note, until we can fully explain some of the workings of our own minds, I'm reluctant to write off "just predicting the next token" as an unimportant process. It's one way to explain LLM inference simply, but simplicity of description doesn't diminish its importance. It also doesn't account for as-yet-unexplained things that may be happening as part of training.
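To make "predicting the next token" concrete, here's roughly what the inference loop looks like mechanically; a greedy-decoding sketch using GPT-2 through the Hugging Face transformers library (model and prompt chosen arbitrarily for illustration):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    ids = tokenizer("The telnet BBS era was", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(20):
            logits = model(input_ids=ids).logits   # a score for every vocab token
            next_id = logits[0, -1].argmax()       # greedily pick the likeliest
            ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

    print(tokenizer.decode(ids[0]))

That loop is the whole "simple process"; everything interesting lives inside the learned weights that produce the logits.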
Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.
Version history (for relevant dates):
[v1] Wed, 15 Jun 2022 17:32:01 UTC (59 KB)
[v2] Wed, 26 Oct 2022 05:06:24 UTC (88 KB)
Here's some site meta-history too:
https://telnet.org/history/