I'm not from that generation so that's a bit hard for me to understand. Even if you used a closed-source C compiler, wouldn't you still have been able to look at the header file, which would have been pretty self-explanatory?
And surely if you bought a C compiler, you would have gotten a manual or two with it? Documentation from the pre-Internet age tended to be much better than today.
Yeah - but you had to be a good enough programmer to really understand the headers; the 'bootstrapping' problem was real :-) Especially if you didn't live in a metropolitan/college area. My local library was really short on programming books - especially anything 'in depth'. Also, 'C' was considered a "professional's language" back then, so bookstores/libraries were more likely to have books on BASIC than 'C'.
Surely it's more of a spectrum? From a CPU, to a TPU, to a chip that hardwires softmax attention but lets you store arbitrary weights, to one that hardwires the weights directly.
The first professional commercial 4K camera came out over 23 years ago, and the first smartphones and camcorders capable of 4K video arrived back in 2013.
The MacBook Neo has a 2.5x higher multi-core Geekbench score than the i7-4960X, the top consumer CPU of 2013 (which could handle 4K video editing in h264), and 5x higher single-core performance. Plus, I'm 99% sure the MacBook Neo has a dedicated video-decoding ASIC anyway.
Fine-tuning still makes sense for cost/latency-sensitive applications. Massive context windows drastically slow down generation, and modern models' performance and instruction following ability relies heavily on a reasoning step that can consume orders of magnitude more tokens than the actual response (depending on the application), while a fine-tuned model can skip/significantly reduce that step.
Using the large model to generate synthetic data offline with the techniques you mentioned, then fine-tuning the small model on it, is an underrated technique.
You don't need 8GB of RAM or less to have memory issues. Cursor + Claude Code + Slack + Discord + Spotify + a few Docker containers + YouTube and a few browser tabs is enough to overwhelm a MacBook with 24GB of RAM.
Right now on my machine, five whole Docker containers, including two DBs and three dev servers, are taking up less RAM than Cursor, a glorified text editor.
And have you looked at RAM prices lately? It's possible that 8GB is all some people can afford.
Wtf? Once it was AI. Then the models started passing the Turing test and calling themselves AI, so we started using "AGI" to mean "truly intelligent machines". Now, per the definition you quoted, apparently even GPT-3 is AGI, so we have to use "ASI" to mean "intelligent, but artificial"?
I think I'll just keep using AI, and explain to anyone who uses that term that there is no "I" in today's LLMs, so they shouldn't use it for some years at least. And that when they legitimately can, we will have a big problem.
LLMs are artificial-intelligence illusion engines: they only "reason" insofar as there is an already-made answer in their training data that they can retrieve and, at best, tweak. Take them somewhere with no training data, give them the new axioms of your specific problem, and watch them fail, delivering incorrect gibberish as a confident answer. Humans at any level of intelligence wouldn't behave like that.
TensorFlow is largely dead; it's been years since I've seen a new repo use it. Go with JAX if you want a PyTorch alternative that can have better performance in certain scenarios.
You can actually generate surprisingly coherent text with minimal finetuning of BERT, by reinterpreting it as a diffusion model: https://nathan.rs/posts/roberta-diffusion/
I don’t see a useful definition of LLM that doesn’t include BERT, especially given its historical importance. 340M parameters is only “small” in the sense that a baby whale is small.
E.g.
void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));