BoredomIsFun's comments

> An LLM is a router and completely stateless aside from the context you feed into it.

Not the latest SSM and hybrid attention ones.
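For instance, a state-space layer carries a recurrent hidden state across tokens, so the model is not purely a function of the current context window. A minimal sketch of a linear SSM recurrence (toy dimensions and random parameters, illustrative only, not any particular model):

```python
import numpy as np

rng = np.random.default_rng(0)
d_state, d_in = 4, 2                           # toy sizes
A = rng.normal(size=(d_state, d_state)) * 0.1  # state-transition matrix
B = rng.normal(size=(d_state, d_in))           # input projection

h = np.zeros(d_state)                   # persistent hidden state
for x_t in rng.normal(size=(5, d_in)):  # five "tokens"
    h = A @ h + B @ x_t                 # state updated at every step

# h now summarizes the whole sequence seen so far: that is the statefulness
print(h.shape)
```

The point is only that `h` survives from step to step, unlike a pure attention stack that recomputes everything from the context each time.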


Going from a stateless router to a router with a lossy scratchpad is a step up, but I'm still not going to ask it to check my Lisp. That's what linters are for.

A good old illustration: https://www.ml6.eu/en/blog/large-language-models-to-fine-tun...

The it- (instruction-tuned) one is the yellow smiling dot; the pt- (pretrained) one is the rightmost monster head.


> If I offend anyone I will not be apologising for it.

What you said is simply counterfactual, so no reason to be offended.


Asimov is a widespread last name in the ex-USSR, especially Central Asia. I personally know three unrelated Asimovs.

> Local model enthusiasts often assume that running locally is more energy efficient than running in a data center,

It is a well-known 101 truism on /r/LocalLLaMA that local is rarely cheaper, unless you run it batched; then it is indeed massively (roughly 10x) cheaper.
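A back-of-the-envelope way to see it (all figures hypothetical; plug in your own GPU power draw, electricity tariff, and throughput):

```python
# Hypothetical numbers, for illustration only.
power_kw = 0.35          # GPU draw while generating, kW
price_per_kwh = 0.15     # electricity price, USD/kWh
tok_per_s_single = 30    # single-request decoding speed
tok_per_s_batched = 300  # ~10x throughput with batched inference

def usd_per_mtok(tok_per_s):
    """Electricity cost per 1M generated tokens at a given throughput."""
    hours_per_mtok = 1e6 / tok_per_s / 3600
    return hours_per_mtok * power_kw * price_per_kwh

print(round(usd_per_mtok(tok_per_s_single), 2))   # unbatched cost per 1M tokens
print(round(usd_per_mtok(tok_per_s_batched), 2))  # batched: 10x cheaper
```

Since the cost is inversely proportional to throughput, a 10x batching speedup translates directly into a 10x lower cost per token.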

> I think they mean that the DeepSeek API charges are less than it would cost for the electricity to run a local model.

Because it is hosted in China, where energy is cheap. In the ex-USSR, where I live, electricity is inexpensive too, and given that I had to run a small space heater all winter because my central heating was inadequate, running local models came out 100% free.


Hmm... no. These two things are orthogonal. Regardless, the Olmo models are open source.

1. What makes you think it was written by an LLM?

2. Where is that rule? Could you cite it?

3. How do I know you did not use an LLM for your comment?


1. Word choice, phrasing, and sentence structure make it seem likely. Ironically, one has to go on vibes. One gets a feel for the voice and tone used by LLMs after a while. It's also a new account with one comment.

2. "Don't post generated comments or AI-edited comments. HN is for conversation between humans." From https://news.ycombinator.com/newsguidelines.html

3. You don't.


Points 1 and 3 contradict each other. The last thing people need is anti-AI hysteria.

> the API-driven $trillion labs?

here we go: https://huggingface.co/collections/trillionlabs/tri-series


Please post it on /r/LocalLLaMA.


Phi-4-14b with layers duplicated (phi-4-25b) shows increased performance. Phi-4-49b is degraded relative to the 14b.
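A self-merge like that is typically a passthrough stack of layer slices. A sketch of the resulting layer schedule (phi-4-14b has 40 decoder layers; the duplicated span here is an assumption for illustration, not the published recipe):

```python
n_layers = 40               # phi-4-14b decoder layers
dup_start, dup_end = 8, 32  # assumed middle span to duplicate (hypothetical)

# Passthrough self-merge: keep layers [0, dup_end), then replay
# [dup_start, n_layers), so the middle span appears twice in the stack.
schedule = list(range(0, dup_end)) + list(range(dup_start, n_layers))

print(len(schedule))  # depth of the merged model
```

At 64 layers versus 40, the depth grows 1.6x with no new weights, which is roughly how a ~25B-scale self-merge of a 14B model comes about.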

