skybrian · 2026-04-07T01:31:33 1775525493

I don't think it's all that hard to avoid working on anything shady. It's not as easy to avoid being associated with anything shady due to widespread cynicism and a tendency to treat tech companies with thousands of projects as a monolith.

skybrian · 2026-04-07T01:11:58 1775524318

Why should we have strong priors in either direction? Maybe it will keep scaling for decades like Moore's law. Maybe not.

skybrian · 2026-04-07T00:17:10 1775521030

I guess gigawatts is how we roughly measure computing capacity at the datacenter scale? Also saw something similar here:

> Costs and pricing are expressed per “token”, but the published data immediately seems to admit that this is a bad choice of unit because it costs a lot more to output a token than input one. It seems to me that the actual marginal quantity being produced and consumed is “processing power”, which is apparently measured in gigawatt hours these days. In any case, I think more than anything this vindicates my original decision not to get too precise. [...]

https://backofmind.substack.com/p/new-new-rules-for-the-new-...

Is it priced that way, though? I assume next-gen TPUs will be more efficient?

nomel · 2026-04-07T00:50:55 1775523055

> but the published data immediately seems to admit that this is a bad choice of unit because it costs a lot more to output a token than input one

And that's silly, because API pricing is higher for output tokens than for input tokens: 5x for Anthropic [1] and 6x for OpenAI [2]!

[1] https://platform.claude.com/docs/en/about-claude/pricing

[2] https://openai.com/api/pricing
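To make the input/output asymmetry concrete, here is a minimal sketch of how a per-request bill works out when output tokens are priced at a multiple of input tokens. The function name and the $1 per million input tokens figure are made up for illustration; only the idea of an output multiplier comes from the pricing pages above.

```python
def request_cost(input_tokens, output_tokens, input_price_per_mtok, output_multiplier):
    """Dollar cost of one request when output tokens cost a multiple of input tokens."""
    output_price_per_mtok = input_price_per_mtok * output_multiplier
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000  # prices are quoted per million tokens


# Hypothetical price: $1 per million input tokens, output priced at 5x.
# 10,000 input tokens and 2,000 output tokens then cost the same amount each.
cost = request_cost(10_000, 2_000, 1.0, 5.0)
input_share = (10_000 * 1.0) / (cost * 1_000_000)
```

With a 5x multiplier, a response one fifth the length of the prompt already accounts for half the bill, which is why a single "per token" unit hides most of the cost structure.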

AlphaSite · 2026-04-07T02:40:39 1775529639

I think for the same model wall time is probably a more intuitive metric; at the end of the day what you’re doing is renting GPU time slices.

Large outputs dominate compute time so are more expensive.

IMO input and output token counts are actually still a bad metric, since they linearise nonlinear cost increases, and I suspect we’ll see another change in the future where they bucket by context length. XL output contexts may be 20x more expensive instead of 10x.

nsomaru · 2026-04-07T03:39:48 1775533188

They already bucket when context goes above 200k

refulgentis · 2026-04-07T04:55:21 1775537721

No longer

brokencode · 2026-04-07T00:34:13 1775522053

Gigawatts seems more like a statement of the power supply and dissipation of the actual facility.

I’m assuming you can cram more chips in there if you have more efficient chips to make use of spare capacity?

Trying to measure the actual compute is a moving target since you’d be upgrading things over time, whereas the power aspects are probably more fixed by fire code, building size, and utilities.

delichon · 2026-04-07T01:18:00 1775524680

Measuring data centers in watts is like measuring cars in horsepower. Power isn't a direct measure of performance, but of the primary constraint on performance. When in doubt, choose the thermodynamic perspective.

pepperoni_pizza · 2026-04-07T07:08:20 1775545700

Gigawatts are units of power; gigawatt-hours are units of energy.

The equivalent for cars would be pricing by how much gas you burned, not horsepower.
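A quick sketch of the distinction: energy is power multiplied by time, so a facility's gigawatt rating only becomes gigawatt-hours once you say how long it runs. The 1 GW figure and the assumption of running flat out all year are illustrative, not data about any real site.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years


def energy_gwh(power_gw, hours):
    """Energy in gigawatt-hours used by a constant load of `power_gw` gigawatts."""
    return power_gw * hours


# A hypothetical 1 GW facility running at full load for a year.
annual_energy = energy_gwh(1.0, HOURS_PER_YEAR)  # 8760.0 GWh
```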

stingraycharles · 2026-04-07T02:14:56 1775528096

I mean, a single nuclear reactor delivers around 1 GW, so if a single datacenter consumes the output of several of those, it gives a reasonably accurate idea of the scale.

twoodfin · 2026-04-07T01:07:10 1775524030

That these data centers can turn electricity + a little bit of fairly simple software directly into consumer and business value is pretty much the whole story.

Compare what you need to add to AWS EC2 to get the same result, above and beyond the electricity.

zozbot234 · 2026-04-07T01:23:08 1775524988

That's a convenient story, but most consumers' and businesses' use of AI is light enough that they could easily run local models on their existing silicon. Resorting to proprietary AI running in the datacenter would only add a tiny fraction of incremental value over that, and at a significant cost.

twoodfin · 2026-04-07T01:57:14 1775527034

Sure but where the puck is going is long-running reasoning agents where local models are (for the moment) significantly constrained relative to a Claude Opus 4.6.

astral_drama · 2026-04-07T02:20:21 1775528421

I'm looking forward to running a Gemma 4 turboquant on my 24GB GPU. The perf looks impressive for how compact it is.

I often get a 10x more cost effective run processing on my local hardware.

Still reaching for frontier models for coding, but find the hosted models on open router good enough for simple work.

Feels like we are jumping to warp on flops. My cores are throttled and the fiber is lit.

skybrian · 2026-04-06T19:21:04 1775503264

Any ideas for locking down remote access from an untrusted VM? Cloudflare has object-based capabilities and some similar thing might be useful to let a VM make remote requests without giving it API keys. (Keys could be exfiltrated via prompt injection.)

benswerd · 2026-04-06T19:25:03 1775503503

There are 3 solutions to this, and Freestyle supports 2 of them:

1. Freestyle supports multiple Linux users. All Linux users on the VM are locked down, so it's safe to have a part of the VM that holds your secret keys/code that the other parts cannot access.

2. A custom proxy that routes the traffic, with the keys kept outside the VM.

3. We're working on a secrets API to intercept traffic and inject keys based on specific domains and specific protocols, starting with HTTP headers, HTTP Git authentication, and Postgres. That'll land in a few weeks.
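The proxy and secrets-injection ideas can be sketched roughly as below. This is not Freestyle's API; `SECRETS_BY_HOST`, `inject_credentials`, and the example hostnames are all hypothetical. It only illustrates the core step: code running outside the VM decides, per destination host, whether to attach a real credential, so the VM itself never sees the key.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist held OUTSIDE the untrusted VM: host -> headers to inject.
SECRETS_BY_HOST = {
    "api.example.com": {"Authorization": "Bearer real-key-held-outside-the-vm"},
}


def inject_credentials(url, headers, secrets_by_host=SECRETS_BY_HOST):
    """Return the outgoing headers for a request the VM asked the proxy to make.

    Requests to hosts that are not on the allowlist are refused, and any
    Authorization header supplied by the VM is replaced by the real one.
    """
    host = urlsplit(url).hostname
    if host not in secrets_by_host:
        raise PermissionError(f"host not allowed: {host}")
    # Drop whatever credential the VM tried to send, then add the real one.
    merged = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    merged.update(secrets_by_host[host])
    return merged


outgoing = inject_credentials(
    "https://api.example.com/v1/items",
    {"Accept": "application/json", "authorization": "Bearer placeholder"},
)
```

Because the key is bound to a destination host, a prompt-injected request to some other domain fails before any secret is attached. A real proxy would also need to handle TLS, redirects, and response filtering, none of which is shown here.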

skybrian · 2026-04-06T17:58:39 1775498319

Functional languages have some good and some bad features and there's no reason to copy them all. For example, you don't need to have a Hindley-Milner type system (bidirectional is better) or currying just because it's a functional language.

troupo · 2026-04-06T19:17:31 1775503051

We need more pragmatic languages. E.g. Erlang and Elixir are functional, but eschew all the things FP purists advocate for (complex type systems, purity, currying by default etc.)

rapind · 2026-04-06T21:37:39 1775511459

If you like Erlang, Elixir, and Elm/Haskell, then Gleam + Lustre (which is TEA) is a pretty great fit.

zem · 2026-04-06T20:21:22 1775506882

OCaml has a complex type system, but it's also very pragmatic in that it doesn't force you into any one paradigm; you can do whatever works best in a given situation. (Scala arguably goes further in the "do whatever you want" direction, but it also dials the complexity way up.)

troupo · 2026-04-06T20:32:54 1775507574

Yes! Completely forgot about OCaml because I only spent a couple of months with it

skybrian · 2026-04-06T17:53:39 1775498019

I signed up for the New York Times recently, and they're surprisingly hostile towards new customers:

- Autoplaying videos on the front page with no pause button. I expect video from CNN, but not a newspaper. That's not what I'm there for.

- They send you many "introductory" emails with no way to unsubscribe.

I mostly gave up on the front page, but it's marginally useful for reading the occasional article linked to from elsewhere.

skybrian · 2026-04-06T17:45:32 1775497532

It doesn't seem very easy to calculate how much it would cost per month to keep a mostly-idle VM running (for example, with a personal web app). The $20/month plan from exe.dev seems more hobbyist-friendly for that. Maybe that's not the intended use, though?

benswerd · 2026-04-06T17:56:37 1775498197

We're not going after hobbyists. We're building the platform for companies like exe.dev to build on. That's why it's all usage-based.

That said, our $50 a month plan can be used by an individual for coding agents, but I wouldn't recommend it.

indigodaddy · 2026-04-06T18:15:59 1775499359

Ooof, if you are the middleman platform then it's sure gonna get expensive for the end user

rvz · 2026-04-06T18:38:51 1775500731

> The $20/month plan from exe.dev seems more hobbyist-friendly for that. Maybe that's not the intended use, though?

And you can go even below that by self-hosting it on a very cheap Hetzner box for $2 or $5.

skybrian · 2026-04-06T19:01:49 1775502109

Can you start up multiple VMs easily on a Hetzner box?

skybrian · 2026-04-05T23:10:50 1775430650

The reason it matters is that if they are making a profit on inference, then when people use their services more, it cuts their losses. They might even break even eventually and start making a profit without raising the price.

But if they're losing money on inference, they will lose more money when people use their services more. There's no way to turn that around at that price.
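The argument is just the sign of the per-request margin. A minimal sketch with made-up numbers (prices in cents, all values hypothetical):

```python
def monthly_profit(requests, price_per_request, cost_per_request, fixed_costs):
    """Profit for the month: per-request margin times volume, minus fixed costs."""
    return requests * (price_per_request - cost_per_request) - fixed_costs


FIXED = 1000  # hypothetical fixed costs (e.g. training), in cents

# Positive margin (price 3, cost 2): more usage shrinks the loss, then turns a profit.
profitable_low = monthly_profit(100, 3, 2, FIXED)    # -900
profitable_high = monthly_profit(2000, 3, 2, FIXED)  # 1000

# Negative margin (price 2, cost 3): more usage only deepens the loss.
losing_low = monthly_profit(100, 2, 3, FIXED)    # -1100
losing_high = monthly_profit(2000, 2, 3, FIXED)  # -3000
```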

drawfloat · 2026-04-06T08:37:47 1775464667

We don't even have any evidence that inference, excluding training, is actually profitable.

skybrian · 2026-04-05T15:37:10 1775403430

It's great for them, but I'm not really into reaction videos. Pictures taken by space probes are just as good as far as I'm concerned.

thomashabets2 · 2026-04-05T17:55:37 1775411737

It's not for the reason that the parent commenter said, and it's not the moon (yet), but you can't take photos like this with probes alone: https://www.theguardian.com/artanddesign/gallery/2026/apr/05...

skybrian · 2026-04-05T02:57:58 1775357878

This essay is asking a personal question: what do you care about?

But when you're on the job, you're getting paid to do some work by other people. So if you care about something, you have to communicate why it's important to other people - that is, make it legible.