Hacker News | adam_arthur's comments

Oil futures (months out) are priced lower than spot, presumably due to anticipation that Iran driven disruption to the market will be short lived.

(Remains to be seen whether that's true)



Yeah, but a large monorepo can consist of many small subprojects. And arguably this is becoming a best practice.

Just spawn the agent in one of the subprojects


LLMs have clearly accelerated development for the most skilled developers.

Particularly when the human acts as the router/architect.

However, I've found Claude Code and Co only really work well for bootstrapping projects.

If you largely accept their edits unchanged, your codebase will accrue massive technical debt over time and ultimately slow you down vs semi-automatic LLM use.

It will probably change once the approach to large scale design gets more formalized and structured.

We ultimately need optimized DSLs and aggressive use of stateless sub-modules/abstractions that can be implemented in isolation to minimize the amount of context required for any one LLM invocation.
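A rough sketch of what I mean by a stateless sub-module (all names here are made up): a pure function whose entire contract fits in a few lines, so it can be implemented and reviewed in isolation with minimal context.

```python
# Hypothetical sketch: a stateless sub-module whose full contract fits in a
# few lines, so an LLM (or human) can implement it without wider context.
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceQuote:
    amount_cents: int
    currency: str


def apply_discount(quote: PriceQuote, percent: float) -> PriceQuote:
    """Pure function: no I/O, no globals; output depends only on inputs."""
    discounted = round(quote.amount_cents * (1 - percent / 100))
    return PriceQuote(amount_cents=discounted, currency=quote.currency)
```

Because the output depends only on the inputs, an agent can be pointed at just this file and nothing else.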

Yes, AI will one shot crappy static sites. And you can vibe code up to some level of complexity before it falls apart or slows dramatically.


>If you largely accept their edits unchanged, your codebase will accrue massive technical debt over time and ultimately slow you down vs semi-automatic LLM use.

Worse, as it's planning the next change, it's reading all this bad code that it wrote before, but now that bad code is blessed input. It writes more of it, and instructions to use a better approach are outweighed by the "evidence".

Also, it's not tech debt: https://news.ycombinator.com/item?id=27990979#28010192


People can take on debt for all sorts of things. To go on vacation, to gamble.

Debt doesn't imply it's productively borrowed or intelligently used. Or even knowingly accrued.

So given that the term technical debt has historically been used, it seems the most appropriate descriptor.

If you write a large amount of terrible code and end up with a money producing product, you owe that debt back. It will hinder your business or even lead to its collapse. If it were quantified in accounting terms, it would be a liability (though the sum of the parts could still be net positive)

Most "technical debt" is not buying the code author anything and is materialized through negligence rather than intelligently accepting a tradeoff


All those examples were borrowing money. What you're describing as "technical debt" doesn't involve borrowing anything. The equivalent for a vacation would be to take your kids to a motel with a pool and dress up as Mickey Mouse and tell them it's "Disney World debt". You didn't go into debt. You didn't go to Disney World. You just spent what money you do have on a shit solution. Your kids quite possibly had fun, even.

> term technical debt has historically been used

There are plenty of terms that we no longer use because they cause harm.


Agreed.

What I've found is that AI can be alright at creating a Proof of Concept for an app idea, and it's great as a Super Auto-complete, but anything with a modicum of complexity, it simply can't handle.

When your code is hundreds of thousands of lines, asking an agent to fix a bug or implement a feature based on a description of the behavior just doesn't work. The AI doesn't work on call graphs; it basically just greps for strings it thinks might be relevant to find things. If you know exactly where the bug lies, it can usually find it with the context given to it, but at that point you're just as well off fixing the bug yourself rather than having the AI do it.
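To illustrate the gap (a toy example, not how any particular agent works): a call-graph index like the one below is the kind of structural view agents mostly lack when they fall back to grepping for strings.

```python
# Illustrative only: a toy call-graph extractor using Python's ast module.
# An agent with this index could follow callers/callees instead of grepping.
import ast


def call_graph(source: str) -> dict[str, set[str]]:
    """Map each function in `source` to the simple names it calls."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            graph[node.name] = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return graph
```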

The problem is that you have non-coders creating a PoC, then screaming from the rooftops how amazing AI is and showing off what it's done, but then they go quiet as the realization sets in that they can't get the AI to flesh it out into a viable product. Alternatively, they DO create a product that people start paying to use, and then they get hacked because the code is horribly insecure and hard-codes API keys.


> We ultimately need optimized DSLs and aggressive use of stateless sub-modules/abstractions that can be implemented in isolation to minimize the amount of context required for any one LLM invocation.

Containment of state also happens to benefit human developers too, and keep complexity from exploding.


Yes!

I've found the same principles that apply to humans apply to LLMs as well.

Just that the agentic loops in these tools aren't (currently) structured and specific enough in their approach to optimally bound abstractions.

At the highest level, most applications can be written in simple, plain english (expressed via function names). Both humans and LLMs will understand programs much better when represented this way
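A hedged sketch of what I mean (the function names are hypothetical): the top level reads as plain English, with the detail pushed down into the named helpers.

```python
# Sketch: the top-level flow reads as plain English via function names.
# The helpers are stubs here that just record what happened.
log = []


def validate_items(order):
    log.append("validated")


def charge_customer(order):
    log.append("charged")


def schedule_shipment(order):
    log.append("scheduled")


def send_confirmation_email(order):
    log.append("emailed")


def process_order(order):
    """Both humans and LLMs can follow the program at this level."""
    validate_items(order)
    charge_customer(order)
    schedule_shipment(order)
    send_confirmation_email(order)
```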


The most interesting thing for me is that I am sure it does.

I have been coding for 20+ years and I have used AI agents for coding a lot, especially for the last month and a half. I can't say for sure they make me faster. They definitely do for some tasks, but overall? I can solve some tasks really quickly, but at the same time my understanding of the code is not as good as it was before. I am much less confident that it is correct.

LLMs clearly make junior and mid level engineers faster, but it is much harder to say for Senior.


Valknut is pretty good at forcing agents to build more maintainable codebases. It helps them dry out code, separate concerns cohesively and organize complexity. https://github.com/sibyllinesoft/valknut


> LLMs have clearly accelerated development for the most skilled developers.

Have they so clearly? What's the evidence?


Most people's "truth" nowadays is what they've heard enough people say is true. Not objective data/measures. What people believe is true, and say is true, IS truth, to them.


> accrue massive technical debt

The primary difference between a programmer and an engineer.


> We ultimately need optimized DSLs and aggressive use of stateless sub-modules/abstractions that can be implemented in isolation to minimize the amount of context required for any one LLM invocation

Wait till you find out about programming languages and libraries!

> It will probably change once the approach to large scale design gets more formalized and structured

This idea has played out many times over the course of programming history. Unfortunately, reality doesn’t mesh with our attempts to generalize.


Relying on the model for security is not security at all.

No amount of hardening or fine-tuning will make them immune to takeover via untrusted context


Comment definitely reads like AI


Implies they will pay cash value to equity holders as opposed to issuing NVDA shares.

(Electronically)


Is this to somehow screw the employees with RSUs or what?


No, it doesn't really matter if they pay in cash or stock. If you think NVDA has room to run you're welcome to use your buyout money to buy NVDA on the open market.


Well, this isn't framed as a buyout/takeover, so I was curious how existing RSUs would be cashed out?

This deal is framed as an IP transfer and talent transfer without owning the full company. Probably to skirt antitrust, among other things.


I'm not sure in this specific case. They could choose to pay the employees some portion of the funds.

If not, the owners are likely liable to be sued for "selling in effect" without paying equity holders.

Presuming the company becomes a de facto subsidiary of Nvidia (even if not legally so)

My guess, without researching it, is they will compensate existing equity holders to avoid that possibility. I mean the valuation multiple is enormous, it's worth it simply to derisk the legal aspect.


For vested RSUs it's likely that the Groq husk will pay out the $20B as a dividend or buyback or something. I don't know if unvested RSUs are accelerated or just canceled. Of course the employees will receive new RSUs when they join Nvidia.


Local tools/skills/function definitions can already invoke any API.

There's no real benefit to the MCP protocol over a regular API with a published "client" a local LLM can invoke. The only downside is you'd have to pull this client beforehand.

I am using local "skill" as reference to an executable function, not specifically Claude Skills.

If the LLM/Agent executes tools via code in a sandbox (which is what things are moving towards), all LLM tools can be simply defined as regular functions that have the flexibility to do anything.

I seriously doubt MCP will exist in any form a few years from now
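To make the point concrete (the endpoint and names below are placeholders, not any real service): in a code-executing sandbox, a "tool" is just an ordinary function wrapping an API, with no protocol layer in between.

```python
# Sketch (names and endpoint hypothetical): a "tool" is a plain function
# the agent can call from generated code -- no MCP server required.
import json
import urllib.request


def get_weather(city: str) -> dict:
    """Ordinary function wrapping an arbitrary HTTP API."""
    url = f"https://api.example.com/weather?city={city}"  # placeholder endpoint
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# The sandbox exposes tools as a simple name -> callable registry.
TOOLS = {"get_weather": get_weather}
```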


Storage doesn't require the same capex/upfront investment to get that margin.

How much does it cost to train a cutting edge LLM? Those costs need to be factored into the margin from inferencing.

Buying hard drives and slotting them in also has capex associated with it, but far less in total, I'd guess.


> How much does it cost to train a cutting edge LLM? Those costs need to be factored into the margin from inferencing.

They don't, though! I can buy hardware off the shelf, host open source models on it, and then charge for inference:

https://parasail.io, https://www.baseten.co


Yes, which is why the companies that develop the models aren't cost viable. (Google and others who can subsidize it at a loss obviously are excepted)

Where is the return on the model development costs if anybody can host a roughly equivalent model for the same price and completely bypass the model development cost?

Your point is in line with the entire bear thesis on these companies.

For any use cases which are analytical/backend oriented, and don't scale 1:1 with number of users (of which there are a lot), you can already run a close to cutting edge model on a few thousand dollars of hardware. I do this at home already


Open source models are still a year or so behind the SotA models released in the last few months. The price-to-performance, however, is definitely in favor of open source models.

DeepMind is actively using Google’s LLMs on groundbreaking research. Anthropic is focused on security for businesses.

For consumers it’s still a better deal for a subscription than to invest a few grand in a personal LLM machine. There will be a time in the future where diminishing returns shortens this gap significantly, but I’m sure top LLM researchers are planning for this and will do whatever they can to keep their firm alive beyond the cost of scaling.


Definitely

I am not suggesting these companies can't pivot or monetize elsewhere, but the return on developing a marginally better model in-house does not really justify the cost at this stage.

But to your point, developing research, drugs, security audits or any kind of services are all monetization of the application of the model, not the monetization of the development of new models.

Put more simply, say you develop the best LLM in the world, one that's 15% better than peers on release, at a cost of $5B. What is that same model/asset worth 1 year later when it performs at 85% of the latest LLM?

Already any 2023 and perhaps even 2024 vintage model is dead in the water and close to 0 value.

What is a best in class model built in 2025 going to be worth in 2026?

The asset is effectively 100% depreciated within a single year.

(Though I'm open to the idea that the results from past training runs can be reused for future models. This would certainly change the math)
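Back-of-the-envelope, using the made-up numbers from this comment: a $5B training run that fully depreciates in one year has to be recouped from that single year's inference margin.

```python
# Assumed numbers, for illustration only.
TRAIN_COST = 5e9          # one-off model development cost, USD
USEFUL_LIFE_YEARS = 1     # model is ~fully depreciated within a year
GROSS_MARGIN = 0.30       # assumed margin on inference revenue

# Inference revenue needed per year just to cover model depreciation:
required_revenue = TRAIN_COST / USEFUL_LIFE_YEARS / GROSS_MARGIN
print(f"${required_revenue / 1e9:.1f}B/yr")  # ~$16.7B of inference revenue
```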


For sure, all these companies are racing to have the strongest model, and as time goes on we quickly start reaching diminishing returns. DeepSeek came out at the beginning of this year, blew everyone's minds, and now look at how far the industry has progressed beyond it.

It doesn't even seem like these companies are in a battle of attrition to not be the first to go bankrupt. Watching this would be a lot more exciting if that was the case! I think if there was less competition between LLMs developers could slow down, maybe.

Looking at the prices of inference of open-source models, I would bet proprietary models are making a nice margin on API fees, but there is no way OpenAI will make their investors whole because they make a few dollars of revenue per million tokens. I am terrified of the world we will live in if OpenAI is able to reverse their balance sheet. I think there's nowhere else that investors want to put their money.


The other nightmare for these companies is that any competitor can use their state of the art model for training another model, as some Chinese models are suspected of doing. I personally think it's only fair, since those companies trained on a ton of data in the first place and nobody agreed to it. But it shows that training frontier models has really low returns on investment.


Yes you’re right. Capex spend is definitely higher.

In the end it comes all down to the value provided as you see in the storage example.


Probably it's "operationally profitable" when ignoring capex, depreciation, dilution and other required expenses to stay current.

Of course that means it's unprofitable in practice/GAAP terms.

You'd have to have a pretty big margin on inference to make up for the model development costs alone.

A 30% margin on inference for a GPU that will last ~7 years will not cut it
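Rough illustration with assumed numbers (the GPU price and per-GPU revenue below are guesses, not data): even before counting model development, a 30% inference margin pays back the card slowly relative to its useful life.

```python
# Assumed numbers, for illustration only.
GPU_COST = 30_000          # assumed price of one datacenter GPU, USD
ANNUAL_REVENUE = 20_000    # assumed inference revenue per GPU per year, USD
MARGIN = 0.30              # gross margin on that inference revenue

annual_profit = ANNUAL_REVENUE * MARGIN      # $6,000/yr per GPU
payback_years = GPU_COST / annual_profit     # ~5 years of a ~7-year life
```

And that payback window still leaves nothing to recoup the training runs themselves.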


I'm confused by all the takes implying decode is more important than prefill.

There are an enormous number of use cases where the prompt is large and the expected output is small.

E.g. providing data for the LLM to analyze, after which it gives a simple yes/no Boolean response. Or selecting a single enum value from a set.
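A sketch of that pattern (the request fields are generic, not any particular API): thousands of prefill tokens in, a one-token constrained answer out.

```python
# Illustrative only: prefill-heavy, decode-light request shape.
from enum import Enum


class Verdict(str, Enum):
    YES = "yes"
    NO = "no"


def build_request(document: str) -> dict:
    """Large prompt in, a single constrained enum token out."""
    return {
        "prompt": f"Does this filing mention a dividend? Answer yes or no.\n\n{document}",
        "allowed_outputs": [v.value for v in Verdict],  # constrained decoding
        "max_tokens": 1,
    }
```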

This pattern seems far more valuable in practice, than the common and lazy open ended chat style implementations (lazy from a product perspective).

Obviously decode will be important for code generation or search, but that's such a small set of possible applications, and you'll probably always do better being on the latest models in the cloud.

