
Does this vindicate Destin from Smarter Every Day?

Two years ago, he presented concerns to NASA.

https://youtu.be/OoJsPvmFixU


No, it doesn't. Literally anybody who knows anything about NASA and follows the space industry in detail has known about most of these issues since 2015, or even 2011, when this whole post-Constellation shit-show started. And many of the problems have been talked about since the day NASA created Artemis. Destin is just more famous than many of the people in nerd forums.

Destin's analysis is OK and he makes a number of good points, but it is very pro-Alabama (Mafia) inside NASA and its contractors, since he is very clearly influenced by the strong Alabama presence, and those are the parts of the industry he interacts with.

So Destin misses a huge amount of the relevant puzzle pieces, or he simply doesn't talk about them.

He also simply makes a few assumptions that are fundamentally wrong, namely about the different targets of the program. The goal was never to repeat Apollo, and landing a few people a few times is totally different from the original goals of Artemis.


Had the same thought. NASA already cracked this nut with Apollo; if you’re gonna crack it again and differently, be real sure your solution is better.

Keeping track of the different AI product names is so confusing even from a single company.

Why can't Google, for example, just call:

  Gemini Image = Nano Banana
  Gemini Video = Veo
  ...

Let alone that Nano Banana 2 is Gemini Image 3.1

Aren’t these optimizations less about PHP and more about optimizing how you’re using the database?

PHP is kind of like C. It can be very fast if you do things right, and it gives you more than enough rope to tie yourself in knots.

Making your application fast is less about tuning your runtime and more about carefully selecting what you do at runtime.

Runtime choice does still matter. An environment where you can reasonably separate sending database queries from receiving the results (async communication), or that otherwise lets you pipeline requests, will tend to have higher throughput if used appropriately; batching queries can narrow the gap, though. Languages with easy parallelism can make individual requests faster, at least while you have available resources. Etc.
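The pipelining point is easy to demonstrate with a toy simulation (Python rather than PHP, and the per-query round-trip time is made up): N sequential queries pay N round trips, while N queries in flight at once pay roughly one.

```python
import asyncio
import time

RTT = 0.05  # assumed per-query round-trip latency to the database, in seconds

async def query(sql: str) -> str:
    """Stand-in for a database call: one network round trip per query."""
    await asyncio.sleep(RTT)
    return f"result of {sql}"

async def sequential(queries):
    # One round trip per query, paid one after another.
    return [await query(q) for q in queries]

async def pipelined(queries):
    # All queries in flight at once; total wait is roughly one round trip.
    return await asyncio.gather(*(query(q) for q in queries))

queries = [f"SELECT {i}" for i in range(10)]

start = time.perf_counter()
asyncio.run(sequential(queries))
seq_elapsed = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(pipelined(queries))
pipe_elapsed = time.perf_counter() - start

print(f"sequential: {seq_elapsed:.2f}s, pipelined: {pipe_elapsed:.2f}s")
```

Batching (sending the ten statements as one request) gets a similar win by a different route: you still pay one round trip, but you lose the ability to react to each result individually.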

A lot of popular PHP programs and frameworks start by spending lots of time assembling a beautiful sculpture of objects that will be thrown away at the end of the request. Almost everything is going to be thrown away at the end of the request; making your garbage beautiful doesn't usually help performance.


Would love to read more stories by you toast0 on things you've optimized in the past (given the huge scale you've worked on). Lessons learned, etc. I always find your comments super interesting :)

<3 I always love seeing your comments and questions, too!

Well on the subject of PHP, I think I've got a nice story.

The more recent one is about Wordpress. One day, I had this conversation:

Boss: "will the blog stay up?"

toast0: "yeah, nobody goes to the blog, it's no big deal"

Boss: "they will"

toast0: "oh, ummmm we can serve a static index.html and that should work"

Later that day, he posted https://blog.whatsapp.com/facebook ; I took a snapshot to serve as index.html, and the blog stayed up. A few months later, I had a good reason to tear out WordPress (which I had been wanting to do for a long time), so I spent a week and made FakePress, which only did exactly what we needed and could serve our very exciting blog posts in something like 10-20 ms per page view instead of whatever WordPress took (which was especially not very fast if you hit a www server that wasn't in the same colo as our database servers).

That worked pretty well, until the blog was rewritten to run on the FB stack --- page weight doubled, but since it was served by the FB CDN, load time stayed about the same. The process to create and translate blog entries was completely different, and the RSS was non-compliant: I didn't want to include a time with the date, and there is/was no timeless date field available in any of the RSS specs, so I just left the time out ... but it was sooo much nicer to run.

Sadly, I haven't been doing any large scale optimization stuff lately. My work stuff doesn't scale much at the moment. Personal small scale fun things include polishing up my crazierl [1] demo (I will update the published demo in the next few days, or email me for the release candidate URL), adding IPv6 to my Path MTU Discovery Test [2] since I now have somewhere to run IPv6 at MTU 1500, and writing memdisk_uefi [3], which is like Syslinux's MEMDISK but in UEFI. My goal with memdisk_uefi is to get FreeBSD's installer images to be usable with PXE in UEFI ... as of FreeBSD 15.0, in BIOS mode you can use PXE and MEMDISK to boot an installer image, but UEFI is elusive --- I got some feedback from FreeBSD suggesting a different approach than what I have, but I haven't had time to work on that; hopefully soonish. Oh, and my Vanagon doesn't want to run anymore ... but it's cold out and I don't seem to want to follow the steps in the fuel system diagnosis, so that's not progressing much. I did get a back seat in good shape, though, so now it can carry 5 people nowhere instead of only two (caveat: I don't have seat belts for the rear passengers, which would be unsafe if the van were running).

[1] https://crazierl.org/

[2] http://pmtud.enslaves.us/

[3] https://github.com/russor/memdisk_uefi


Re: PHP vs a rendered index.html … your story brings back fond memories of my college days (around 2001–2002).

I was a full-time student but also worked for the university’s “internet group.” We ran a homegrown PHP CMS (this was before WordPress/Movable Type), and PHP still felt pretty new. Perl was everywhere, but I was pushing PHP because I’d heard Yahoo had started using it.

Around then, the university launched its first online class registration system. Before that it was all phone/IVR. I warned our team lead the web server would melt down on registration day because every student would be hammering refresh at 9am to get the best class times and professors. He brushed it off, so I pre-rendered the login page as a static index.html and dropped it in the web root.

He noticed, got mad (he had built the CMS and was convinced it could handle the load), and deleted my pre-rendered index.html. So young and dumb me wrote a cron job that pinged the site every few minutes, and if it looked down, it copied my static index.html back into the web directory. Since Apache would serve index.html ahead of PHP, it became an instant fallback page.
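The check-and-restore step of that cron job might look something like this today (a hedged reconstruction, not the original script; the URL and file paths are placeholders):

```python
import shutil
import urllib.error
import urllib.request

def check_and_restore(url: str, snapshot: str, webroot_index: str) -> bool:
    """Ping the site; if it looks down, copy the static snapshot into the
    web root. Since Apache serves index.html ahead of index.php, the copy
    becomes an instant fallback page. Returns True if the fallback was
    deployed, False if the site answered normally."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            if resp.status == 200:
                return False  # site is up, nothing to do
    except (urllib.error.URLError, OSError):
        pass  # connection refused, timeout, 5xx wrapped as HTTPError, etc.
    shutil.copy(snapshot, webroot_index)
    return True
```

Scheduled from cron every few minutes, this is the whole "dead-man's switch": the CMS keeps serving pages while healthy, and the static snapshot takes over the moment it stops answering.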

Sure enough, at 9am the entire university website went down. Obviously orders of magnitude less scale than your FB story (and a way less exciting event), but for my small university it was a brief moment of panic. My little cron job kicked in, though, and at least kept the front door standing.

While I’m not in active day-to-day development anymore, I do still work in tech and think a lot about ways to avoid computation. And something I’ve learned from reading your posts over the years, and from my own experiences, is just how big you can scale when you architect in a way that “just pushes bits” (eg “index.html”) as opposed to computing/transforming/rendering something … and I’m not sure you can ever really learn that except through real world experience.

Regarding your links, I’ve seen you post about 1 before and have read about it - it looks very cool. I don’t recall seeing 2 or 3 before and look forward to reading more about those. Thanks as always for your insights!


> Regarding your links, I’ve seen you post about 1 before and have read about it - it looks very cool. I don’t recall seeing 2 or 3 before and look forward to reading more about those. Thanks as always for your insights!

So #1 now has dist connection stuff as of a few hours ago. Not super obvious, but you can load two (or more) nodes and call nodes() and see they're connected. Dist connection opens up lots of neat possibilities... but I do need to add an obvious application so it's like actually neat instead of just potentially neat.

#2 is a pretty neat way to diagnose path mtu problems. And I've been seeing people use it and link to it on networking forums all over, even forums in other languages. Which is pretty awesome. Maybe a few links in forums over the past year, but it's always cool to see people using stuff I built mostly for me. :)

#3 is like I dunno, probably not that useful, I think you could do a lot of similar stuff already, but it felt like a tool that was missing... but I also got some feedback that maybe there's other ways to do it already too, so shrug. But pxe booting is always fun.


It's still valid as an example to the language community of how to apply these optimizations.

In all my years doing database tuning/admin/reliability/etc., performance problems have overwhelmingly fallen into the bad-query / bad-data-pattern categories. The data platform is rarely the issue.

The worst offenders I've seen were looping over a shitty ORM

hey don’t forget, that shitty ORM also empowers you to write beautiful, fluent code that, under the hood, generates a 12-way join that brings down your entire database.

And that is true across languages.

Isn’t this just normal KYC (for account opening)?

What am I missing?

https://withpersona.com/customers/openai


There's nothing normal about it.

Lua is designed for the use case of being embedded.

> "Lua: an extensible embedded language"

https://www.lua.org/ddj.html


In enterprise software, this is an embedded/OEM use case.

And historically, embedded/OEM use cases have always had different pricing models, for a variety of reasons.

How is this any different than this long established practice?


It's not, but do you really think the people having Claude build wrappers around Claude were ever aware of how services like this are typically offered?

Super interesting work.

Q: How is your AAP different from the industry work happening on Intent/Instructions?


The short version: instructions tell the model what to do. An Alignment Card declares what the agent committed to do — and then a separate system verifies it actually did.

Most intent/instruction work (system prompts, Model Spec, tool-use policies) is input-side. You're shaping behavior by telling the model "here are your rules." That's important and necessary. But it's unverifiable — you have no way to confirm the model followed the instructions, partially followed them, or quietly ignored them.

AAP is an output-side verification infrastructure. The Alignment Card is a schema-validated behavioral contract: permitted actions, forbidden actions, escalation triggers, values. Machine-readable, not just LLM-readable. Then AIP reads the agent's reasoning between every action and compares it to that contract. Different system, different model, independent judgment.

Bonus: if you run through our gateway (smoltbot), it can nudge the agent back on course in real time — not just detect the drift, but correct it.

So they're complementary. Use whatever instruction framework you want to shape the agent's behavior. AAP/AIP sits alongside and answers the question instructions can't: "did it actually comply?"
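As a reader's sketch of the contract-checking idea (not AAP's actual schema or API — the field names below are invented), a machine-readable card plus deterministic evaluation could look like:

```python
# Toy illustration of checking an agent's action log against a
# behavioral contract. The card layout here is made up for the sketch;
# the real Alignment Card format is defined by the AAP spec.
CARD = {
    "permitted": {"read_file", "search_web", "send_summary"},
    "forbidden": {"delete_file", "send_email"},
    "escalate_on": {"payment"},  # actions that require human sign-off
}

def check_actions(card: dict, actions: list[str]) -> dict:
    """Deterministic rule evaluation: every action is classified against
    the card. No LLM judgment is involved at this stage."""
    verdict = {"violations": [], "escalations": [], "unknown": []}
    for a in actions:
        if a in card["forbidden"]:
            verdict["violations"].append(a)
        elif a in card["escalate_on"]:
            verdict["escalations"].append(a)
        elif a not in card["permitted"]:
            verdict["unknown"].append(a)  # not covered by the contract
    verdict["compliant"] = not verdict["violations"]
    return verdict

v = check_actions(CARD, ["read_file", "send_email", "payment"])
print(v)
```

The point of the machine-readable card is exactly this: once the contract is data rather than prose, the "did it comply?" step can be plain code, auditable on its own.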


> Then AIP reads the agent's reasoning between every action and compares it to that contract.

How would this work? Is one LLM used to “read” (and verify) another LLM’s reasoning?


Yep... fair question.

So AIP and AAP are protocols. You can implement them in a variety of ways.

They're implemented on our infrastructure via smoltbot, which is a hosted (or self-hosted) gateway that proxies LLM calls.

For AAP it's a sidecar observer running on a schedule. Zero drag on the model performance.

For AIP, it's an inline conscience observer and a nudge-based enforcement step that monitors the agent's thinking blocks. ~1 second latency penalty - worth it when you must have trust.

For both, they use Haiku-class models for intent summarization; actual verification is via the protocols.


Dumb question: don’t you eventually need a way to monitor the monitoring agent?

If a second LLM is supposed to verify the primary agent’s intent/instructions, how do we know that verifier is actually doing what it was told to do?


Not a dumb question — it's the right one. "Who watches the watchmen" has been on my mind from the start of this.

Today the answer is two layers:

The integrity check isn't an LLM deciding if it "feels" like the agent behaved. An LLM does the analysis, but the verdict comes from checkIntegrity() — deterministic rule evaluation against the Alignment Card. The rules are code, not prompts. Auditable.

Cryptographic attestation. Every integrity check produces a signed certificate: SHA-256 input commitments, Ed25519 signature, tamper-evident hash chain, Merkle inclusion proof. Modify or delete a verdict after the fact, and the math breaks.
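The tamper-evidence property is easy to demonstrate in miniature (a SHA-256 hash chain only; the Ed25519 signatures and Merkle proofs mentioned above are omitted from this sketch):

```python
import hashlib
import json

def chain_append(chain: list[dict], verdict: dict) -> None:
    """Each entry commits to the previous entry's hash, so editing or
    deleting any earlier verdict changes every hash after it."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps(verdict, sort_keys=True)
    h = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"prev": prev, "verdict": verdict, "hash": h})

def chain_verify(chain: list[dict]) -> bool:
    """Recompute every hash from the start; any mismatch means tampering."""
    prev = "0" * 64
    for entry in chain:
        if entry["prev"] != prev:
            return False
        body = json.dumps(entry["verdict"], sort_keys=True)
        if hashlib.sha256((prev + body).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

chain: list[dict] = []
for i in range(3):
    chain_append(chain, {"check": i, "compliant": True})

print(chain_verify(chain))                # intact chain verifies
chain[1]["verdict"]["compliant"] = False  # rewrite a past verdict
print(chain_verify(chain))                # the math breaks
```

A signature over the chain head then pins the whole history: an attacker would have to re-hash every later entry and re-sign, which is what the Ed25519 layer prevents.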

Tomorrow I'm shipping interactive visualizations for all of this — certificate explorer, hash chain with tamper simulation, Merkle tree with inclusion proof highlighting, and a live verification demo that runs Ed25519 verification in your browser. You'll be able to see and verify the cryptography yourself at mnemom.ai/showcase.

And I'm close to shipping a third layer that removes the need to trust the verifier entirely. Think: mathematically proving the verdict was honestly derived, not just signed. Stay tuned.


Appreciate all you’re doing in this area. Wishing you the best.

You're welcome - and thanks for that. Makes up for the large time blocks away from the family. It does feel like potentially the most important work of my career. Would love your feedback once the new showcase is up. Will be tomorrow - preflighting it now.

Just something to keep in mind…

Testifying before Congress is brutally stressful. Even the most prepared CEOs can freeze up, lose their train of thought, or misspeak under that kind of pressure.

And the media often hunts for “gotcha” lines without acknowledging how easy it is to make an unintentional misstatement in that setting.

Note: I’m not weighing in on whether Zuckerberg’s statements were accurate ... I’m just pointing out the pressure dynamic that often gets overlooked.


He also gets the best training money can buy and CEOs make enough to justify them giving a perfect presentation. It's not like you ask some lone employee to do it. The proverbial buck stops somewhere.

He also gets the softest questions money can buy, because he's bought practically every congressperson.

I really like @mitchellh perspective on this topic of moving off GitHub.

---

> If you're a code forge competing with GitHub and you look anything like GitHub then you've already lost. GitHub was the best solution for 2010. [0]

> Using GitHub as an example but all forges are similar so not singling them out here This page is mostly useless. [1]

> The default source view ... should be something like this: https://haskellforall.com/2026/02/browse-code-by-meaning [2]

[0] https://x.com/mitchellh/status/2023502586440282256#m

[1] https://x.com/mitchellh/status/2023499685764456455#m

[2] https://x.com/mitchellh/status/2023497187288907916#m


Person who pays for AI: We should make everything revolve around the thing I pay for

The amount of inference required for semantic grouping is small enough to run locally. It can even be zero if semantic tagging is done manually by authors, reviewers, and just readers.

Where did "AI for inference" and "semantic tagging" come from in this discussion? Typically, for code repositories, AIs/LLMs are doing reviews/tests/etc.; I'm not sure what semantic tagging is or where it fits, even if done manually by humans.

And besides that, have you actually tried/tested whether "the amount of inference required for semantic grouping is small enough to run locally"?

While you can definitely run local inference on GPUs [even on ~6-year-old GPUs, and it would not be slow], on ordinary CPUs it's pretty annoyingly slow (and takes up 100% of all CPU cores). Supposedly unified memory (Strix Halo and such) makes it faster than an ordinary CPU, but it's still (much) slower than a GPU.

I don't have Strix Halo or that type of unified memory Mac to test that specifically, so that part is an inference I got from an LLM, and what the Internet/benchmarks are saying.


The stuff he says in [1] completely does not match my usage. I absolutely do use fork and star. I use release. I use the homepage link, and read the short description.

I'm also quite used to the GitHub layout and so have a very easy time using Codeberg and such.

I am definitely willing to believe that there are better ways to do this stuff, but it'll be hard to win over detractors if it causes friction, and unfamiliarity causes friction.


I really don't get this... like, you're a code checkout away from just asking Claude locally. I get that it is a bit more friction, but "you should have an agent prompt on your forge's page" is a _huge_, costly ask!

I say this as someone who does browse the web view for repos a lot, so I get the niceness of browsing online... but even then sometimes I'm just checking out a repo cuz ripgrep locally works better.


This looks like a confusing mess to me.

for [1] he's right for his specific use case

when he's working on his own project, obviously he never uses the about section or releases

but if you're exploring projects, you do

(though I agree the tree view is bad for everyone)


I also check for the License of a project when I'm looking at a project for the first time. I usually only look at that information once, but it should be easily viewed.

I also look for releases if it's a program I want to install... much easier to download a processed artifact than pull the project and build it myself.

But, I think I'm coming around to the idea that we might need to rethink what the point of the repository is for outside users. There's a big difference in the needs of internal and external users, and perhaps it's time for some new ideas.

(I mean, it's been 18 years since GitHub was founded; we're due for a shakeup)


Hrm. Mitchell has been very level-headed about AI tools, but this seems like a rare overstep into hype territory.

"This new thing that hasn't been shipped, tested, proven, in a public capacity on real projects should be the default experience going forwards" is a bit much.

I for one wouldn't prefer a pre-chewed machine analysis. That sounds like an interesting feature to explore, but why does it need to be forced into the spotlight?



Oh FFS. Twitter really brings out the worst in people. Prefer the more deeply insightful and measured blog posting persona.

Aren't they literally moving off GitHub _because_ of LLMs and the enshittification that optimising for them causes? This line of thinking and these features seem to push people _off_ your platform, not onto it.

https://minifeed.net is another similar site that I’ve enjoyed.

