Hacker News | dimitri-vs's comments

I think they already are. When I used the prompt with 5.2 it gave very concise, general info, but if you use older models (5.1 Instant or o3) you get a ton of detail.

I just tried 5.1 and got the exact same output as with 5.2 (actually, I got slightly less info with 5.1).

Measuring the behavior of non-deterministic systems requires more than one sample.
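To make that concrete, here is a minimal sketch of comparing repeated samples instead of a single reply. `make_fake_model` is a hypothetical stand-in for a real model call (not any vendor's API), and word count is just an assumed proxy for "how much detail" a reply contains.

```python
import random
import statistics
from typing import Callable, Dict, List


def make_fake_model(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a real model call: returns replies of random length."""
    rng = random.Random(seed)

    def generate(prompt: str) -> str:
        # Assumption: reply length varies from call to call, like a sampled LLM.
        return " ".join(["word"] * rng.randint(5, 50))

    return generate


def sample_lengths(generate: Callable[[str], str], prompt: str, n: int = 20) -> List[int]:
    """Call the generator n times and record the word count of each reply."""
    return [len(generate(prompt).split()) for _ in range(n)]


def summarize(lengths: List[int]) -> Dict[str, float]:
    """Summarize the spread so two models can be compared on more than one draw."""
    return {
        "n": len(lengths),
        "mean": statistics.mean(lengths),
        "stdev": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        "min": min(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    for name, seed in [("model_a", 1), ("model_b", 2)]:
        print(name, summarize(sample_lengths(make_fake_model(seed), "same prompt")))
```

If the ranges for two models overlap heavily, a one-off difference in output length probably says little about either model.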

I've tried a bunch of them only to settle on using Claude Code with remote control.

As others have said: accountability

What's your budget? https://en.tokyodevices.com/items/128

But seriously you can probably DIY something a lot cheaper.


It's for people who don't know how to set up a messenger integration and a scheduler, or don't want to be bothered with it.

How would it know you've run out of milk?

I told it when I noticed. I made a little pendant with a mic I can speak into and it goes to the bot.

Turns out Humane was ahead of its time.

I would love to hear more about this!

I haven't written it up yet but the repo is here:

It's just a MEMS mic, a battery, and an ESP32, very simple but it works amazingly well. I wrote a companion Android app for it and it works extremely reliably!


I really love that. Can't wait for your writeup!

The pendant is almost ready, I'll write it up this week!

Sneak peek: https://imgz.org/i6xDDz6x/


Wrong picture, it’s too small! ;) :D

Thank you, must make one!


I'm going to make it 40% smaller when the small battery arrives! I really have to write the article, but I've been working on my bot all day, which is becoming extremely amazing.

Wow :D puny!

Your projects are amazing, saw your site a couple days ago and just saw your submissions now, love it and thanks!


Thanks, I'm glad you like them!

Are you running NanoClaw or a different project?


Yep, I'm running my own thing (link in sibling), I wanted something secure I could run on my PC.

IMO Copilot was "we need to give these people rope, but not enough for them to hang themselves". A non-technical person with no patience and access to a real AI agent inside a business is a bull in a china shop. Copilot Cowork is the closest thing we have to what Copilot should have been, and it is only possible now because models finally got good enough to need less supervision.

FWIW Gemini inside Google apps is just as bad.


I don't think LLMs are very good at introspection on what they know or don't know, but otherwise this is gold. Thanks for sharing.


API Opus 4.6 will tell you it's still 2025, admit it's wrong, then revert to being convinced it's 2025 as it nears its context limit.

I'll go so far as to say LLM agents are AGI-lite, but saying we "just need the orchestration layer" is like saying "OK, we have a couple of neurons, now we just need the rest of the human."


Giving Opus a memory or real-time access to the current year is trivial. I don't see how that's an argument against it being AGI.
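As a rough illustration of the "current year" part, a wrapper can stamp the date into the system prompt on every request. This is only a sketch: `build_system_prompt` is a made-up helper name, and how the resulting string is passed to a model depends on whichever client you use.

```python
from datetime import datetime, timezone
from typing import Optional


def build_system_prompt(base: str, now: Optional[datetime] = None) -> str:
    """Append today's date (UTC) to a base system prompt.

    Hypothetical helper: the caller is assumed to send the returned string
    as the system prompt on each request, so the date is always fresh.
    """
    now = now or datetime.now(timezone.utc)
    return f"{base}\n\nCurrent date (UTC): {now:%Y-%m-%d}."


if __name__ == "__main__":
    print(build_system_prompt("You are a helpful assistant."))
```

Whether the model keeps honoring that line deep into a long context is a separate question, which is the behavior the parent comment describes.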


Manual orchestration is a brittle crutch IMO - you don't get to the moon by using longer and longer ladders. A powerful model should, in theory, be able to self-orchestrate with basic tools and an environment. The thing is that it also might be as expensive as a human to run, from a tokens AND liability perspective.

