Hacker News | ilaksh's comments

Great article. If you narrowed the audience to a more discerning group, you could go even further, with predictions incorporating robotics that will also displace blue-collar workers.

Integration of intelligence into humanoid robots is rapidly improving. Some indicators: multiple recent demos of learning from human demonstrations or from video, doing household tasks like putting dishes in the dishwasher and folding clothes, dramatic adaptive acrobatic performances, etc.

We have to anticipate that within the next couple of years, general purpose intelligence becomes standard in humanoid robots. And so a similar story about blue collar work could be written.


This is really hopeful and I agree with a lot of the predictions. The problem is that the number of humans needed to produce video games, movies, etc. will be a tenth or a hundredth of what it is today.

And since human attention can only spread across so many different entertainment items, there will not be nearly enough opportunities for all of the humans. Even if many convert to AI- and robotics-enhanced entrepreneurship.

I actually think that this can work out if we just assume humans have some value and right to live, identify the actual humans, track resources a bit better, and make sure enough robots are employed maintaining key resources for humans, like food.

But that can only happen if decision makers actually agree that all humans have value and are willing to figure out how to apply that assumption globally and fairly.


I guess you are assuming that all human attention spans are highly correlated, and all want to consume the same Marvel movies, so that only the people who work on Marvel movies are employed.

No, I am just assuming that the number of popular movies is not going to increase by 100x. So say it increases by a factor of 10 somehow. That still requires far fewer humans to produce them and leaves most people without a job or a viable business. They can make movies, but the audience won't be large enough for most of them to make a living.

Right, but why the heck would you guess 100 years when we could build and adopt that in less than two weeks? There are already many people working on this type of thing. Some of them have been working on it for years, and a few probably already have solutions ready to go or even in use.

I was using 100 years as a way to handwave the timeframe to emphasize that this will happen some time in the future.

Who is actually trying to use a fully autonomous AI employee right now?

Isn't everyone using agentic copilots or workflows with agent loops in them?

It seems that they are arguing against doing something that almost no one is doing yet.

But actually the AI Employee is coming by the end of 2026, and the fully autonomous AI Company sometime in 2027.

Many people have been working on versions of these things for a while. But again, for actual work, 99% are still using copilots or workflows with well-defined agent-loop nodes. As far as I know.

As a side note, I have found that a supervisor agent with a checklist can fire off subtasks, and that works about as well as a workflow defined in code.
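That pattern is small enough to sketch. A minimal, hypothetical version of a checklist-driven supervisor, where `run_subtask` is a stand-in for a real agent/LLM call rather than any particular framework's API:

```python
# Hypothetical sketch: a supervisor agent that walks a checklist and
# dispatches each item as a subtask, instead of executing a workflow
# graph defined in code. `run_subtask` stands in for a worker agent call.

def run_subtask(item: str) -> str:
    # Placeholder worker: a real system would invoke an agent/LLM here.
    return f"done: {item}"

def supervise(checklist: list[str]) -> dict[str, str]:
    """Fire off each checklist item as a subtask and record the results."""
    results: dict[str, str] = {}
    for item in checklist:
        results[item] = run_subtask(item)
    return results

if __name__ == "__main__":
    report = supervise(["gather requirements", "draft code", "write tests"])
    for item, status in report.items():
        print(f"{item} -> {status}")
```

The point is that the checklist, not a hand-coded workflow graph, drives the loop; swap `run_subtask` for an actual agent call and you get the behavior described above.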

But anyway, what's holding back the AI Employee are things like really effective long-term context and memory management, and some level of interface generality like browser or computer use and voice. Computer use makes context management even more difficult. Another aspect is token cost.

But I assume that within the next 9 months or so, more and more people will figure out how to build agents that write their own workflows and manage their own limited context and memory effectively across Zoom meetings, desktops, SSH sessions, etc.

This will likely be a feature set from the model providers themselves. It may even leverage continual learning abilities baked into the model architecture itself. I doubt that is a full year away.


> the AI Employee is coming by the end of 2026 and the fully autonomous AI Company in 2027 sometime

We'll see! I'm skeptical.

> what's holding back the AI Employee are things like really effective long term context and memory management and some level of interface generality like browser or computer use and voice

These are pretty big hurdles. Assuming they're solved by the end of this year is a big assumption to make.



Coincidentally, Pika just launched "AI Selves":

> Pika AI Selves let you create a persistent, portable AI version of you built on your personality, taste, memories, voice, and appearance. They're multi-modal – text, voice/audio, image, video – and live your life across every platform.


Funny you described everything I worked on for this project: https://github.com/rush86999/atom

Cat's out of the bag. Everyone knows the issue, and I bet a lot of people are trying to deliver the same thing.


I think you're forgetting about accountability: who's to blame when AI messes up?

My guess is we'll see a gradual slope rather than a cliff.

I check every day for a new full-duplex model. I was so hyped about PersonaPlex from their demos, but in my test it was oddly dumb and unable to follow instructions.

So I am hoping for something like PersonaPlex but a bit larger.

Has anyone tested MiniCPM-o? How is it at instruction following?


It's actively under development. Do you have a particular use-case in mind?

Outgoing phone calls.

It's interesting that this exact use case is already covered in their ToS. I wonder when the first YouTube-as-storage project came out, and how many there have been over the years.

The idea of exploiting someone else's server to store files is incredibly old.

https://en.wikipedia.org/wiki/GMail_Drive

When Google launched Gmail (2004) with a huge 1GB storage quota, Richard Jones released GmailFS to mount a Gmail account as a regular filesystem.


At least as far back as 2017, when I wrote Schillsaver: https://github.com/Valkryst/Schillsaver

None of us in the original discussion threads knew of it being done before then, IIRC.
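For anyone curious how file-to-video tools like this survive lossy codecs: the usual trick is to blow each bit up into a large black or white cell, so compression artifacts can't flip it. A toy sketch of that encode/decode round trip (in-memory pixel grids only, no video container; the names `encode_frames`/`decode_frames` are illustrative, not Schillsaver's actual API):

```python
# Toy sketch of the bits-as-video-frames idea: each bit becomes an
# oversized black/white cell so it survives lossy compression.
CELL = 4          # each bit becomes a CELL x CELL block of pixels
WIDTH = 32        # bits per row in a frame
HEIGHT = 32       # bit-rows per frame

def bytes_to_bits(data: bytes) -> list[int]:
    # MSB-first bit expansion.
    return [(b >> (7 - i)) & 1 for b in data for i in range(8)]

def bits_to_bytes(bits: list[int]) -> bytes:
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

def encode_frames(data: bytes) -> list[list[list[int]]]:
    """Pack bits into frames: grids of 0/255 pixels, CELL-scaled."""
    bits = bytes_to_bits(data)
    per_frame = WIDTH * HEIGHT
    frames = []
    for start in range(0, len(bits), per_frame):
        chunk = bits[start:start + per_frame]
        chunk += [0] * (per_frame - len(chunk))   # pad the last frame
        frame = []
        for row in range(HEIGHT):
            pixel_row = []
            for col in range(WIDTH):
                value = 255 if chunk[row * WIDTH + col] else 0
                pixel_row.extend([value] * CELL)  # scale horizontally
            for _ in range(CELL):                 # scale vertically
                frame.append(list(pixel_row))
        frames.append(frame)
    return frames

def decode_frames(frames: list[list[list[int]]], n_bytes: int) -> bytes:
    """Sample one pixel per cell, threshold it, and reassemble the bytes."""
    bits = []
    for frame in frames:
        for row in range(HEIGHT):
            for col in range(WIDTH):
                pixel = frame[row * CELL][col * CELL]
                bits.append(1 if pixel > 127 else 0)
    return bits_to_bytes(bits)[:n_bytes]
```

A real tool pipes the frames through a video encoder (and stores the byte length somewhere, e.g. a header frame); the thresholding on decode is what tolerates the codec's lossiness.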


I mean, it is pretty likely they figured out it was an obvious possible misuse before anyone actually started doing it.

I wonder if providers like Hetzner and DigitalOcean etc. will get this someday as well.

DO has had nested virtualization enabled for years.

The ambiguity in the title is going to get a lot of the "skeptics" who have remained in denial about this to assume it's some kind of admission that they haven't been autonomous this whole time.

It's weird how many people there are like that still.

But what they mean is that they are putting the new release into production (without backup drivers). They have been fully autonomous for many years.


Probably to try to assuage people who already saw this story circulating: https://www.autoblog.com/news/waymo-uses-remote-workers-in-t...

Or perhaps those who saw this blog post by Waymo itself:

Fleet response: Lending a helpful hand to Waymo’s autonomously driven vehicles

Much like phone-a-friend, when the Waymo vehicle encounters a particular situation on the road, the autonomous driver can reach out to a human fleet response agent for additional information to contextualize its environment. The Waymo Driver does not rely solely on the inputs it receives from the fleet response agent and it is in control of the vehicle at all times. As the Waymo Driver waits for input from fleet response, and even after receiving it, the Waymo Driver continues using available information to inform its decisions. This is important because, given the dynamic conditions on the road, the environment around the car can change, which either remedies the situation or influences how the Waymo Driver should proceed. In fact, the vast majority of such situations are resolved, without assistance, by the Waymo Driver.

https://waymo.com/blog/2024/05/fleet-response/

In other words, despite the nice spin Waymo puts on it and the wording of the article above, their cars are not fully autonomous and they are not "operating a fully autonomous service". Nor can the Waymo Driver "confidently navigate the 'long tail' of one-in-a-million events" it "regularly encounter[s] when driving millions of miles a week".

They have remote safety drivers. Not fully autonomous. "Fully autonomous" is their aspirational marketing, not their current reality.


>They have remote safety drivers. Not fully autonomous. "Fully autonomous" is their aspirational marketing, not their current reality.

1. They're not "safety drivers" in the sense that most people understand, i.e. someone dedicated to watching the car.

2. What's with the fixation on defining "fully autonomous" to mean 0% human intervention, ever? If a vending machine works 99% of the time, and 1% of the time needs a technician to come get a drink unstuck, does it make sense to get up in arms about how it's not "fully automated"? In all the contexts where people would actually care (e.g. unit economics, safety, customer experience), there's no meaningful difference between 99% autonomous and 100% autonomous.


>> What's with the fixation on defining "fully autonomous" to mean 0% human intervention ever?

Yeah, good point. If Waymo were honest they'd say their system is "autonomous". Fully autonomous implies 100% autonomy. Otherwise, how is it "fully"?

But, hey, don't ask me. Write a paper with a robot that is 99% autonomous, where a human has to take control every once in a while, and see how easily you can get that past any reviewer in robotics or AI.


Come on, you know what the fixation is. Nothing riles up the Tesla fanboys like the clear unambiguous fact that Waymo is doing 1000x better at “full self driving” than Tesla ever has.

Oh dear. You sussed me out, didn't you?

It's like that time with Facebook and MySpace. A while ago now. I was in a student group at uni, and this student, call her Alice, asked me for my Facebook. I said I don't have one, I don't like Facebook, and the conversation continued. Later another student came in, call him Bob. Alice told Bob, "Where were you? We just had a big fight about Facebook versus MySpace." I asked when that happened, since I was there and didn't remember it, and Alice said, "That was me and you. We had a big fight about it. Did you forget?" I said, nonplussed, that I didn't think we had a fight. "But you said you don't like Facebook. So you like MySpace," said Alice. Oh, Alice.

From that I understand that you, like Alice, must be a very astute observer of human behaviour. No hidden motive stays hidden for long, with you, does it? Well done. You got me. I'm a Tesla fanboi. That's what I am, through and through.


They don’t have remote drivers. Your own link says that.

> The Waymo Driver does not rely solely on the inputs it receives from the fleet response agent and it is in control of the vehicle at all times.

> The Waymo Driver evaluates the input from fleet response and independently remains in control of driving.


Pay close attention to the wording: "The Waymo Driver ... remains in control of driving". That means it applies the controls needed to go from point A to point B on its own. However, it does not choose point A and point B on its own: a human chooses them. That's autonomous path planning, but not autonomous navigation, and certainly not "fully autonomous" anything.

Waymo prevaricates about the "influence" the human operator has on the path taken by the Waymo Driver [1], but it is clear there are situations where the Waymo Driver cannot choose point A and point B on its own, at least not safely; otherwise Waymo would not be paying humans to do it. They'd let the system do it on its own. It can't. It's not "fully autonomous".

We can play with words and accept whatever terminological obfuscation Waymo wants to impose in order to pimp its wares, or we can accept that current systems have limitations, and choose to understand the real SOTA over marketing.

_____________

[1] Fleet response can influence the Waymo Driver's path, whether indirectly through indicating lane closures, explicitly requesting the AV use a particular lane, or, in the most complex scenarios, explicitly proposing a path for the vehicle to consider (ibid.).


> one-in-a-million events

So we just made driving a million times more efficient for human labor input


The simple reality with remote controllers is that there is a lot of extra network latency, which pretty much makes any real-time controlling of the car impractical. Most of this stuff happens over mobile networks too, so there might be packet loss, low-resolution video, maybe the video freezing for a second or two occasionally, etc.

Even if human controllers could actually pay attention 100% of the time, they'd struggle to respond in time to a lot of dangerous situations. Most accidents happen when one of the cars (or their drivers) fails to react in time to what is effectively a split-second decision.

Autonomous driving (with or without a controller) means a computer takes essentially all of those decisions for the simple reason that any human controller would probably be too late way too often.

Once you accept that simple logical reality, the role of that controller becomes more clear: they are there to step in and provide instructions to the car when it encounters some challenging situation and slows down, pulls over, or stops in a safe place. This probably doesn't involve any joysticks or steering wheels.

Controller responses are not real-time critical. They can't be; it would fail too often. Also, most controllers probably need to monitor more than one car, which only makes the problem worse. And they might have to juggle two stuck cars at once.

Mostly autonomous cars are pretty good at object detection and not crashing into stuff (all the real-time stuff). It's object classification and interpreting complex situations where cars get stuck or might sometimes still do dangerous/illegal/suboptimal things. Getting stuck or slowing down is fine; the human controller can fix that. Doing the wrong thing is more problematic.


I'm a skeptic, because Self Driving is sold as a digital chauffeur.

Not 99% of a chauffeur, 100%. (or 99.99999%)

The roll out of this is clearly limited by the number of remote employees that are filling in the 1%.


> They have been fully autonomous for many years.

In this very thread, plenty of people are saying that what Tesla is doing now in Austin is NOT fully autonomous, but you assert that Waymo, doing the same for many years, is?

Waymo has had remote operators who could take over when needed for a long time.


They cannot take over. They can give advice in unusual circumstances.

I don't have to argue about Tesla or what other people are saying about Tesla.


These days they can’t take over, but they sure could years back.

I have my own MIT-licensed framework/UI: https://github.com/runvnc/mindroot, with Nano Banana via runvnc/googleimageedit.

This is a good idea. Do you use something like browser-use or Fara-7b behind the scenes? Or maybe you don't want to give up your secrets (which is fine if that's the case).


Thanks for asking! We developed our own browser agent that uses a mix of custom and frontier models for different parts of the system.

