
More great innovation from Google. OpenAI have two major problems.

The first is Google's vertically integrated chip pipeline and deep supply chain and operational knowledge when it comes to creating AI chips and putting them into production. They have a massive cost advantage at every step. This translates into more free services, cheaper paid services, more capabilities due to more affordable compute, and far more growth.

Second problem is data starvation and the unfair advantage that social media has when it comes to a source of continually refreshed knowledge. Now that the foundational model providers have churned through the common crawl and are competing to consume things like video and whatever is left, new data is becoming increasingly valuable as a differentiator, and more importantly, as a provider of sustained value for years to come.

SamA signaled both of these problems when he made noises about building a fab a while back, and again more recently with noises about launching a social media platform off OpenAI. The smart money among his investors know these issues to be fundamental in deciding if OAI will succeed or not, and are asking the hard questions.

If the only answer for both is "we'll build it from scratch", OpenAI is in very big trouble. And it seems that that is the best answer that SamA can come up with. I continue to believe that OpenAI will be the Netscape of the AI revolution.

The win is Google's for the taking, if they can get out of their own way.



Nobody has really talked about what I think is an advantage just as powerful as the custom chips: Google Books. They already won a landmark fair use lawsuit against book publishers, digitized more books than anyone on earth, and used their Captcha service to crowdsource its OCR. They've got the best* legal cover and all of the best sources of human knowledge already there. Then YouTube for video.

The chips of course push them over the top. I don't know how much Deep Research is costing them but it's by far the best experience with AI I've had so far with a generous 20/day rate limit. At this point I must be using up at least 5-10 compute hours a day. Until about a week ago I had almost completely written off Google.

* For what it's worth, I don't know. IANAL


The amount of text in books is surprisingly finite. My best estimate was that there are ~10¹³ tokens available in all books (https://dynomight.net/scaling/#scaling-data), which is less than frontier models are already being trained on. On the other hand, book tokens are probably much "better" than random internet tokens. Wikipedia for example seems to get much higher weight than other sources, and it's only ~3×10¹⁰ tokens.


We need more books! On it…


> And further, by these, my son, be admonished: of making many books there is no end; and much study is a weariness of the flesh.

Ecclesiastes 12:12 ;)


opens up his favorite chat


LibGen already exists, and all the top LLM publishers use it. I don't know if Google's own book index provides a big technical or legal advantage.


I'd be very surprised if the Google books index wasn't much bigger and more diverse than libgen.


Anna's Archive is at 43M Books and 98M Papers [1]. The book total is nearly double what Google has.

Google's scanning project basically stalled after the legal battle. It's a very fascinating read [2].

[1] https://annas-archive.org/

[2] https://web.archive.org/web/20170719004247/https://www.theat...


Something that is not specifically called out but is also super relevant is actually the transcription of YouTube videos.

Every video is machine-transcribed and stored, and for larger videos the author will often transcribe it themselves.

This is something they already have; unlike a competitor, they don't need to do any more "work" to get it.


I would think the biggest advantage is YouTube. There's a lot of modern content for analysis that's uncontaminated by LLMs.


Google has the data and has the hardware, not to mention software and infrastructure talent. Once this Bismarck turns around and it looks like it is, who can parry it for real? They have internet.zip and all the previous versions as well, they have youtube, email, search, books, traffic, maps and business on it, phones and habits around it, even the OG social network, the usenet. It's a sleeping giant starting to wake up and it's already causing commotion, let's see what it does when it drinks morning coffee.


Agreed. One of Google's big advantages is the data access and integrations. They are also positioned really well for the "AI as entertainment" sector with YouTube, which will be huge (imo). They also have the adtech knowledge and, well, injecting ads into AI is an obvious play. As is harvesting AI chat data.

Meta and Google are the long term players to watch as Meta also has similar access (Insta, FB, WhatsApp).


On-demand GenAI could definitely change the meaning of "You" in "Youtube".


They have the Excel spreadsheets of all startups and businesses of the world (well 50/50 with Microsoft).

And Atlassian has all the project data.


More like 5/95 with Microsoft - and that's being generous, I wouldn't be surprised if it was 1/99. It's basically just hip tech companies and a couple of Fortune 500s that use Google Docs. And even their finance departments often use Excel. HN keeps underestimating how the whole physical world runs on Excel.


I still can't understand how google missed on github, especially since they were in the same space before with google code. I do understand how they couldn't make a github though.


Another advantage that Google has is the deep integration of Gemini into Google Office products and Gmail. I was part of a pilot group and got to use a pre-release version and it's really powerful and not something that will be easy for OpenAI to match.


Agreed. Once they dial in the training for sheets it's going to be incredible. I'm already using notebooklm to upload finance PDFs, then having it generate tabular data and copypasta into sheets, but it's a garage solution compared to just telling it to create or update a sheet with parsed data from other sheets, PDFs, docs, etc.

And as far as gmail goes, I periodically try to ask it to unsubscribe from everything marketing related, and not from my own company, but it's not even close to being there. I think there will continue to be a gap in the market for more aggressive email integration with AI, given how useless email has become. I know A16Z has invested in a startup working on this. I doubt Gmail will integrate as deep as is possible, so the opportunity will remain.


I frankly am in doubt of future office products. In the last month I have ditched two separate excel productivity templates in favor of bespoke wrappers on sqlite databases, written by Claude and Gemini. Easier to use and probably 10x as fast.

You don't need a 50 function swiss army knife when your pocket can just generate the exact tool you need.
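For anyone wondering what a "bespoke wrapper on sqlite" looks like in practice, here is a minimal sketch, assuming a made-up expense-tracking use case. The `ExpenseLog` class, its table layout, and its method names are all hypothetical and just for illustration; the real tools described above could look quite different. It only uses Python's standard `sqlite3` module.

```python
import sqlite3


class ExpenseLog:
    """Hypothetical single-purpose wrapper around one SQLite table."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the example self-contained; pass a file path to persist.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS expenses ("
            "id INTEGER PRIMARY KEY, "
            "category TEXT NOT NULL, "
            "amount_cents INTEGER NOT NULL)"
        )

    def add(self, category, amount_cents):
        # The connection as a context manager commits on success, rolls back on error.
        with self.db:
            self.db.execute(
                "INSERT INTO expenses (category, amount_cents) VALUES (?, ?)",
                (category, amount_cents),
            )

    def totals(self):
        # Sum per category; the kind of thing a spreadsheet pivot table would do.
        rows = self.db.execute(
            "SELECT category, SUM(amount_cents) FROM expenses "
            "GROUP BY category ORDER BY category"
        )
        return dict(rows.fetchall())
```

Usage would be something like `log = ExpenseLog("expenses.db")`, then `log.add("food", 4500)` and `log.totals()`. The whole "app" is one table and two methods, which is roughly the point being made about generating the exact tool you need.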


You say deep integration, yet there is still no way to send a Gemini Canvas to Docs without a lot of tedious copy-pasting and formatting because Docs still doesn’t actually support markdown. Gemini in Google Office in general has been a massive disappointment for all but the most simplistic of writing tasks.

They can have the most advanced infrastructure in the world, but it doesn’t mean much if Google continues its infamous floundering approach to product. But hey, 2.5 pro with Cline is pretty nice.


Maybe I'm misunderstanding, but there is literally a Share button in Canvas right below each response with the option to export to Docs. Within Docs, you can also click on the Gemini "star" at the upper right to get a prompt and then also export into the open document. Note that this is with the "experimental" Gemini 2.5 Pro.


Docs supports markdown in comments, where it's the only way to get formatting.

I love Google's product dysfunction sometimes :/


I have access to this now and I want it to work so bad and it's just proper shit. Absolute rubbish.

They really, truly need to fix this integration. Gemini in Google Docs is barely acceptable, it doesn't work at all (for me) in Gmail, and I've not yet had it do anything other than error in Google Sheets.


If the battle was between Altman and Pichai I'd have my doubts.

But the battle is between Altman and Hassabis.

I recall some advice on investment from Buffett regarding how he invests in the management team.


Sorry but my eyes rolled to the back of my head with this one. This is between two teams with tons of smart contributors, but the difference is one is more flexible and able to take risks vs the other that has many times more researchers and the world's best and most mature infrastructure/tooling. It's not a CEO vs CEO battle.


I think it requires a nuanced take but allow me to provide some counter-examples.

The first is CEO pay rates. Another is the highest paid public employees (which tend to be coaches at state schools). This is evidence that the market highly values managers.

Another is systemic failures within enterprises. When Boeing had a few very public plane crashes, a certain narrative suggested that the transition from highly capable engineer managers to financial focus managers contributed to the problem. A similar narrative has been used to explain the decline of Intel.

Consider the return of Steve Jobs to Apple. Or the turn around at Microsoft with Nadella.

All of these are complex cases that don't submit to an easy analysis. Success and failure are definitely multi-factor and rarely can be traced to a single definitive cause.

Perhaps another way to look at it would be: what percentage of the success of highly complex organizations can be attributed to management? To what degree can poor management decisions contribute to the failure of an otherwise capable organization?

How much you choose to weight those factors is entirely up to you.

edit: I was also thinking about the way we think about the advantage of exceptional generals/admirals in military analysis. Or the effect a president can have on the direction of a country.


Could you please expand, on both your points?


It is more gut feel than a rational or carefully reasoned argument.

I think Pichai has been an exceptional revenue maximizer but he lacks vision. I think he is probably capable of squeezing tremendous revenue out of AI once it has been achieved.

I like Hassabis in a "good vibe" way when I hear him speak. He reminds me of engineers that I have worked with personally and have gained my respect. He feels less like a product focused leader and more of a research focused leader (AlphaZero/AlphaFold) which I think will be critical to continue the advances necessary to push the envelope. I like his focus on games and his background in RL.

Google's war chest of Ad money gives Hassabis the flexibility to invest in non-revenue generating directions in a way that Altman is unlikely to be able to do. Altman made a decision to pivot the company towards product which led to the exodus of early research talent.


> Altman made a decision to pivot the company towards product which led to the exodus of early research talent.

Who was going to fund the research though?


Fair point, and a good reminder not to pass judgement on the actions of others. It is totally possible that Altman made his own prediction of the future and theorized that the only hope he had of competing with the existing big tech companies to realistically achieve an AI for the masses was to show investors a path to profitability.

I should also give Altman a bit more due in that I find his description of a world augmented by powerful AI to be more inspiring than any similar vision I've heard from Pichai.

But I'm not trying to guess their intentions, I am just stating the situation as I see it. And that situation is one where whatever forces have caused it, OpenAI is clearly investing very heavily in product (e.g. windsurf acquisition, even suggesting building a social network). And that shift in focus seems highly correlated with a loss of significant research talent (as well as a healthy dose of boardroom drama).


Not sure why their comment was downvoted. Google the names. Hassabis runs DeepMind at Google which makes Gemini and he's quite brilliant and has an unbelievable track record. Buffett investing in teams points out that there are smart people out there that think good leadership is a good predictor of future success.


It may not be relevant to everyone, but it is worth noting that his contribution to AlphaFold won Hassabis a Nobel prize in chemistry.


Zoogeny got downvoted? I did not do that. His comments deserved more details anyway (at the level of those kindly provided).

> Google the names

Was that a wink about the submission (a milestone from Google)? Read Zoogeny's delightful reply and see whether a search engine result can compare with it (not to mention that I asked for Zoogeny's insight, not for trivia). And as a listener to Buffett and Munger, I can surely say that they rarely indulge in tautologies.


I wouldn't worry about downvotes, it isn't possible on HN to downvote direct replies to your message (unlike reddit), so you cannot be accused of downvoting me unless you did so using an alt.

Some people see tech like they see sports teams and they vote for their tribe without considering any other reason. I'm not shy stating my opinion even when it may invite these kinds of responses.

I do think it is important for people to "do their own research" and not take one man's opinion as fact. I recommend people watch a few videos of Hassabis, there are many, and judge his character and intelligence for themselves. They may find they don't vibe with him and genuinely prefer Altman.


I haven’t heard this much positive sentiment about Google in a while. Making something freely available really turns public sentiment around.


I don't know man, for months now people keep telling me on HN how "Google is winning", yet no normal person I ever asked knows what the fuck "Gemini" is. I don't know what they are winning, it might be internet points for all I know.

Actually, some of the people polled recalled the Google AI efforts by their expert system recommending glue on pizza and smoking in pregnancy. It's a big joke.


Try uploading a bunch of PDF bank statements to notebooklm and ask it questions. Or the results of blood work. It's jaw dropping. e.g. uploaded 7 brokerage account statements as PDFs in a mess of formats and asked it to generate table summary data which it nailed, and then asked it to generate actual trades to go from current position to a new position in shortest path, and it nailed that too.

Biggest issue we have when using notebooklm is a lack of ambition when it comes to the questions we're asking. And the pro version supports up to 300 documents.

Hell, we uploaded the entire Euro Cyber Resilience Act and asked the same questions we were going to ask our big name legal firm, and it nailed every one.

But you actually make a fair point, which I'm seeing too and I find quite exciting. And it's that even among my early adopter and technology minded friends, adoption of the most powerful AI tools is very low. e.g. many of them don't even know that notebookLM exists. My interpretation on this is that it's VERY early days, which is suuuuuper exciting for us builders and innovators here on HN.


That was ages ago.

Their new models excel at many things. Image editing, parsing PDFs, and coding are what I use them for. The ones I use (Gemini 2.5 pro, and flash experimental with image generation) are significantly cheaper than the closest competing models.

Highly recommend testing against openai and anthropic models - you'll likely be pleasantly surprised.


While there are some first-party B2C applications like chat front-ends built using LLMs, once mature, the end game is almost certainly that these are going to be B2B products integrated into other things. The future here goes a lot further than ChatGPT.


another advantage is people want the Google bot to crawl their pages, unlike most AI companies


Reddit was an interesting case here. They knew that they had particularly good AI training data, and they were able to hold it hostage from the Google crawler, which was an awfully high risk play given how important Google search results are to Reddit ads, but they likely knew that Reddit search results were also really important to Google. I would love to be able to watch those negotiations on each side; what a crazy high stakes negotiation that must've been.


Particularly good training data?

You can't mean the bottom-of-the-barrel dross that people post on Reddit, so not sure what data you are referring to? Click-stream?


Say what you will, but there are a lot of good answers on Reddit to real questions people have. There's a whole thing where people say "oh Google search results are bad, but if you append the word 'REDDIT' to your search, you'll get the right answer." You can see that most of these agents rely pretty heavily on stuff they find on Reddit.

Of course, that's also a big reason why Google search results suggest putting glue on pizza.


This is an underrated comment. Yes it's a big advantage and probably a measurable pain point for Anthropic and OpenAI. In fact you could just do a 1% survey of robots.txt out there and get a reasonable picture. Maybe a fun project for an HN'er.
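A minimal sketch of what that survey could look like, assuming you've already downloaded a sample of robots.txt files. The site names and file contents below are made up for illustration, and `allowed` / `survey` are hypothetical helper names. It uses Python's standard `urllib.robotparser` and only checks whether each crawler may fetch the root path, which is a crude proxy for "blocked".

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, path="/"):
    """Return True if `agent` may fetch `path` under this robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


def survey(robots_by_site, agents):
    """Fraction of sampled sites that let each crawler fetch the root path."""
    total = len(robots_by_site)
    return {
        agent: sum(allowed(txt, agent) for txt in robots_by_site.values()) / total
        for agent in agents
    }


# Made-up sample; a real survey would fetch these files from a random 1% of sites.
SAMPLE = {
    "a.example": "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n",
    "b.example": "User-agent: *\nDisallow: /\n\nUser-agent: Googlebot\nAllow: /\n",
    "c.example": "",  # an empty robots.txt allows everyone
}

if __name__ == "__main__":
    for agent, share in survey(SAMPLE, ["Googlebot", "GPTBot"]).items():
        print(f"{agent}: allowed on {share:.0%} of sampled sites")
```

Comparing the allowed share for Googlebot against the AI crawlers across a large sample would give a rough picture of the gap. It would miss sites that block bots at the network level rather than in robots.txt, so treat the result as a lower bound on blocking.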


This is right on. I work for a company with somewhat of a data moat and AI aspirations. We spend a lot of time blocking everyone's bots except for Google. We have people whose entire job is to make it faster for Google to access our data. We exist because Google accesses our data. We can't not let them have it.


Excellent point. If they can figure out how to either remunerate or drive traffic to third parties in conjunction with this, it would be huge.


> The smart money among his investors know these issues to be fundamental in deciding if OAI will succeed or not, and are asking the hard questions.

OpenAI has already succeeded.

If it ends up being a $100B company instead of a $10T company, that is success. By a very large margin.

It's hard to imagine a world in which OpenAI just goes bankrupt and ends up being worth nothing.


I can, and I would say it's a likely scenario, say 30%. If they don't have a significant edge over their competitors in the capabilities of their models, what's left? A money losing web app, and some API services that I'm sure aren't very profitable either. They can't compete with Google, Grok, Meta, MS, Amazon... They just can't.

They can end up being the Altavista of this era.


It goes bankrupt when the cost of running the business outweighs the earnings in the long run.


> If the only answer for both is "we'll build it from scratch", OpenAI is in very big trouble

They could buy Google+ code from Google and resurrect it with OpenAI branding. Alternately they could partner with Bluesky


I don't think the issue is solving the technical implementation of a new social media platform. The issue is whether a new social media platform from OpenAI will deliver the kind of value that existing platforms deliver. If they promise investors that they'll get TikTok/Meta/YouTube levels of content+interaction (and all the data that comes with it), but deliver Mastodon levels, then they are in trouble.


Except that they train their model even when you pay. So yeah.. I'd rather not use their "evil"



Source?


It's right there in the comment.



