
Yeah, I've been disappointed in GPT-5 for OCR - Gemini 2.5 is much better on that front: https://simonwillison.net/2025/Aug/29/the-perils-of-vibe-cod...


For images in general, nothing comes close to Gemini 2.5 at understanding scene composition. The models perform segmentation, so you can even ask for things like masks or bounding boxes of arbitrary objects.
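
If you do ask for bounding boxes, here is a rough sketch of turning them into pixel coordinates. It assumes the model returns each box as `box_2d` in `[ymin, xmin, ymax, xmax]` order on a 0-1000 scale, which is my understanding of the Gemini docs and worth verifying. The helper name `to_pixel_box` and the sample response are made up for illustration.

```python
import json


def to_pixel_box(box_2d, width, height):
    """Convert [ymin, xmin, ymax, xmax] on a 0-1000 scale (assumed format)
    to (left, top, right, bottom) in pixels."""
    ymin, xmin, ymax, xmax = box_2d
    return (
        round(xmin * width / 1000),
        round(ymin * height / 1000),
        round(xmax * width / 1000),
        round(ymax * height / 1000),
    )


# Hypothetical model output; a real response may need a markdown fence
# stripped before it parses as JSON.
sample = '[{"label": "cat", "box_2d": [100, 200, 500, 600]}]'

# Image size (1280x720) is an assumed example value.
boxes = [
    (item["label"], to_pixel_box(item["box_2d"], 1280, 720))
    for item in json.loads(sample)
]
print(boxes)
```

The (left, top, right, bottom) order is chosen because that is what most image libraries expect for cropping or drawing rectangles.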



