Discussion about this post

Performative Bafflement

Just wanted to say thanks for writing this - Google recently opened 2.5 to the web interface, so now you don't need API calls, and I've been using it thanks to this post.

They *cooked.* It is significantly smarter, more capable, and less error-prone than o1 Pro or Claude 3.7, in my opinion. It goes deeper in detailed ways, and I haven't run across Gell-Mann Amnesia once in any subject I know deeply. It's been able to go deeper than my knowledge in those areas too, which is a first - and when I double-checked, it was right and wasn't hallucinating.

Another advantage - I like to "adversarially collaborate" and test my ideas and arguments, and o1 Pro and Claude 3.7 and all the other models really suck for this - they immediately roll over at the tiniest pushback.

But 2.5 doesn't do this. It stakes out a consistent position and maintains it over time, while staying amenable to factual corrections or rebuttals (but not vibes-based ones!). It's so much smarter than every other model right now that I've made it my daily driver.

And all thanks to this post! I don't think I would have tried it if I hadn't seen your post and Zvi's talking about it. Anyone else reading this, if you haven't tried it, it's available at the Gemini web interface for free - you might be pleasantly surprised, like I was.

Polici

Dude. It's 'cause Gemini 2.5 Flash is like a drunk driver...
