by meetpateltech on 9/25/25, 5:20 PM with 279 comments
by davidmckayv on 9/25/25, 6:28 PM
I've been running into it consistently: responses that just stop mid-sentence, not because of token limits or content filters, but because of what appears to be a bug in how the model signals completion. It's been documented on their GitHub and dev forums for months as a P2 issue.
The frustrating part is that when you compare a complete Gemini response to Claude or GPT-4, the quality is often quite good. But reliability matters more than peak performance. I'd rather work with a model that consistently delivers complete (if slightly less brilliant) responses than one that gives me half-thoughts I have to constantly prompt to continue.
It's a shame because Google clearly has the underlying tech. But until they fix these basic conversation flow issues, Gemini will keep feeling broken compared to the competition, regardless of how it performs on benchmarks.
https://github.com/googleapis/js-genai/issues/707
https://discuss.ai.google.dev/t/gemini-2-5-pro-incomplete-re...
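A minimal sketch of how one might flag this in client code, assuming the google-genai Python SDK (the linked issue is against js-genai, but the response shape is analogous); the model string and prompt here are just placeholders:

  # Sketch only: flag replies whose finish reason isn't a clean STOP.
  # Assumes GEMINI_API_KEY (or GOOGLE_API_KEY) is set in the environment.
  from google import genai
  from google.genai import types

  client = genai.Client()
  response = client.models.generate_content(
      model="gemini-2.5-flash",
      contents="Summarize the plot of Hamlet in three sentences.",
  )

  candidate = response.candidates[0]
  # A healthy completion ends with STOP; MAX_TOKENS, SAFETY, or a missing
  # value suggests the reply was cut off or never properly terminated.
  if candidate.finish_reason != types.FinishReason.STOP:
      print(f"Possibly truncated: finish_reason={candidate.finish_reason}")
  else:
      print(response.text)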
by simonw on 9/25/25, 6:52 PM
export LLM_GEMINI_KEY='...'
uvx --isolated --with llm-gemini llm -m gemini-flash-lite-latest 'An epic poem about frogs at war with ducks'
Release notes: https://github.com/simonw/llm-gemini/releases/tag/0.26
Pelicans: https://github.com/simonw/llm-gemini/issues/104#issuecomment...
by herpderperator on 9/25/25, 10:22 PM
by ashwindharne on 9/25/25, 6:06 PM
It's a delicate balance, because these Gemini models sometimes feel downright lobotomized compared to Claude or GPT-5.
by newfocogi on 9/25/25, 5:46 PM
Both models show improved intelligence on the Artificial Analysis index with lower end-to-end response time, plus 24% to 50% better output token efficiency (resulting in lower cost).
Gemini 2.5 Flash-Lite improvements include better instruction following, reduced verbosity, stronger multimodal & translation capabilities. Gemini 2.5 Flash improvements include better agentic tool use and more token-efficient reasoning.
Model strings: gemini-2.5-flash-lite-preview-09-2025 and gemini-2.5-flash-preview-09-2025
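A rough sketch of how one might eyeball the claimed token savings against the stable alias, assuming the google-genai Python SDK; the prompt is invented and the comparison is purely illustrative:

  # Compare output token counts between the stable alias and the new preview.
  # Assumes GEMINI_API_KEY (or GOOGLE_API_KEY) is set in the environment.
  from google import genai

  client = genai.Client()
  prompt = "Explain the difference between TCP and UDP in one short paragraph."

  for model in ("gemini-2.5-flash", "gemini-2.5-flash-preview-09-2025"):
      response = client.models.generate_content(model=model, contents=prompt)
      usage = response.usage_metadata
      print(f"{model}: {usage.candidates_token_count} output tokens "
            f"({usage.total_token_count} total)")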
by zitterbewegung on 9/25/25, 6:12 PM
by aeon_ai on 9/25/25, 6:04 PM
Something that distinguishes between a completely new pre-training process/architecture and standard RLHF cycles/optimizations.
by minimaxir on 9/25/25, 6:07 PM
by Liwink on 9/25/25, 5:52 PM
From OpenRouter last week:
* xAI: Grok Code Fast 1: 1.15T
* Anthropic: Claude Sonnet 4: 586B
* Google: Gemini 2.5 Flash: 325B
* Sonoma Sky Alpha: 227B
* Google: Gemini 2.0 Flash: 187B
* DeepSeek: DeepSeek V3.1 (free): 180B
* xAI: Grok 4 Fast (free): 158B
* OpenAI: GPT-4.1 Mini: 157B
* DeepSeek: DeepSeek V3 0324: 142B
by Hobadee on 9/25/25, 11:27 PM
It is HORRENDOUS when compared to other models.
I hear a bunch of other people talking about how great Gemini is, but I've never seen it.
The responses are usually either incorrect, way too long (essays when I wanted summaries), or just... not... good. I will ask the exact same question to both Gemini and ChatGPT (free), and GPT will give a great answer while the Gemini answer is trash.
Am I missing something?
by fzimmermann89 on 9/25/25, 8:21 PM
by stephen_cagle on 9/25/25, 10:15 PM
by boomer_joe on 9/26/25, 2:26 AM
Would like to know whether Flash exhibits these issues as well.
by OGEnthusiast on 9/25/25, 5:47 PM
by ikgn on 9/26/25, 9:09 AM
by ImPrajyoth on 9/25/25, 6:03 PM
by tardyp on 9/25/25, 6:02 PM
by phartenfeller on 9/25/25, 9:31 PM
by dcchambers on 9/25/25, 6:58 PM
This industry desperately needs a Steve Jobs to bring some sanity to the marketing.
by DoctorOetker on 9/25/25, 11:30 PM
I would like to try a small computer->human "upload" experiment; basic multilingual understanding without pronunciation knowledge would be very sad.
I intend to make a sort of computer reflexive game where I want to compare different upload strategies (with/without analog or classic error-correcting codes, empirical spaced-repetition constants, an ML predictor of which parameters I'm forgetting / losing resolution on).
by rasz on 9/26/25, 1:53 AM
It kept finding those fatal flaws and starting to explain them, only to then slowly finish with "oh yes, this works as intended".
by ahmedfromtunis on 9/25/25, 6:46 PM
by artur_makly on 9/25/25, 10:46 PM
Pricing per 1M tokens (input / output):
Gemini 2.5 Flash Preview: $0.30 / $2.50
Grok 4 Fast: $0.20 / $0.50
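A toy back-of-the-envelope comparison using those per-1M-token rates; the 200k-input / 50k-output workload is made up purely for illustration:

  # Cost = input_tokens/1M * input_price + output_tokens/1M * output_price
  PRICES = {
      "gemini-2.5-flash-preview": (0.30, 2.50),  # $/1M input, $/1M output
      "grok-4-fast": (0.20, 0.50),
  }
  input_tokens, output_tokens = 200_000, 50_000

  for model, (in_price, out_price) in PRICES.items():
      cost = input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price
      print(f"{model}: ${cost:.3f}")  # ~$0.185 vs ~$0.065 for this workload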
by whinvik on 9/26/25, 2:14 PM
However, it's hampered by the max output token limit: Gemini is at 65K while GPT-5 mini is at 128K. Both have similar costs as well, so apart from the 1M context limit, GPT-5 mini is better in every way.
by rafaelero on 9/26/25, 5:18 AM
by strangescript on 9/26/25, 3:22 AM
by dgemm on 9/26/25, 4:26 PM
by grej on 9/26/25, 3:08 AM
by pier25 on 9/25/25, 9:16 PM
by Moosdijk on 9/26/25, 8:45 AM
The way I have come to perceive AI is that it's better at reassuring/reaffirming people's beliefs and ideas than at being an actual source of truth.
That would not be an issue if it was actually marketed as such, but seeing the "guided learning" function fail time and again makes me think we should be a lot more critical of what we're being told by tech enthusiasts/companies about AI.
by ChildOfChaos on 9/25/25, 6:06 PM
by rldjbpin on 9/26/25, 1:28 PM
At least for us, the bottleneck is the amount of retries/waiting needed to max out how many requests we can make in parallel.
[1] https://cloud.google.com/vertex-ai/generative-ai/docs/dynami...
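A minimal sketch of the retry/backoff loop this implies, assuming the google-genai Python SDK; the error attribute and the backoff constants are my assumptions, not anything from the linked docs:

  # Back off and retry on quota (HTTP 429) errors before giving up.
  import random
  import time

  from google import genai
  from google.genai import errors

  client = genai.Client()  # assumes an API key in the environment

  def generate_with_retry(prompt: str, model: str = "gemini-2.5-flash",
                          max_attempts: int = 5) -> str:
      for attempt in range(max_attempts):
          try:
              return client.models.generate_content(model=model, contents=prompt).text
          except errors.APIError as exc:
              # Assumption: quota exhaustion surfaces with code 429.
              if getattr(exc, "code", None) != 429 or attempt == max_attempts - 1:
                  raise
              time.sleep(2 ** attempt + random.random())  # exponential backoff + jitter
      raise RuntimeError("unreachable")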
by modeless on 9/25/25, 6:56 PM
by maxdo on 9/25/25, 11:52 PM
by user3939382 on 9/26/25, 1:26 AM
by guybedo on 9/26/25, 12:21 AM
Here's a summary of this discussion with the new version: https://extraakt.com/extraakts/the-great-llm-versioning-deba...
by Fiahil on 9/25/25, 6:10 PM
by brap on 9/25/25, 6:05 PM
Flash is super fast, gets straight to the point.
Pro takes ages to even respond, then starts yapping endlessly, usually confuses itself in the process and ends up with a wrong answer.
by scosman on 9/25/25, 5:45 PM
Anthropic learned this lesson. Google, DeepSeek, Kimi, OpenAI and others keep repeating it. This feels like Gemini_2.5_final_FINAL_FINAL_v2.
by lysecret on 9/27/25, 2:43 PM
by sreekanth850 on 9/26/25, 3:17 AM
by simianwords on 9/25/25, 7:28 PM
by agluszak on 9/25/25, 8:30 PM
by thrownawayohman on 9/25/25, 9:02 PM
by jama211 on 9/25/25, 7:06 PM
by bogtog on 9/25/25, 6:38 PM
Typo in the first sentence? "... improving the efficiency." Gemini 2.5 Pro says this is perfectly good phrasing, whereas ChatGPT and Claude recognize that it's awkward or just incorrect. Hmm...