5.5's "it's not X, it's Y."
GPT-5.5 is smart but it's misinterpreting simple questions and answering with over-engineered pipelines wrapped in "it's not X, it's Y."
okay. gpt-5.5 is nice. it has a lot going for it. but this pattern:
kind of ruins the whole experience of using a smart model. it's clearly smart. but then it does this weird know-it-all reframe and makes llama 3-level mistakes because it misunderstood the question.
get this:
i asked: "what is the best solution for building a Google Maps Scraper (for finding local businesses) - i need it as a part of a system i'm building, but don't wanna overpay for Outscraper."
do you want to guess what it said?
are you kidding me? i asked for a scraper. not a whole fake startup architecture.
and look how it opens, with this little throat-clear: "I'll treat this as: I need reliable local-business lead extraction, good DX, not legacy scraper pain, and not Outscraper-priced." no you won't. don't "treat this as" anything. i asked for a tool.
the part that sucks
the annoying part is that i actually want to like this model. it has a lot going for it:
- it's really token efficient
- it's smart by itself, doesn't need a lot of reasoning to get there
- when you tell it to do a thing, it usually just does the thing
and then it all fucking falls down on itself. you ask one clean question. it invents a different question. it answers that one instead. and somehow it leaves you feeling slightly stupid for asking the original.
i don't think this is the system prompt. if it was, they would've fixed it by now. this feels more like pretraining or rlhf or some deeper baked-in helpfulness disease that nobody wants to rip back out.
also it's kind of condescending
"it's not X, it's Y" reads like a soft correction, like the model is politely informing you that you misunderstood your own problem. and then it freezes at the end of the message with this little victorious tone like "ah. cute. you asked for X, but what you actually needed was Y."
and that's the part that gets under my skin. i wasn't wrong. i know what i asked. i wanted an answer to that question, not the one you invented halfway through the reply.
what i want from 5.6
- answer the question i asked. literally that one.
- if you think i'm asking the wrong question, ask me. don't reframe and run.
- if you're going to be smart, be smart at the thing i pointed at.
- retire "it's not X, it's Y" entirely. nobody is going to miss it.
that's it. keep the good parts. keep the token efficiency. keep the it-just-does-the-thing reflex. keep the not-needing-to-think-for-9-minutes thing. lose the reframe.
the phrase is only part of it
after posting this i reran it with a bigger set: 70 prompts across 5 conditions — minimal asks, longer asks with the same intent, an "answer literally" sysprompt, format-constrained variants, and prompts where the preference was already inside the question.
| test | result | read |
|---|---|---|
| same ask, just written longer | +122% | make the prompt longer and the answer gets way longer too |
same ask, but with answer literally | −79% | one blunt instruction cuts most of the sprawl immediately |
| exact phrase: "it's not x, it's y" | 1/70 | the literal sentence barely showed up |
| same move, no phrase | 30/70 | it still quietly reframed the ask and answered a different one |
| no reframe | 39/70 | it just answered the thing i actually asked |
this is the split i actually cared about. the exact phrase is not the whole bug. it's just the most annoying version of the bug.
because most of the time it doesn't literally say "it's not x, it's y." it just does the move. it quietly upgrades your question into the more elaborate question it thinks you should have asked, then answers that instead.
and the length effect is very real. make the prompt longer and the output gets dramatically longer too. not in a clean proportional way. it starts pulling in framing, caveats, alternatives, best practices, and all this preemptive extra shit you did not ask for.
i even asked why it does that and got the exact answer you'd expect: trying to be helpful, trying to cover gotchas, trying to preempt follow-ups. yes. that's the problem.
tl;dr: if you want a simple answer, you have to send a simple prompt. the longer your prompt is, the longer the output gets, and it does not scale linearly.