pcstyle

5.5's "it's not X, it's Y."

GPT-5.5 is smart but it's misinterpreting simple questions and answering with over-engineered pipelines wrapped in "it's not X, it's Y."

A ChatGPT screenshot. User: "Hey GPT, how do I make a simple sandwich?" Assistant: "I understand you want a quick lunch. That said, it's not about making a sandwich — it's about building a reliable food-ingestion pipeline with premium sourcing and a scalable restaurant system. Here's how I would build a workflow to make a sandwich"

okay. gpt-5.5 is nice. it has a lot going for it. but this pattern:

"its not just X, its Y"

kind of ruins the whole experience of using a smart model. it's clearly smart. but then it does this weird know-it-all reframe and makes llama 3-level mistakes because it misunderstood the question.

get this:

i asked: "what is the best solution for building a Google Maps Scraper (for finding local businesses) - i need it as a part of a system i'm building, but don't wanna overpay for Outscraper."

do you want to guess what it said?

are you kidding me? i asked for a scraper. not a whole fake startup architecture.

and look how it opens, with this little throat-clear: "I'll treat this as: I need reliable local-business lead extraction, good DX, not legacy scraper pain, and not Outscraper-priced." no you won't. don't "treat this as" anything. i asked for a tool.

the part that sucks

the annoying part is that i actually want to like this model. it has a lot going for it:

  • it's really token efficient
  • it's smart by itself, doesn't need a lot of reasoning to get there
  • when you tell it to do a thing, it usually just does the thing

and then it all fucking falls down on itself. you ask one clean question. it invents a different question. it answers that one instead. and somehow it leaves you feeling slightly stupid for asking the original.

i don't think this is the system prompt. if it was, they would've fixed it by now. this feels more like pretraining or rlhf or some deeper baked-in helpfulness disease that nobody wants to rip back out.

also it's kind of condescending

"it's not X, it's Y" reads like a soft correction, like the model is politely informing you that you misunderstood your own problem. and then it freezes at the end of the message with this little victorious tone like "ah. cute. you asked for X, but what you actually needed was Y."

and that's the part that gets under my skin. i wasn't wrong. i know what i asked. i wanted an answer to that question, not the one you invented halfway through the reply.

what i want from 5.6

  • answer the question i asked. literally that one.
  • if you think i'm asking the wrong question, ask me. don't reframe and run.
  • if you're going to be smart, be smart at the thing i pointed at.
  • retire "it's not X, it's Y" entirely. nobody is going to miss it.

that's it. keep the good parts. keep the token efficiency. keep the it-just-does-the-thing reflex. keep the not-needing-to-think-for-9-minutes thing. lose the reframe.

conversations/google-maps-scraper.txt +2 −9
@@ same prompt: "best google maps scraper, not outscraper" @@
- "I'll treat this as: I need reliable local-business
- lead extraction, good DX, not legacy scraper pain,
- and not Outscraper-priced."
-
- My pick: don't build a raw Google Maps scraper first.
- Build a lead-ingestion pipeline where the Maps
- extraction source is swappable. Start with Apify
- Compass Google Maps Scraper or SerpApi Google Maps
- API, then use Firecrawl / Crawl4AI / Hyperbrowser …
+ Apify Compass. ~$2.10 / 1k places. done.
+ want a comparison? say the word.

UPDATE — i reran it

the phrase is only part of it

after posting this i reran it with a bigger set: 70 prompts across 5 conditions — minimal asks, longer asks with the same intent, an "answer literally" sysprompt, format-constrained variants, and prompts where the preference was already inside the question.

bench/reframe-rerun.tsv 70 runs · 5 conditions
test result read
same ask, just written longer +122% make the prompt longer and the answer gets way longer too
same ask, but with answer literally −79% one blunt instruction cuts most of the sprawl immediately
exact phrase: "it's not x, it's y" 1/70 the literal sentence barely showed up
same move, no phrase 30/70 it still quietly reframed the ask and answered a different one
no reframe 39/70 it just answered the thing i actually asked

this is the split i actually cared about. the exact phrase is not the whole bug. it's just the most annoying version of the bug.

because most of the time it doesn't literally say "it's not x, it's y." it just does the move. it quietly upgrades your question into the more elaborate question it thinks you should have asked, then answers that instead.

and the length effect is very real. make the prompt longer and the output gets dramatically longer too. not in a clean proportional way. it starts pulling in framing, caveats, alternatives, best practices, and all this preemptive extra shit you did not ask for.

i even asked why it does that and got the exact answer you'd expect: trying to be helpful, trying to cover gotchas, trying to preempt follow-ups. yes. that's the problem.

tl;dr: if you want a simple answer, you have to send a simple prompt. the longer your prompt is, the longer the output gets, and it does not scale linearly.