Reverse‑Engineering ChatGPT’s Hidden Search Using a Chrome Extension
Photo by Zulfugar Karimov (unsplash.com/@zulfugarkarimov) on Unsplash
While users expect a single, clean answer from ChatGPT’s Browse mode, reports indicate the model actually fires 3‑12 covert search queries, sifts through dozens of sources, and selects citations without the user ever seeing the process.
Quick Summary
- While users expect a single, clean answer from ChatGPT’s Browse mode, reports indicate the model actually fires 3‑12 covert search queries, sifts through dozens of sources, and selects citations without the user ever seeing the process.
- Key company: ChatGPT
The Chrome extension that William C. built reveals a hidden, multi‑step search pipeline inside ChatGPT’s Browse mode, a process that ordinary users never see. By hijacking the browser’s fetch API before OpenAI’s front‑end scripts load, the extension clones the response from the /conversation endpoint and parses the Server‑Sent Events (SSE) stream that carries a series of JSON‑Patch operations (RFC 6902). Those patches, which incrementally mutate a response object, embed the actual search queries, source URLs, and citation decisions before any textual answer is streamed to the UI. As C. explains, “the search queries are embedded in the stream before the text response appears,” meaning the model decides what to look up, reads the results, and only then begins composing its reply.
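The JSON‑Patch mechanics behind that incremental mutation can be sketched in a few lines. The applier below is a hypothetical illustration, not the extension’s code: it handles only the `add` operations the article describes, which is enough to show how a streamed sequence of patches rebuilds a response object piece by piece.

```javascript
// Minimal RFC 6902 "add" applier -- an illustrative sketch, not the
// extension's actual code. Intermediate containers are created as patches
// arrive, which is how a streamed response object is built up.
function applyAdd(doc, path, value) {
  // RFC 6901 pointer: split on "/", drop the leading empty segment,
  // and unescape ~1 -> "/" and ~0 -> "~".
  const keys = path
    .split("/")
    .slice(1)
    .map((k) => k.replace(/~1/g, "/").replace(/~0/g, "~"));
  let parent = doc;
  for (let i = 0; i < keys.length - 1; i++) {
    const k = keys[i];
    if (parent[k] === undefined) {
      // Pick the container type from the next segment: numeric -> array.
      parent[k] = /^\d+$/.test(keys[i + 1]) ? [] : {};
    }
    parent = parent[k];
  }
  const last = keys[keys.length - 1];
  if (Array.isArray(parent) && last === "-") parent.push(value); // append
  else parent[last] = value;
  return doc;
}

// A patch path like the one described in the article:
const doc = applyAdd({}, "/message/content/parts/0", "hello");
```

Replaying each patch against an initially empty object reproduces the response the UI eventually renders; in the pipeline the article describes, the hidden queries and citation decisions appear in those intermediate states before any answer text does.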
Analyzing more than 500 sessions, C. found that a single user prompt triggers an average of 8.2 distinct search queries. The queries are not simple rephrasings; they are strategic reformulations that target different facets of the answer. For example, when asked “Is my website optimized for AI search?” the model issued queries such as “AI search optimization techniques 2026,” “GEO generative engine optimization checklist,” and “Schema.org markup AI citations.” This “query multiplication” suggests that ChatGPT performs a breadth‑first information sweep rather than a single lookup, a behavior that aligns with OpenAI’s own description of its “deep research” agent, which uses reasoning to gather multiple sources before responding (Bloomberg).
The extension also quantifies a “Reformulation Gap”: 47% of the queries differ semantically from the user’s original phrasing. C. calls this gap the model’s internal rewrite of intent, aimed at surfacing higher‑quality evidence. In practice, the model may discard the user’s exact wording in favor of terms it deems more likely to retrieve relevant documents. This gap raises transparency questions, because the citations displayed in the UI are a curated subset of the dozens of sources the model actually consulted. Users see only the final, hand‑picked references, while the underlying search activity remains opaque.
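To make the “Reformulation Gap” idea concrete, one crude way to flag rewritten queries is to score lexical overlap between the user’s prompt and each issued query, counting queries that fall below a threshold. This is purely illustrative: the article does not say how C. computed the 47% figure, and the `jaccard` measure and 0.3 threshold here are assumptions of this sketch.

```javascript
// Illustrative only: flag "reformulated" queries by lexical overlap.
// This is NOT C.'s methodology, just one plausible stand-in.
function tokens(s) {
  return new Set(s.toLowerCase().match(/[a-z0-9.]+/g) ?? []);
}

// Jaccard similarity between the token sets of two strings.
function jaccard(a, b) {
  const A = tokens(a), B = tokens(b);
  const inter = [...A].filter((t) => B.has(t)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 0 : inter / union;
}

// Fraction of queries whose overlap with the prompt falls below a
// threshold, i.e. queries the model substantially rewrote.
function reformulationGap(prompt, queries, threshold = 0.3) {
  const rewritten = queries.filter((q) => jaccard(prompt, q) < threshold);
  return rewritten.length / queries.length;
}
```

Run against the article’s example, the prompt “Is my website optimized for AI search?” shares almost no tokens with “GEO generative engine optimization checklist,” so all three observed queries would be flagged as reformulations under this measure.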
Technically, the extension’s interception works by injecting a script in the Chrome Manifest V3 “MAIN” world, overriding window.fetch with a wrapper that clones the response and feeds its body into a custom SSE parser. The parser reads the stream chunk by chunk, decodes each line, and extracts JSON data from lines prefixed with data:. Each JSON object contains a JSON‑Patch operation, such as an add on a path like /message/content/parts/0, which the extension interprets as a hidden query or citation. By preserving the original response (via response.clone()), the wrapper avoids breaking the UI while still exposing the raw data to the developer console.
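The interception pattern described above can be sketched as follows. The function names (`installFetchTap`, `parseSSELines`) and details are hypothetical, not taken from the extension’s source; only the overall shape (override fetch, `clone()` the `/conversation` response, scan `data:` lines) follows the write‑up. In the real extension this would be injected into the page’s MAIN world by a Manifest V3 content script.

```javascript
// Pull JSON payloads out of "data:" lines in one SSE chunk.
function parseSSELines(chunk) {
  const events = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data:")) continue;
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") continue;
    try { events.push(JSON.parse(payload)); } catch { /* partial frame */ }
  }
  return events;
}

// Wrap fetch so /conversation responses are cloned and mirrored to a
// callback, while the original stream still reaches the page untouched.
// In the extension, `target` would be `window`; it is a parameter here
// so the sketch stays environment-agnostic.
function installFetchTap(target, onEvent) {
  const original = target.fetch.bind(target);
  target.fetch = async (input, init) => {
    const response = await original(input, init);
    const url = typeof input === "string" ? input : (input.url || String(input));
    if (url.includes("/conversation") && response.body) {
      const tap = response.clone(); // the UI keeps its own copy intact
      (async () => {
        const reader = tap.body.getReader();
        const decoder = new TextDecoder();
        for (;;) {
          const { done, value } = await reader.read();
          if (done) break;
          parseSSELines(decoder.decode(value, { stream: true })).forEach(onEvent);
        }
      })();
    }
    return response;
  };
}
```

Because the tap reads from a clone, the page’s own consumer of the stream is unaffected, which matches the article’s point that the UI keeps working while the raw patch operations are surfaced to the developer console.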
The findings have immediate implications for both users and developers. For SEO practitioners, the fact that ChatGPT routinely rewrites queries means that optimizing content for the exact phrasing a user might type is insufficient; the model’s internal reformulations could favor alternative terminology. For enterprises that rely on ChatGPT’s citations to validate information, the hidden search layer suggests that the model’s confidence may be based on a broader evidence set than the displayed references imply. OpenAI’s own research portal notes that its “deep research” agents are designed to reason across multiple sources, but the Chrome extension provides the first concrete, user‑level view of that process (Bloomberg).
Finally, the reverse‑engineering effort underscores a broader tension between AI transparency and proprietary architecture. While C.’s extension is a clever, open‑source probe into OpenAI’s internals, it also demonstrates how much of the model’s decision‑making pipeline is deliberately concealed from end users. As AI assistants become more embedded in workflows, the demand for tools that surface these hidden steps will likely grow, prompting both developers and regulators to reconsider what “visible” AI truly means.
Sources
No primary source found (coverage-based)
- Dev.to AI Tag
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.