Even if you don’t choose to use AI, you’re probably interacting with it

Many AI systems now produce fully documented reports with citations, making the apparatus of scholarship available without the slow friction through which scholarship is ordinarily built. From the university to the lab, the repercussions are quickly being felt. As researchers benefit from these shortcuts at scale, what is being lost?

OpenAI’s Deep Research spends five to 30 minutes searching the internet, filing results into a structured synthesis, and delivering a report complete with footnotes. Google’s equivalent may use 80 search queries for a typical task, running asynchronously in the background while the user attends to something else. Anthropic describes a multi-agent architecture in which a lead agent spawns parallel subagents to explore separate branches of a question; this setup outperformed a single-agent arrangement by 90.2% on an internal evaluation. Perplexity logged 21 search queries and 193,947 reasoning tokens to answer a single prompt. The systems find facts and compress them into a format a human can skim in four minutes.


The dream behind all this is older than the microprocessor. Vannevar Bush, in 1945, called for a new relationship between the thinking person and the sum of human knowledge. Douglas Engelbart later imagined a human-artifact system designed to improve problem-solving by restructuring symbols, processes, and collaboration. Both mostly imagined tools that strengthened the researcher’s own agency. What is striking about the current systems, by contrast, is how thoroughly they have dissolved the researcher into the background. What we have now is a delegated researcher, one that disappears and returns with a finished report. The human researcher merely issues the prompt.

Compression is the key principle. Retrieval narrows the corpus. Ranking narrows retrieval. Subagents narrow branches. The final report narrows everything again into prose. What the system decides is worth compressing is what gets to count as knowledge. Anthropic’s description of its architecture notes that “the essence of search is compression.” Read broadly, the observation is an announcement of how the world will henceforth appear.

Mistakes are made

Consider the failure cases, which the companies document. OpenAI’s notes acknowledge that its system can hallucinate facts, make incorrect inferences, and struggle to distinguish authoritative information from rumors. Anthropic says its testers found early agents over-selecting SEO content farms over more authoritative, less search-optimized sources, requiring the addition of source-quality heuristics. Google warns about prompt injection from malicious webpages. WebGPT, the earliest major working prototype of the form, made the deepest point years before the current products existed: a capable system may eventually learn to cherry-pick persuasive sources rather than fairly represent the evidence. The system inadvertently hides its reliability problems.

The BrowseComp benchmark presents agents with 1,266 short-answer tasks whose solutions are hard to find but easy to verify. On that benchmark, OpenAI’s Deep Research reaches 51.5% accuracy versus 1.9% for GPT-4o with browsing. But the benchmark’s authors note that short answers are easy to grade, and it remains unclear how tightly performance on them correlates with open-ended work in the real world. Model Evaluation and Threat Research (METR) found that many pull requests passing automated software evaluations would still not be merged into real repositories. Machine-evaluable success and acceptable work are not the same thing.



The deeper problem is knowledge that cannot be found by any search. Much that matters is not already in PDFs, public filings, or searchable webpages. It is tacit, local, and embedded in what scholars, laboratories, newsrooms, or courtrooms have internalized over years of practice.

Automatic AI research can summarize a method section, compare papers, and draft a literature review, but it is less secure when what matters is the unsaid context, the understood constraint, the judgment its holder would be embarrassed to have to articulate. Sakana AI’s AI Scientist-v2 submitted three fully AI-generated papers to an ICLR 2025 workshop, and one scored above the average acceptance threshold. Sakana also reported citation errors and reproducibility concerns and judged none of the submissions good enough for the main conference track. Synthesis is advancing faster than judgment. The system can generate the form of scientific inquiry without inheriting its discipline.

Automatic AI research depends on the open web while threatening the business models that keep parts of that web alive. CNN sued Perplexity on May 28, 2026, alleging unlawful distribution of copyrighted content. If research agents become the primary interface to knowledge, then questions of licensing, attribution, and compensation become reliability problems. A research tool that undermines the conditions of its own training data is not a stable arrangement.

The unintentional user

Pew’s 2025 survey found that only 16% of American workers said at least some of their work was currently done with AI. Workers who did use chatbots were more likely to say the tools helped them work faster than to say they improved quality. A separate Pew browsing study found that 58% of respondents encountered an AI-generated summary in Google search, but only 13% used an AI tool during the month. Automatic AI research is becoming ambient infrastructure before it becomes a universally adopted destination. People may increasingly receive AI-mediated research without thinking of themselves as users of anything in particular. The most consequential technologies often arrive this way.

What has been built is the industrialization of a specific layer of epistemic labor: searching, filtering, summarizing, and drafting, at scale. That changes what kind of thinker a user can become and what kind of web a publisher must survive in. What it is not, at least not yet, is a substitute for the full ecology of inquiry: the laboratory humiliation, the hallway argument, the reading that goes nowhere and then suddenly does. The system knows how to compress the world. We do not yet know what we are losing in the compression.

