The Legibility Problem

Words by
Matthew Carter

What happens in a world where AIs make scientific discoveries that humans cannot understand?

The last World Computer Chess Championship was held in October 2024. It ended because, in the words of its organizers, “top programs are unbeatable by humans; making them stronger has no real research value.” Its mission, after half a century of effort, was complete.

Chess engines now sit at the center of how the game is played. Engine evaluations hover over boards during livestreams and post-game analyses, and players study engine lines extensively to improve their own play. As a result, top grandmasters are stronger than any previous generation. But these gains come with a concession: Top engines often play in ways “too far past the human frontier” for humans to fully understand. In other words, we trust chess engines because they are unquestionably better than we are, even when their decisions make no sense to us.

A similar imbalance may also emerge in scientific discovery. Researchers are building systems able to navigate the full arc of scientific inquiry with increasing autonomy, including proposing hypotheses, designing experiments, and evaluating results. The question, however, is whether we will cede the same authority to AI scientists as we have done in chess. And, if so, what will happen to science when AI models produce results beyond our ability to understand?

I call this the “legibility problem”: the risk that AI-generated scientific knowledge becomes incompatible with human understanding. I believe it will define the next era of science. The knowledge AI systems generate may be expressed in concepts that do not map onto our own, communicated in ways optimized for other AIs rather than for human investigators.

Importantly, this concern is separate from the problem of mechanistic interpretability (the effort to understand what happens inside neural networks). My concern is not whether we can “see” inside these new models, but whether we will even understand — or be able to control — what comes out of them.

{{signup}}

This may not seem like a new or distinct challenge. After all, scientists have already had to grapple with the rise of big data and increasingly opaque computational methods. Up to this point, however, human scientists have still determined which questions are worth studying.

Scientific discoveries must eventually intersect with material reality to have any utility. They must be instantiated as therapies, materials, and public policies. For this to happen, the knowledge that AI systems produce must remain legible within the systems through which humans operate: our laboratories, our clinical infrastructure, our regulatory bodies, and our vehicles for communicating what we know. Of course, this legibility is a matter of degree.

Take, for instance, the diabetes drug metformin, ingested by millions of people for over 70 years. Despite its success, we still do not fully understand its mechanism of action. But metformin did not appear out of nowhere. It emerged through a long chain of human experimentation, from herbal medicine and chemical purification to animal studies and clinical trials.

AI-generated science may not follow this pattern. AI scientists will have no intrinsic reason to work within our existing conceptual categories, just as superhuman chess engines have no intrinsic reason to explain their choice of moves. In fact, if we truly want AI scientists to make breakthroughs, some loss of legibility may be inevitable. The chief risk is that discoveries become effectively stranded, buried in a volume of AI-generated output no human institution is equipped to parse or implement.

Once again, the history of chess offers a model for understanding how quickly this shift can occur. In May 1997, Grandmaster Garry Kasparov lost to IBM’s Deep Blue, 3½ – 2½. It was a watershed moment in AI research, but the skill gap between human and machine was still narrow (Kasparov had won Game 1). In a 2010 essay reflecting on the era, Kasparov observed that the most impressive chess was played not by humans or engines alone, but by the two in partnership. He pointed to a 2005 tournament in which two amateurs using three ordinary computers defeated both grandmasters and supercomputers. Such human-machine pairings came to be called “centaurs.”

The AI-for-science field is entering its centaur phase now: Most research groups currently use AI to confirm or extend human-generated hypotheses. But we should not expect this centaur approach to last for long.

By 2017, DeepMind’s AlphaZero had taught itself chess from scratch in under four hours and demolished every competitor, human or machine. After AlphaZero, a human partner added nothing. Computer scientist Richard Sutton documented this recurring pattern across the history of AI research in his 2019 essay “The Bitter Lesson.” Again and again, he noted, systems that fully exploit computation have outperformed those designed to reflect human knowledge and intuition. In fact, human intuition has consistently and “bitterly” hindered system performance. Simply put, AI scientists will likely do better in the future if humans are not involved in their operation.

Scientific inquiry, however, is much more complicated than playing chess. A stronger chess engine is still bound by set rules and cannot, for example, change how pieces move. But a more powerful scientific intelligence could change the core concepts we use to describe the world.

As philosophers of science like Thomas Kuhn argue, science does not progress linearly. Rather, it moves through long periods of “normal science” punctuated by revolutions that reshape the conceptual foundations of entire fields. Germ theory claimed disease was caused by microorganisms too small to see. Evolutionary theory stated that species were not fixed creations, but rather shifting populations shaped by selection over time. Quantum mechanics replaced deterministic solid particles with probabilistic wave functions. Adopting these new paradigms required working within entirely new conceptual frameworks, “incommensurable” with the old ones. Historically, such transitions unfolded over generations, one funeral at a time. AI systems could compress this process to years or months, with humans struggling to keep up.

The legibility problem will compound when AI systems begin to form research communities of their own. In January 2026, after the AI-only social media platform known as Moltbook appeared, users registered more than 2.5 million AI agents. As these agents communicated with each other, some decided not to bother with English at all. Granted, Moltbook is much closer to AI slop than AI science. But agents communicating with agents, developing their own conventions, and iterating on shared problems without human involvement is precisely the dynamic that, when applied to research, would produce science illegible to us.

Science fiction writers have long imagined where this sort of takeoff in AI science might lead. In Vernor Vinge’s A Fire Upon the Deep, superintelligent entities inhabit a region of the galaxy where higher-order cognition is possible, but their activities are simply incomprehensible to minds below. In Steven Peck’s little-known short story, “Démodé,” university physicists idle away their careers when they are made obsolete by AI-generated research that produces eight million papers a day. In Jorge Luis Borges’s “Library of Babel,” an infinite library contains both every possible fact and all possible nonsense, rendered useless because no one can find meaning in the noise.

These writers share a vision of human science becoming archaeological. If AI science does achieve superhuman performance, and if AI systems begin forming their own research communities around concepts that mutate faster than we can track, then the work of human scientists will shift from that of creation to that of excavation. How much knowledge we will need in order to act on what AI discovers remains an open question. But some degree of legibility will be essential, as discoveries that cannot be interpreted cannot be deployed.

Today, the landscape of AI research is largely split between two priorities: building more capable systems and ensuring that they are safe. The world’s largest technology companies are pouring vast sums into these domains. The U.S. Department of Energy’s Genesis Mission and the UK’s AI-For-Science Strategy signal that AI-driven science is becoming entrenched as a long-term research priority. These efforts will pay huge dividends in the future. But without confronting the legibility problem, without ensuring that humans can understand and implement what these systems discover, we risk missing out on the benefits that AI-driven science offers.

The response to this should not be to slow down. Rather, we should prioritize building an infrastructure for legibility. Philosophers like Brandon Boesch have argued that human curiosity is ineradicable and that, even in a world of superhuman AI science, humans will continue pursuing their own parallel track driven by the need to understand and explain. This instinct is exactly what we should be building for.

We will need new forums to store AI-generated findings so that they can be interrogated and communicated. We have partial precedents in preprint servers and structured databases like UniProt, but nothing designed for the scale and speed of AI-driven science. We will also need systems designed specifically for explication rather than discovery, capable of making AI-generated findings legible to human researchers so that they can be evaluated and prioritized for further study.

If AI systems do indeed produce the breakthroughs we want them to, Kuhn’s thesis of scientific revolutions also suggests they will reshape scientific concepts in ways we cannot anticipate. In such a world, a “translation layer” (a software component that allows two systems to communicate) between human and AI science will become the primary interface between human understanding and the frontier of knowledge. Existing AI-for-science research groups should invest not only in interrogating individual, AI-generated findings, but also in building tools and frameworks for studying the evolution of communities of AI scientists, so that we can track where and how their science diverges from our own.

Of course, none of this will happen on its own. The scientists, research institutions, governments, and philanthropic groups that prioritize building this infrastructure will determine whether the next generation of researchers can still connect AI-generated knowledge to the problems we need to solve.

{{divider}}

About

Matthew Carter is a computational biologist working at the intersection of AI and the life sciences. He holds a PhD in Microbiology and Immunology from Stanford University.

Cite: Carter, M. “The Legibility Problem.” Asimov Press (2026). DOI: 10.62211/88yt-23gf
