Core thesis: Large language models have not just memorised what humans know. They have absorbed how humans think: the recursive process of questioning, testing, and refining that produces all knowledge. This is why they are now solving problems once thought beyond the reach of machines. But as AI takes over the work of discovery, we risk losing the understanding that only comes from doing the work ourselves.
Introduction: The Simulacrum and Its Critics
A profound tension defines the frontier of artificial intelligence. On one side stands a chorus of informed scepticism. Chollet (2019) argues that Large Language Models (LLMs) are interpolation engines, constrained by their training data and helpless before tasks where no objective ground truth exists. Marcus (2020) contends that deep learning alone cannot achieve robust intelligence without structured, symbolic reasoning. Mitchell (2019) questions whether LLMs possess genuine understanding or merely fluent mimicry. Shanahan (2024) frames the issue starkly: LLMs are discourse simulators, role-playing as reasoners rather than actually reasoning. Kambhampati et al. (2024) demonstrate that LLMs cannot plan autonomously but can serve as idea generators when paired with external verification, a framework they call LLM-Modulo.
On the other side lies a rapidly accumulating body of empirical evidence from 2025 and 2026. AI systems have autonomously resolved open mathematical conjectures, verified by Terence Tao (2025). A de novo theoretical physics contribution has been published in Physics Letters B (Hsu, 2025). Creative leaps that decades of human effort had not produced are appearing with increasing frequency. Crucially, these results are not recombinations of known strategies: the mathematical resolutions deployed approaches absent from the prior human literature on those specific problems, and mechanistic analysis of transformer weights reveals not memorised surface statistics but compositional semantic primitives, structured relational features that go beyond pattern retrieval (Im et al., 2026).
Reconciling these realities requires a re-evaluation of what LLMs have actually learned. They have not merely memorised the destinations of human knowledge: facts, equations, code snippets. They have internalised the fractal generative process of human inquiry itself, a recursive, scale-invariant architecture of choice-making that operates identically whether one is selecting the next word in a sentence, the next argument in a debate, or the next paradigm in a research programme. By absorbing the deep metafeatures of this process (abstraction, analogy, parsimony, negation) they have acquired a compass that functions even where the map is blank.
The sceptics are right that these domains lack immediate objective feedback. But absence of feedback is not absence of structure. The domains where AI is now producing breakthroughs (e.g., material science, theoretical physics, software architecture) are saturated with higher-order regularities that function as implicit rails. The barrier to AI performance in these domains is temporal and combinatorial, not ontological. This essay traces how an architecture of discovery, combining inverse design, adversarial agent swarms, and physically grounded simulation, is crossing the threshold from retrieving known truths to building entirely new ones. It also argues that several of the sceptics' own frameworks converge with this architecture: Kambhampati's LLM-Modulo verification loop, Chollet's constructive push toward genuine abstraction, and the broader demand that generation and verification be set in productive tension. The debate is not about whether AI is useful but about what kind of intelligence it has acquired and what it still lacks. Yet this transition brings a critical risk: when machines undertake the friction of discovery on our behalf, humans may gain the artifact but lose the understanding that the struggle itself produces. I have elsewhere called this phenomenon the Encoding Gap (Vasa, 2025).
I. The Fractal Generative Process: A Choice-Making Architecture
Human text, as it accumulates across centuries of scientific papers, philosophical arguments, engineering reports, and literary works, is not merely a repository of claims. It is a compressed trace of a deeper generative process: the recursive navigation of possibility spaces through selection, expression, negation, and refinement.
This process has been independently recognised across multiple intellectual traditions. Peirce's (1992) triadic cycle of inquiry describes its logical skeleton: abduction (generating hypotheses from surprising observations), deduction (deriving testable predictions), and induction (evaluating predictions against evidence). Popper's (1963) conjecture-and-refutation model captures its adversarial dynamics: bold conjecture followed by sustained attempts at falsification. What I propose here is a synthesis of these converging traditions, decomposed into five recurring stages:
Perception of a space of possibilities: open questions, tensions, anomalies.
Selection of a trajectory via heuristics: elegance, analogy, parsimony, prior success patterns.
Negation and variation: counterarguments, edge cases, alternative framings.
Expression and externalisation (writing, speaking, modelling) to make the trajectory criticisable.
Integration of feedback (empirical, logical, or social) to update latent priors.
This loop recurs at every scale of human inquiry, a property I describe as structurally recursive. The same pattern of hypothesis-test-revise appears whether one is constructing a sentence, mounting a rebuttal, or reorienting a research programme. I use "fractal" in this qualitative sense of self-similar recurrence across scales, not in the strict mathematical sense of measurable fractal dimension.
Large language models, trained on vast corpora encoding such loops at every scale, have statistically internalised this structurally recursive mechanism. They are not merely memorising outcomes; they are simulating the generative process itself, hence the simulacra quality that so troubles critics. The transformer's attention mechanism (Vaswani et al., 2017) offers a suggestive architectural parallel: at each layer, it attends to patterns at increasing levels of abstraction. I do not claim a demonstrated mechanistic correspondence between attention layers and the five stages above. That remains an open empirical question. But the architectural capacity for hierarchical, context-sensitive pattern composition is at minimum consistent with the internalisation of a structurally recursive process. Recent work by Im et al. (2026) provides initial mechanistic evidence in this direction. By deriving closed-form expressions for transformer weights at early training stages, they show that learned weights decompose into three interpretable basis functions: a bigram mapping capturing local sequential dependencies, an interchangeability mapping capturing functional similarity across tokens that share grammatical or semantic roles, and a context mapping encoding longer-range co-occurrence patterns. These are not raw corpus statistics but compositional semantic primitives. The interchangeability mapping is particularly telling: tokens that play analogous roles in different contexts are grouped together, a form of abstraction at the weight level. Validated against both toy transformers and the Pythia-1.4B model trained on real-world web text, these results suggest that even at the level of individual weight matrices, transformers are learning structured relational features that go beyond memorisation. This does not demonstrate the full five-stage correspondence proposed above, but it establishes that the building blocks of the weights are compositional and semantic in character, precisely the kind of substrate on which higher-order metafeatures could be encoded.
The metafeatures that characterise this process (parsimony, coherence, analogy, negation) are not an ad hoc list. They correspond to the regulative principles that recur across the independent formulations cited above: Peirce's economy of hypotheses, Popper's demand for falsifiability (a form of negation), and the broader epistemological requirement that explanations cohere with prior knowledge. These are the heuristics that constrain the search space of inquiry at every scale.
Because the mechanism is distilled from the aggregate of historical human traversal, LLMs can operate in the same creative arena as humans, provided human creativity does not draw from a source genuinely beyond the historical record. The question of whether non-computable or non-representational sources of insight exist is a live debate in philosophy of mind, with substantive positions on both sides (Penrose's Gödelian arguments, Dreyfus's phenomenological critiques, and computationalist rebuttals from Dennett and others). I do not claim this question is settled. I observe only that on empirical evidence to date, the breakthroughs cited in this essay do not require non-computable sources to be explained. The compressed generative process appears sufficient. Should evidence of genuinely non-computable human insight emerge, the scope of this thesis would narrow accordingly.
This is the essay's central interpretive framework: the fractal generative process is proposed as the latent structure that makes apparently "non-verifiable" domains navigable. The empirical sections that follow test this framework against evidence from multiple domains.
II. The Trap of the Forward Model: Fishing in a Stagnant Pond
To understand why the generative process matters as more than a theoretical framework, consider the limitations of the current scientific workflow. A significant proportion of modern research still relies on a manual, extractive process. Researchers scan vast volumes of literature to identify the characteristics and properties of materials under different process conditions, such as how a specific alloy behaves under varying construction or test regimes. This form of characterisation exposes components of the truth but remains fundamentally limited.
The best we can typically achieve in this paradigm is to build predictive models: Random Forests, gradient-boosted trees, or shallow neural networks. These models excel at finding correlations within existing data, effectively asking: "Given a structure, predict the property." However, this is a forward model, constrained by the deductive closure of its inputs. It can only derive conclusions that are logical consequences of the patterns already present in the training data. This limitation is broadly consistent with Chollet's (2019) argument that systems trained to minimise surprise relative to a fixed distribution cannot transcend that distribution. Marcus (2020) reaches a similar conclusion from a different direction: without structured knowledge representations that encode causal relationships, neural networks remain brittle pattern-matchers unable to generalise robustly. If you train a system to maximise plausibility based on existing literature, you effectively train it to suppress the anomalies where scientific revolutions live.
The forward model captures only the lowest scale of the generative process: the step from known premises to known conclusions. It cannot recapitulate the perception of anomalies, the selection of novel trajectories, or the negation that drives inquiry beyond its current boundaries. You cannot interpolate your way to a paradigm shift, because statistical learning over a closed theoretical system cannot derive phenomena that the system's axioms rule out. This approach leaves us fishing in the stagnant pond of known truths.
III. Reframing the Non-Verifiable: Navigating the Fog
The sceptic's critique argues that moving beyond the pond is impossible for AI because "non-verifiable" tasks lack objective structure. Without an immediate loss function (a compiler error, a benchmark score) an AI cannot iterate and will simply hallucinate.
However, the binary of verifiable versus non-verifiable is a simplification. Most high-value human work exists in a middle ground better described as delayed-decidable: tasks where feedback exists in principle and will eventually arrive, through expert review, experimental testing, market outcomes, or downstream consequences, but on timescales far longer than the immediate decision.
This category is deliberately bounded. I am not claiming that all non-verifiable tasks are delayed-decidable. Genuinely undecidable problems exist: the halting problem, Gödelian sentences, and perhaps certain irreducibly subjective aesthetic judgements where no consensus will ever form. The claim is narrower and more precise. A large and practically important class of tasks that appear non-verifiable in the moment are in fact delayed-decidable, and this class includes much of what matters in material science, theoretical research, and software architecture.
Consider the analogy of navigation. Verifiable tasks (writing code that must compile, solving a textbook problem with a known answer) are like driving in a city with GPS. Feedback is immediate: turn down a one-way street and the environment signals "wrong." LLMs excel here because the error signal is clear.
Non-verifiable tasks (designing a corrosion-resistant alloy for a novel environment, formulating a new theorem, architecting a complex software system) are like navigating a foggy forest. There are no signs. You may not know whether a decision was correct for months or years. The sceptics argue that without signs, navigation is impossible for machines.
Yet human experts navigate the fog successfully. They rely on metafeatures: higher-order heuristics like "follow the slope of the land" (gradient of utility), "avoid dense thickets" (parsimony), or "backtrack when the air grows stale" (negation of unproductive paths). These are precisely the regulative principles of the generative process described in Section I, the same heuristics that Peirce and Popper identified as governing productive inquiry.
Kambhampati et al. (2024) offer a revealing bridge here. Their LLM-Modulo framework concedes that LLMs cannot plan or verify autonomously, but demonstrates that they can serve as powerful generators when coupled with external critics, solvers, and verifiers. This is not a refutation of the present thesis; it is a convergent formulation. The delayed-decidable framework makes the same structural point: LLMs navigate using internalised metafeatures, and the "modulo" verification arrives later, whether from an external solver, an experimental test, or expert review. What Kambhampati frames as an architectural requirement, I frame as a temporal one. Both recognise that the generative and verificatory functions can be separated.
The critical reframing is this: the fog is temporal and combinatorial, not ontological. Delayed-decidable tasks lack immediate objective structure, but they are saturated with higher-order regularities (coherence, predictive power, parsimony, elegance) that function as implicit rails. LLMs, having trained on vast corpora of human "hiking logs" (our collective textual history), have distilled these metafeatures deeply enough to navigate productively even where feedback arrives slowly or indirectly.
The "non-verifiable," at least within the delayed-decidable class, was never undecidable. It was waiting for a traveller with the right compass.
IV. The Evidence: When AI Traverses the Fog
A thesis this bold demands empirical evidence. The period from late 2025 to early 2026 has furnished it, though the evidence requires careful interpretation.
In mathematics, AI systems autonomously contributed to the resolution of several open Erdos problems, including problems #728, #729, and #1026, with novel arguments that were formally verified, in some cases by Terence Tao himself (Tao, 2025). These conjectures had resisted human effort for decades, precisely because the search space lacked the clear, step-by-step feedback of a textbook proof. The problems inhabited the fog. Yet AI traversed it, not by brute-force search, but by deploying the kind of analogical and structural reasoning that characterises the generative process: perceiving latent connections across mathematical sub-domains, selecting promising trajectories via learned heuristics of elegance and consistency, and iterating until a proof crystallised.
In theoretical physics, Hsu (2025) published a paper in Physics Letters B on relativistic covariance and nonlinear quantum mechanics in which the core idea, a genuinely novel theoretical contribution, originated de novo from GPT-5. Hsu has described this as potentially the first published physics paper where the central conceptual advance was proposed by an LLM. That the result has attracted substantive critical engagement rather than dismissal (Oppenheim, 2025, challenges the sufficiency of the proposed criterion for relativistic covariance) confirms that the AI-generated idea operates within the generative process of legitimate scientific inquiry. The debate concerns the physics itself, not whether the contribution is meaningful.
A candid assessment requires acknowledging what this evidence does not establish. These are selected successes. We do not know the base rate: how many AI-generated proofs or physics conjectures were attempted, rejected as trivial, or found to be incorrect for every one that succeeded. Without this denominator, the evidence is subject to survivorship bias. What the evidence does establish is that autonomous AI traversal of delayed-decidable domains is possible, a claim that the sceptical frameworks of Chollet, Marcus, and Mitchell would each predict to be unlikely, and that the trajectory across 2025-2026 shows increasing frequency and reliability of such traversals rather than isolated flukes.
The sceptics would likely respond that mathematics and physics succeed because they are rich with historical priors, centuries of text encoding deep patterns, proofs, and heuristics, and that success involves eventual human verification. Shanahan (2024) would add that the AI is not "understanding" the mathematics but simulating the discourse of mathematical reasoning, which in a rich-prior domain may be sufficient to produce correct results without genuine comprehension. These are serious counter-arguments.
But they do not fully explain the results. If these were merely sophisticated interpolation within a high-prior domain, or discourse simulation without understanding, we would expect the AI to produce results that are recognisably similar to existing approaches: recombinations of known strategies. The Erdos resolutions deployed strategies not previously seen in the human literature on those specific problems. The novelty is not merely in the answer but in the method of arriving at the answer, precisely the signature of a system that has internalised the generative process, not just the products of prior traversals. This does not definitively defeat the interpolation or simulation counter-arguments, but it shifts the burden: the sceptic must explain how interpolation or role-playing produces strategies that are novel relative to the training distribution. Mechanistic evidence from Im et al. (2026) adds a further complication for the discourse-simulation view: if the weights encode compositional semantic primitives (functional categories, contextual relationships) rather than surface co-occurrence patterns, then what the model has learned is closer to structured relational knowledge than to a script for role-playing.
I should also note what this evidence does not address. Chollet's ARC-AGI benchmark (Chollet, 2019), developed as a constructive tool to measure and push frontier abstraction capability, tests a different axis of intelligence: sample-efficient acquisition of novel abstractions from minimal examples. Van Rooij et al. (2024) raise an even more fundamental concern, arguing that the computational problems underlying human-level cognition may be intractable in principle, meaning that scaling alone cannot bridge the gap regardless of architecture. My thesis does not require that LLMs pass ARC or overcome intractability barriers. It requires that they can navigate delayed-decidable domains where the training corpus is rich. These are distinct claims, and the strength of one does not refute the validity of the other.
V. Latent Structure in the Fog: The Packet Analysis Exemplar
A critic might still demand evidence that higher-order structure exists within apparently opaque, non-verifiable data. The analysis of network traffic provides a useful exemplar from an entirely different domain, though its evidential role in the larger argument must be stated precisely.
To a human analyst, a single encrypted network packet is a non-verifiable data point. It appears opaque, context-less, and effectively random. It lacks obvious "objective structure."
Yet, in empirical work led by Aleks Pasquini, we demonstrated that transformer models, even early architectures based on GPT-2, could identify specific device types from a single packet: distinguishing a camera from a microphone, a sensor from a controller (Pasquini, 2024). The model does not memorise the bits, which change constantly. Instead, it captures the latent semantic behaviour of the device, the generative grammar of how a camera communicates versus how a sensor communicates. It identifies a signal in the noise that is invisible to traditional inspection.
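To make the setup concrete, the sketch below classifies a raw byte sequence with a small transformer encoder. This is a minimal schematic, assuming packets arrive as fixed-length byte arrays; the architecture, dimensions, and class labels are illustrative stand-ins, not the model of Pasquini (2024).

```python
import torch
import torch.nn as nn

class PacketClassifier(nn.Module):
    """Toy byte-level transformer for single-packet device classification."""

    def __init__(self, n_classes: int, d_model: int = 64, max_len: int = 256):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)   # one token per byte value
        self.pos_embed = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, packet_bytes: torch.Tensor) -> torch.Tensor:
        # packet_bytes: (batch, seq_len) integers in [0, 255]
        positions = torch.arange(packet_bytes.size(1), device=packet_bytes.device)
        h = self.byte_embed(packet_bytes) + self.pos_embed(positions)
        h = self.encoder(h)
        return self.head(h.mean(dim=1))   # pool over the packet, then classify

# One synthetic packet; real training requires labelled traffic captures.
model = PacketClassifier(n_classes=4)    # e.g. camera / microphone / sensor / controller
packet = torch.randint(0, 256, (1, 128))
print(model(packet).shape)               # torch.Size([1, 4])
```

The point of the sketch is the input representation: nothing in the bytes is decoded or parsed, so whatever regularity the model finds is latent behavioural structure.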
To be precise: packet classification is a supervised pattern-recognition task with known ground truth, structurally different from the autonomous creative traversal discussed in Section IV. The exemplar does not directly demonstrate that AI can create in delayed-decidable domains. What it demonstrates is the necessary precondition: that apparently opaque, "structureless" data does contain higher-order regularities that high-dimensional models can detect. This is the weaker but essential claim. If the fog were truly devoid of structure, neither classification nor creative traversal would be possible. The packet analysis establishes that the fog has structure; the Erdos and physics results establish that the structure can be navigated creatively.
VI. The Architecture of Discovery: Inverse Design and NSGAN
How do we systematise this capability and escape the limitations of the forward model? The answer lies in inverse design: instead of asking "given a structure, predict the property," we ask "given a desired set of properties, design the structure that generates them." This requires the AI to internalise the causal physics of a domain and invert them, running the generative process in the direction of creation rather than prediction.
A powerful exemplar is the NSGAN (Non-dominated Sorting optimisation-based Generative Adversarial Network) framework. Li and Birbilis (2024) introduced NSGAN for alloy discovery, integrating evolutionary multi-objective optimisation (NSGA-II) with generative modelling. The framework evolves candidate solutions in the GAN's latent space, guided by surrogate machine learning models that predict multiple conflicting properties (such as tensile strength versus ductility). Rather than simply replaying known alloys, NSGAN aggressively explores the Pareto frontier, the thin, elusive boundary of optimal trade-offs that human intuition almost invariably misses, while remaining constrained to physically plausible compositions by the GAN's learned manifold.
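The core loop can be sketched compactly. In the sketch below, a toy generator and two toy conflicting objectives stand in for the trained GAN and the surrogate property models; the non-dominated sorting step is genuine, but every name and hyperparameter is an illustrative assumption rather than the published NSGAN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z):
    """Stand-in for the trained GAN decoder: latent vector -> design."""
    return np.tanh(z)

def surrogates(x):
    """Stand-in surrogate models for two conflicting objectives to maximise,
    e.g. tensile strength versus ductility."""
    strength = x.sum(axis=1)
    ductility = x.shape[1] - np.abs(x).sum(axis=1)
    return np.stack([strength, ductility], axis=1)

def non_dominated(F):
    """Boolean mask of Pareto-optimal rows of objective matrix F (maximising)."""
    mask = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        dominates_i = (F >= F[i]).all(axis=1) & (F > F[i]).any(axis=1)
        mask[i] = not dominates_i.any()
    return mask

# Evolve a population of latent vectors toward the Pareto frontier.
Z = rng.normal(size=(64, 8))
for generation in range(50):
    F = surrogates(generator(Z))
    elite = Z[non_dominated(F)]
    # Refill by mutating elite parents in latent space, keeping candidates
    # on the GAN's learned manifold of physically plausible designs.
    parents = elite[rng.integers(len(elite), size=len(Z) - len(elite))]
    Z = np.vstack([elite, parents + 0.1 * rng.normal(size=parents.shape)])

F = surrogates(generator(Z))
print(f"{non_dominated(F).sum()} designs on the approximate Pareto front")
```

The essential design choice is that mutation happens in latent space: the evolutionary search explores trade-offs, while the generator constrains every candidate to plausible compositions.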
Critics might argue that such tools are merely "forward models in reverse," still bound by the statistical distributions of their training data. There is validity to this: a GAN cannot easily predict a phenomenon forbidden by its training set. However, this critique misses the true nature of the fog in material science. The challenge is often not a lack of fundamental physics but the combinatorial vastness of the search space. The number of possible multi-component alloy compositions, when accounting for compositional variation, processing conditions, and phase structures, is astronomically large. NSGAN acts as a compass in this vastness, finding islands of stability that, while theoretically consistent with known physics, were effectively invisible to human researchers.
In our own recent work (Ghorbani et al., 2025), we extended the NSGAN framework to the domain of corrosion in multi-principal element alloys (MPEAs). The experimental literature on MPEA corrosion is severely sparse: a handful of compositions tested under a limited range of electrolytes and conditions. This sparsity is itself an instance of the fog. The space of possible compositions, phases, and environmental interactions is combinatorially vast, while our empirical data illuminates only a few scattered clearings. Our approach used NSGAN to generate 200,000 physically plausible synthetic data points, with NSGA-II ensuring diverse, non-dominated coverage across the composition-electrolyte space. This generative augmentation dramatically improved the predictive accuracy of forward models for key corrosion metrics: corrosion current density (i_corr), corrosion potential (E_corr), and pitting potential (E_pit). On all three metrics, models trained on the augmented data exceeded those trained on experimental data alone.
While this application was generative augmentation rather than direct inverse optimisation, it demonstrates a critical prerequisite for the broader vision: the GAN had internalised the latent manifold of MPEA corrosion behaviour well enough to generate compositions that were both physically plausible and statistically diverse. The machine learned the generative grammar of how alloy composition, phase structure, and electrolyte interact to produce corrosion behaviour. This is the foundation upon which targeted inverse design ("design me an alloy with this pitting resistance and that passivation stability") can now be built. The original NSGAN framework (Li and Birbilis, 2024) demonstrated this capability for mechanical properties; our corrosion work extends the reach of the compass into degradation science. The calibration is underway; the next step is to set the destination.
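The augmentation logic itself is simple to render in miniature. In the sketch below, a known analytic response stands in for the experimental corrosion measurements and a perturbation sampler stands in for the trained GAN; in the actual study the GAN generated complete composition-property records, so every element here is an illustrative assumption about the pipeline's shape, not its content.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

def response(X):
    """Hidden ground truth standing in for a corrosion metric such as i_corr."""
    return X[:, 0] * np.exp(-X[:, 1]) + 0.05 * rng.normal(size=len(X))

# Severely sparse "experimental" data: a few scattered clearings in the fog.
X_real = rng.uniform(0, 1, size=(30, 2))
y_real = response(X_real)

# Stand-in for the GAN: synthetic points sampled near the sparse data,
# mimicking a learned manifold of plausible composition-electrolyte pairs.
X_syn = np.clip(X_real[rng.integers(30, size=2000)]
                + 0.15 * rng.normal(size=(2000, 2)), 0, 1)
y_syn = response(X_syn)

X_test = rng.uniform(0, 1, size=(500, 2))
y_test = response(X_test)

for label, X, y in [("experimental only", X_real, y_real),
                    ("augmented", np.vstack([X_real, X_syn]),
                     np.concatenate([y_real, y_syn]))]:
    forward = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
    mae = np.abs(forward.predict(X_test) - y_test).mean()
    print(f"{label}: mean absolute error = {mae:.4f}")
```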
VII. The Engine of Novelty: Adversarial Swarms and Physical Grounding
To fully realise the potential of inverse design, we must introduce conflict. This is achieved through adversarial swarms, a mechanism that mirrors the scientific method itself and instantiates the generative process at the level of multi-agent systems.
In frameworks like SPARKS (Ghafarollahi and Buehler, 2025b) and SciAgents (Ghafarollahi and Buehler, 2025a), specialised agents emerge with distinct roles. The Explainer Agent attempts to compress observed data into a universal principle or scaling law, minimising complexity while maximising explanatory power. In the context of materials, it might formulate a unified theory for how specific process conditions dictate corrosion rates. The Breaker Agent actively hunts for anomalies, generating hypotheses or searching for data that violates the Explainer's current principle.
This recursive loop of explanation and negation drives the system beyond its training distribution and into the realm of genuine novelty. It recapitulates the negation and integration stages of the generative process at the system level. Individual agents, perhaps limited in their own capability, combine to produce a collective structure that exceeds the sum of its parts. The swarm differentiates without special training, with agents self-organising into critics, planners, and verifiers. Related work on bio-inspired material intelligence (Marom and Buehler, 2025) and de novo protein generation (Ni, Kaplan and Buehler, 2024) further demonstrates that generative adversarial architectures can traverse creative spaces far beyond their training distributions when properly structured.
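The structure of this loop, though not the agents themselves, can be sketched in a few lines. Below, the Explainer is reduced to a polynomial fit and the Breaker to largest-residual search over a pool of candidate experiments; the published frameworks use LLM agents for both roles, so this is a schematic of the explanation-negation dynamic, not of their implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def nature(x):
    """Hidden ground truth standing in for an unknown physical law."""
    return np.sin(3 * x) + 0.05 * rng.normal(size=np.shape(x))

def explainer(x, y, deg=3):
    """Compress observations into the simplest polynomial that fits them."""
    return np.polynomial.Polynomial.fit(x, y, deg)

def breaker(principle, x_pool, y_pool, k=3):
    """Hunt for the k candidate points the current principle explains worst."""
    residuals = np.abs(principle(x_pool) - y_pool)
    return np.argsort(residuals)[-k:]

# Initial observations occupy a narrow clearing of the domain.
x_obs = rng.uniform(0.0, 0.5, size=10)
y_obs = nature(x_obs)
x_pool = np.linspace(0.0, 2.0, 200)    # candidate "experiments" in the fog
y_pool = nature(x_pool)

for round_ in range(6):
    principle = explainer(x_obs, y_obs)
    worst = breaker(principle, x_pool, y_pool)
    err = np.abs(principle(x_pool) - y_pool).mean()
    print(f"round {round_}: mean error over the full domain = {err:.3f}")
    # Integration: the anomalies the Breaker found become new observations.
    x_obs = np.concatenate([x_obs, x_pool[worst]])
    y_obs = np.concatenate([y_obs, y_pool[worst]])
    x_pool, y_pool = np.delete(x_pool, worst), np.delete(y_pool, worst)
```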
It is worth noting the convergence between adversarial swarms and Kambhampati's LLM-Modulo framework. Both recognise that the generative capacity of LLMs must be paired with adversarial or verificatory processes to produce reliable novelty. In LLM-Modulo, external solvers and critics check the LLM's proposals. In adversarial swarms, the Breaker agent serves an analogous function from within the system itself. The architectural insight is the same: generation and verification must be separated and set in tension. My contribution is to observe that this tension recapitulates the fractal generative process at the system level.
Yet even this recapitulation has a boundary. No matter how sophisticated the swarm, an AI scientist cannot exist solely in a world of probabilities. As Feynman (1986) warned in his appendix to the Challenger disaster report, "for a successful technology, reality must take precedence over public relations, for Nature cannot be fooled". The fog of delayed-decidable tasks must eventually clear into the grid of physical reality.
This necessitates physical grounding: the integration of AI agent swarms with automated laboratories, additive manufacturing, and robotic testing facilities. Consider the complete loop for a new alloy. The AI swarm proposes a composition via inverse design. A robotic foundry fabricates the material. Automated testing evaluates stress-corrosion cracking. The results feed back into the swarm, converting a delayed-decidable hypothesis into a verified data point. This loop is currently aspirational. Neither the NSGAN papers nor the SciAgents framework has yet closed it end-to-end with physical fabrication. But the individual components exist and their integration is a matter of engineering, not of principle. Each cycle, once realised, would refine the model's latent understanding of the generative process that produced the material, building over time something we might describe as structural familiarity: increasingly accurate internal representations of the domain's causal dynamics, enabling more targeted and physically grounded proposals with each iteration (Zhang, Kraska and Khattab, 2025).
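Because the loop is aspirational, the sketch below renders only its control structure, with every laboratory interface stubbed out; all names, compositions, and numbers are placeholders rather than any existing system's API.

```python
import random
from dataclasses import dataclass

random.seed(3)

@dataclass
class Candidate:
    composition: dict            # element -> weight fraction
    predicted_E_pit: float       # the swarm's delayed-decidable hypothesis

def swarm_propose() -> Candidate:
    """Stub for the agent swarm's inverse-design proposal."""
    comp = {"Fe": random.uniform(0.2, 0.4), "Cr": random.uniform(0.1, 0.3)}
    comp["Ni"] = round(1.0 - sum(comp.values()), 3)
    return Candidate(comp, predicted_E_pit=random.uniform(0.1, 0.6))

def fabricate(c: Candidate) -> str:
    """Stub for the robotic foundry; returns a specimen identifier."""
    return f"specimen-{hash(frozenset(c.composition.items())) % 10_000}"

def test_corrosion(specimen: str) -> float:
    """Stub for automated electrochemical testing (measured E_pit, volts)."""
    return random.gauss(0.35, 0.05)

verified = []                    # each cycle converts hypothesis -> data point
for cycle in range(5):
    candidate = swarm_propose()
    specimen = fabricate(candidate)
    measured = test_corrosion(specimen)
    verified.append((candidate.composition, measured))
    gap = abs(candidate.predicted_E_pit - measured)
    print(f"cycle {cycle}: {specimen} prediction gap = {gap:.3f} V")
```

Each pass through the loop converts one delayed-decidable hypothesis into a verified data point; in a realised system, the swarm's surrogates would be recalibrated on the accumulating record before the next proposal.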
VIII. The Coding Agent Frontier
The same dynamics are unfolding in software engineering, providing independent confirmation that the generative process operates across domains.
Coding tasks blend the verifiable and the non-verifiable in revealing ways. Compilation and test suites provide immediate feedback (the GPS city), but architectural decisions, design trade-offs, and the interpretation of ambiguous requirements inhabit the fog. A developer reading a vague GitHub issue, navigating a million-line codebase they have never seen, and proposing a coherent multi-file fix is performing a delayed-decidable task par excellence.
By 2026, AI coding agents are traversing this fog with increasing confidence. On the SWE-bench Verified benchmark (Jimenez et al., 2024), which requires agents to resolve real-world GitHub issues end-to-end (reading repositories, diagnosing problems, generating patches, and passing existing test suites), top systems, including Claude in agentic configurations and specialised frameworks, achieve resolution rates in the range of 72-80% as of early 2026. These agents succeed not by memorising solutions but by deploying the same metafeatures that human developers rely on: decomposition, analogy to similar codebases, anticipation of error patterns, and adherence to the conventions, standards, and naming semantics that humans deliberately designed for mutual comprehension.
The parallel to material science is direct. Human software engineers organised systems using conventions, protocols, and semantic structures chosen to be legible to other humans. This collective choice-making architecture, encoded in millions of repositories, pull requests, code reviews, and documentation pages, is precisely what LLMs have compressed. They have learned not just what code looks like but how humans decide what code to write: the scenario-driven reasoning that practitioners like Kaner (2003) have long articulated as the core of effective software quality assurance.
Yet the limits deserve honest treatment. Newer benchmarks that test long-horizon, evolving tasks, where the codebase changes across a sequence of dependent issues, show sharp performance drops, with resolution rates falling to roughly a third or less of standard benchmark performance. This is not merely a note that the frontier remains open. Rather, it constitutes genuine evidence that the current depth of internalisation is insufficient for the most demanding scales of the generative process: sustained architectural reasoning, long-range dependency tracking, and the kind of radical design invention that characterises senior engineering. Van Rooij et al. (2024) would argue that some of these tasks may be computationally intractable, meaning that no amount of scaling will close the gap without fundamentally new approaches. One interpretation is that the limitation is quantitative (more training, larger context, better search will suffice). Another, which we cannot yet rule out, is that some aspects of long-horizon creative planning may require capacities (sustained working memory, embodied feedback loops, or forms of abstraction not yet captured) that current architectures do not possess. The honest position is that this remains an open empirical question, not a settled one in either direction.
IX. The Encoding Gap: The Human Cost of Automated Discovery
This revolution returns us to the critical risk that has shadowed the argument from the beginning.
As these systems become capable of traversing the friction of discovery (solving the proofs, characterising the process conditions, designing the alloys, writing the code) humans risk receiving the artifact without the understanding. We get the destination without the journey.
I have argued elsewhere (Vasa, 2025) that causal power, defined here as the capacity to predict the consequences of changes to a system, to adapt the system when conditions shift, and to extend it to genuinely novel contexts, comes from the internalised theory of why something works, not merely from the working object. If we rely entirely on the simulacrum, the AI's faithful map of the territory, we become operators of instruments rather than architects of intelligence. We risk a future in which we possess powerful technology but lack the deep, encoded understanding required to adapt, extend, or repair it when the model drifts beyond its training distribution.
The risks compound in ways that are difficult to perceive until it is too late. When things fail, when the error lies outside the boundary of what the machine can compensate for, we may find ourselves in a world where humans, too, can no longer compensate. This is the dystopia that speculative fiction has long anticipated: not the malevolent machine of popular imagination, but the atrophied humanity of Pixar's WALL-E, where comfort and capability have diverged so thoroughly that the species can no longer function without its prosthetics. The philosophical tradition echoes this warning. Heidegger (1977) cautioned against technology as Enframing (Gestell), a mode of revealing that reduces everything, including human beings, to standing reserve for optimisation. Ellul (1964) warned of technique as a self-reinforcing system that subordinates all human values to efficiency. In both cases, the danger is not the machine itself but the atrophy of the human capacities that the machine replaces.
The connection to the generative process is precise. If humans no longer perform the five stages, if we no longer perceive anomalies, select trajectories, express and externalise our reasoning, negate our own assumptions, and integrate feedback, then we lose not merely a skill but the substance of understanding itself. Causal power is not a static possession; it is the product of ongoing generative traversal. The encoding gap is not a gap in information; it is a gap in the generative capacity that produces understanding.
The solution is not to reject the machine but to cultivate positive friction. We must use these agents as scaffolds to reach higher levels of abstraction, not as replacements for the climb itself. The role of the human shifts. From the one who solves the equation to the one who sets the reward function. From the one who characterises the material to the one who interprets the anomalies found by the Breaker agent. From the one who writes the code to the one who evaluates whether the code embodies the right values and serves the right ends. The generative process must remain a human process, even as machines participate in it at ever-higher scales.
Conclusion: The World Building Machine
We stand at a juncture. We can continue to use AI to retrieve standard answers, fishing in the stagnant pond of known truths via forward models that only interpolate existing literature. Or we can build the World Building Machine.
This machine, composed of adversarial agent swarms, inverse design frameworks like NSGAN, physically grounded simulation loops, and coding agents that traverse million-line codebases, does not merely predict the future. It materialises it. It takes the delayed-decidable complexities of material science, theoretical physics, and software engineering and navigates the fog to find solutions that human effort alone could not reach.
The unifying insight is the structurally recursive generative process. LLMs have not just memorised the products of human thought; they have internalised the scale-invariant architecture by which human thought produces novelty. This is why delayed-decidable tasks yield to them: the fog was never structureless, only structured at a higher order than our traditional tools could perceive.
The simulacrum provides the compass. It is up to us to walk the territory, ensuring we close the encoding gap not by racing the machine but by using it as a scaffold for deeper human understanding.
Glossary of Key Terms
Pareto Frontier (Pareto Front): In multi-objective optimisation, the set of solutions for which no single objective can be improved without degrading at least one other objective. In materials design, this represents the boundary of achievable trade-offs, for example the maximum tensile strength achievable at each level of ductility. Solutions on the Pareto frontier are termed "non-dominated" because no other known solution is superior across all objectives simultaneously.
Delayed-Decidable: A task that currently lacks immediate objective feedback for evaluation but is not in principle undecidable. The feedback exists but arrives on a longer timescale, through expert review, experimental testing, market outcomes, or downstream consequences. Distinguished from genuinely undecidable tasks (halting problem, Gödelian limits, irreducibly subjective judgements) where no feedback will ever resolve the question.
Metafeatures: Higher-order regularities in human reasoning and decision-making that transcend specific domains. These correspond to the regulative principles of productive inquiry identified across traditions: parsimony (Peirce's economy of hypotheses), falsifiability (Popper's demand for negation), coherence (logical consistency with prior knowledge), and analogy (structural transfer from known to unknown domains).
Fractal Generative Process: The structurally recursive cycle of perception, selection, negation, expression, and integration that characterises human inquiry at every level, from word choice to paradigm shifts. "Fractal" is used in the qualitative sense of self-similar recurrence across scales, not in the strict mathematical sense of measurable fractal dimension. The process has been independently described in different vocabularies by Peirce (abduction-deduction-induction), Popper (conjecture-refutation), and others.
Encoding Gap: The loss of internalised causal understanding that occurs when humans receive AI-generated artifacts (solutions, designs, code) without traversing the generative process that produced them. The gap is not in information but in causal power: the capacity to predict consequences of changes, adapt systems to novel conditions, and extend them to new contexts.
References
Chollet, F. (2019) 'On the measure of intelligence', arXiv preprint arXiv:1911.01547.
Ellul, J. (1964) The Technological Society. Translated by J. Wilkinson. New York: Vintage Books.
Feynman, R.P. (1986) 'Appendix F: Personal observations on the reliability of the Shuttle', in Report of the Presidential Commission on the Space Shuttle Challenger Accident. Washington, D.C.: U.S. Government Publishing Office.
Ghafarollahi, A. and Buehler, M.J. (2025a) 'SciAgents: Automating scientific discovery through bioinspired multi-agent intelligent graph reasoning', Advanced Materials, 2413523. doi: 10.1002/adma.202413523.
Ghafarollahi, A. and Buehler, M.J. (2025b) 'SPARKS: Multi-agent artificial intelligence model discovers protein design principles', arXiv preprint arXiv:2504.19017.
Ghorbani, M., Li, Z., Pasquini, A., Vasa, R. and Birbilis, N. (2025) 'Supervised machine learning for corrosion assessment of multi-principal element alloys using experimental and generative datasets', npj Materials Degradation, 9, p. 155. doi: 10.1038/s41529-025-00700-9.
Heidegger, M. (1977) The Question Concerning Technology and Other Essays. Translated by W. Lovitt. New York: Harper & Row.
Hsu, S.D.H. (2025) 'Relativistic covariance and nonlinear quantum mechanics: Tomonaga-Schwinger analysis', Physics Letters B, 862, p. 140053.
Im, S., Oh, C., Fang, Z. and Li, S. (2026) 'How do transformers learn to associate tokens: Gradient leading terms bring mechanistic interpretability', in Proceedings of the Fourteenth International Conference on Learning Representations.
Jimenez, C.E., Yang, J., Wettig, A., Yao, S., Pei, K., Press, O. and Narasimhan, K. (2024) 'SWE-bench: Can language models resolve real-world GitHub issues?', in Proceedings of the Twelfth International Conference on Learning Representations.
Kambhampati, S., Valmeekam, K., Guan, L., Verma, M., Stechly, K., Bhambri, S., Saldyt, L.P. and Murthy, A.B. (2024) 'Position: LLMs can't plan, but can help planning in LLM-Modulo frameworks', in Proceedings of the 41st International Conference on Machine Learning. PMLR 235, pp. 22895-22907.
Kaner, C. (2003) An introduction to scenario testing. Florida Institute of Technology. Available at: https://kaner.com/pdfs/ScenarioIntroVer4.pdf.
Li, Z. and Birbilis, N. (2024) 'NSGAN: A non-dominant sorting optimisation-based generative adversarial design framework for alloy discovery', npj Computational Materials, 10, p. 81. doi: 10.1038/s41524-024-01294-7.
Marcus, G. (2020) 'The next decade in AI: Four steps towards robust artificial intelligence', arXiv preprint arXiv:2002.06177.
Marom, L. and Buehler, M.J. (2025) 'Frontiers of biological material intelligence', MRS Bulletin, 50, pp. 1492-1504. doi: 10.1557/s43577-025-00987-8.
Mitchell, M. (2019) Artificial Intelligence: A Guide for Thinking Humans. New York: Farrar, Straus and Giroux.
Ni, B., Kaplan, D.L. and Buehler, M.J. (2024) 'ForceGen: End-to-end de novo protein generation based on nonlinear mechanical unfolding responses using a language diffusion model', Science Advances, 10(6), p. eadl4000. doi: 10.1126/sciadv.adl4000.
Oppenheim, J. (2025) 'Nonlinear quantum mechanics and artificial intelligence', arXiv preprint arXiv:2512.07809.
Pasquini, A. (2024) 'Robust and lightweight modeling of IoT network behaviors from raw traffic packets', IEEE Transactions on Machine Learning in Communications and Networking, 3, pp. 98-116. doi: 10.1109/tmlcn.2024.3517613.
Peirce, C.S. (1992) The Essential Peirce: Selected Philosophical Writings, vol. 1. Edited by N. Houser and C. Kloesel. Bloomington: Indiana University Press.
Popper, K.R. (1963) Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
Shanahan, M. (2024) 'Talking about large language models', Communications of the ACM, 67(2), pp. 68-79. doi: 10.1145/3624724.
Tao, T. (2025) 'The story of Erdos problem #1026', What's new [Blog]. Available at: https://terrytao.wordpress.com.
van Rooij, I., Guest, O., Adolfi, F.G., de Haan, R., Kolokolova, A. and Rich, P. (2024) 'Reclaiming AI as a theoretical tool for cognitive science', Computational Brain and Behavior, 7(4), pp. 616-636. doi: 10.1007/s42113-024-00217-5.
Vasa, R. (2025) 'The encoding gap', AI Engineering [Online]. Available at: https://aiengineering.leaflet.pub/3mdyrl5eiek22.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L. and Polosukhin, I. (2017) 'Attention is all you need', in Advances in Neural Information Processing Systems, 30, pp. 5998-6008.
Zhang, A.L., Kraska, T. and Khattab, O. (2025) 'Recursive language models', arXiv preprint arXiv:2512.24601.