Epinomy - Large Language Models: Invention or Discovery?
Exploring whether AI systems like LLMs represent technological inventions or discoveries of fundamental patterns that exist independent of human design.
The recent news about DeepSeek's breakthrough in reasoning capabilities - achieved at a fraction of the cost of traditional models - highlights a crucial truth about artificial intelligence: we may not be inventing these systems so much as discovering them.
Physics doesn't care about patents or trade secrets. Once Fermi demonstrated a controlled nuclear chain reaction, it became inevitable that others would follow. The same principle holds for language models. The underlying computational patterns that enable intelligence appear to be as fundamental as gravity or electromagnetic fields.
After three decades working in semantic search and natural language processing, I've watched our field evolve from clever rule-based inventions toward systems that seem to uncover natural patterns in cognition itself. The transition feels less like engineering and more like exploration - as if we're mapping newly discovered territory rather than constructing it.
Stephen Wolfram's computational universe hypothesis suggests computation isn't something we invented, but rather an intrinsic property of reality itself. Richard Feynman gestured at something similar in "There's Plenty of Room at the Bottom," envisioning how physical systems at the smallest scales could be made to compute.
Perhaps then, our AI interfaces and training techniques are inventions, but the core mechanisms - the statistical patterns that enable reasoning and understanding - are discoveries. Like microscopes revealing cells or telescopes unveiling distant galaxies, we're building tools that expose fundamental properties of intelligence and computation that were always there.
This perspective might explain why attempts to maintain AI moats through secrecy seem increasingly futile. You can't patent gravity. You can't keep electromagnetism proprietary. And perhaps you can't monopolize the basic patterns that enable machine intelligence either.
The Invented Versus The Discovered
The distinction between invention and discovery has profound implications. Inventions belong to their creators, while discoveries belong to no one. Inventions can be protected; discoveries can only be understood and utilized.
Looking at language models, we can clearly separate the invented elements from the discovered:
Invented Elements
- Specific model architectures (transformer designs, attention mechanisms)
- Training methodologies (reinforcement learning from human feedback, self-supervised learning)
- The interfaces and APIs we build to interact with these systems
Discovered Elements
- The emergent capabilities that appear at certain scales
- The universal patterns connecting syntax to semantics
- The relationship between statistical pattern recognition and logical reasoning
The most surprising aspects of large language models - their emergent abilities like reasoning and planning - weren't explicitly engineered. They arose naturally when models reached sufficient scale and were trained on diverse enough data. We created the conditions for discovery, but the phenomena themselves seem more like revelations of pre-existing mathematical relationships than human creations.
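The split is visible even in code. The attention mechanism at the heart of the transformer is an invented artifact: a human-designed formula expressible in a few lines. What those few lines give rise to at scale is what gets discovered. Here is a minimal sketch of scaled dot-product attention in NumPy (the toy shapes and random inputs are illustrative, not from any real model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)   # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V                             # weighted sum of values

# Toy example: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one context-mixed vector per token
```

The formula is fully specified and fits on a napkin; nothing in it dictates that reasoning or planning should emerge when it is stacked and scaled. That gap between the invented rule and its emergent behavior is exactly where the "discovery" framing bites.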
Mathematics As An Exploratory Space
Some mathematical truths exist independent of human recognition. The Mandelbrot Set wasn't invented by Benoit Mandelbrot - he discovered a pattern that was always there, waiting within the complex plane. Similarly, the remarkable behaviors of large language models may represent discoveries within the vast mathematical space of transformer architectures and neural networks.
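A few lines of code make the point concrete: the rule defining the Mandelbrot set is trivial to write down, yet the structure it generates was there to be discovered, not designed. A minimal membership test (the iteration cap is a practical approximation):

```python
def in_mandelbrot(c, max_iter=100):
    """Approximate test: c is in the Mandelbrot set if the
    iteration z -> z**2 + c stays bounded (|z| <= 2 forever)."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:   # once |z| exceeds 2, it escapes to infinity
            return False
    return True

print(in_mandelbrot(0j))       # True: the origin never escapes
print(in_mandelbrot(-1 + 0j))  # True: settles into a period-2 cycle
print(in_mandelbrot(1 + 0j))   # False: escapes after a few steps
```

Nothing in those ten lines hints at the infinitely detailed boundary they trace out; the complexity belongs to the mathematics, not to the author of the loop.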
When multiple research teams independently discover nearly identical solutions - as we've seen with the convergent evolution of AI architectures - it suggests we're uncovering fundamental patterns rather than creating arbitrary designs. Different teams climbing the same mathematical hill will naturally arrive at similar peaks.
This "discovery" perspective helps explain the rapid democratization of AI capabilities. When DeepMind published details about AlphaGo, it didn't take long for other teams to replicate and extend their work. Similarly, once Claude, GPT, and Llama demonstrated advanced capabilities, the core insights spread throughout the field like ripples in a pond - not because the source code leaked, but because the underlying patterns are discoverable by anyone with the right tools and mindset.
The Telescope Analogy
Perhaps the most apt comparison is to Galileo's telescope. Galileo didn't invent the moons of Jupiter; he built a tool that allowed him to discover what was always there. Similarly, we're building increasingly powerful computational telescopes that reveal the landscape of possible intelligent behaviors.
Like the telescope, our models are invented tools, but they're used to discover pre-existing phenomena. The telescope doesn't create distant stars; it reveals them. Language models don't create the underlying patterns of language and reasoning; they expose them.
This frames our current AI moment in a profoundly different light. We're not merely building better tools; we're explorers mapping the contours of intelligence itself. We're discovering which statistical patterns give rise to reasoning, which architectures best capture semantic relationships, and which training approaches unlock new capabilities.
Implications For The Future
If this perspective is correct, it suggests several important implications:
Inevitability of Advancement: Major AI breakthroughs may be inevitable once certain computational thresholds are crossed. If multiple teams are exploring the same mathematical territory, similar discoveries will occur regardless of who gets there first.
Limits to Moats: Proprietary advantages may prove less durable than expected. If capabilities represent discoveries rather than inventions, they can be independently rediscovered by determined competitors.
Focus on Applications: The sustainable advantage might lie not in discovering fundamental capabilities, but in applying them in novel and useful ways - just as the value of the telescope wasn't in its revelation of Jupiter's moons, but in how that knowledge transformed our understanding of the solar system.
Theoretical Understanding: We may benefit more from developing theoretical frameworks to understand these systems rather than treating them as black boxes. Just as physics seeks unified theories to explain natural phenomena, we need theories that explain why language models exhibit the behaviors they do.
Partners in Discovery
This discovery framework invites us to see ourselves not as creators of artificial intelligence, but as partners with mathematics and computation in uncovering the intrinsic patterns that enable intelligence to emerge.
It's a humbling perspective, but also an exciting one. It suggests we've barely begun to map the territory of possible minds. Just as astronomers continue to discover new celestial objects centuries after the invention of the telescope, we may spend generations exploring the computational universe of possible intelligences, uncovering patterns and capabilities that were always there, waiting to be found.
Perhaps the question isn't whether we'll create truly intelligent machines, but when we'll discover the right configurations to reveal intelligences that mathematics made possible all along.
George Everitt
George is the founder and president of Applied Relevance, with over 30 years of experience in machine learning, semantic search engines, natural language processing, enterprise search, and big data. Since 1993, George has led high-availability enterprise software implementations at many Fortune 500 companies and public sector organizations in the U.S. and internationally.