What fully automated firms will look like
Everyone is sleeping on the *collective* advantages AIs will have, which have nothing to do with raw IQ: they can be copied, distilled, merged, scaled, and evolved in ways humans simply can't.
Developed in collaboration with Ege Erdil, Tamay Besiroglu, and Gavin Leech.
Epistemic status: Shooting-the-shit; 25% sure this roughly describes how firms of AGIs will actually work.
Even people who expect human-level AI soon are still seriously underestimating how different the world will look when we have it. Most people are anchoring on how smart they expect individual models to be. (i.e. they’re asking themselves “What would the world be like if everyone had a very smart assistant who could work 24/7?”.)
Everyone is sleeping on the collective advantages AIs will have, which have nothing to do with raw IQ but rather with the fact that they are digital—they can be copied, distilled, merged, scaled, and evolved in ways humans simply can’t.
What would a fully automated company look like - with all the workers and all the managers as AIs? I claim that such AI firms will grow, coordinate, improve, and be selected for at unprecedented speed.
This essay is not a prediction of what GPT-5 will be doing, nor about emulations of existing humans. Rather, I'm trying to imagine what the world will look like once we actually have AGIs - the descendants of LLMs that have gotten so good that they can do basically anything any human can do.
Copy
Currently, firms are extremely bottlenecked in hiring and training talent. But if your talent is an AI, you can copy it a stupid number of times. What if Google had a million AI software engineers? Not untrained amorphous "workers," but the AGI equivalents of Jeff Dean and Noam Shazeer, with all their skills, judgment, and tacit knowledge intact.
This ability to turn capital into compute and compute into equivalents of your top talent is a fundamental transformation. Since you can amortize the training cost across thousands of copies, you could sensibly give these AIs ever-deeper expertise - PhDs in every relevant field, decades of business case studies, intimate knowledge of every system and codebase the company relies on.
The power of copying extends beyond individuals to entire teams. Small previously successful teams (think PayPal Mafia, early SpaceX, the Traitorous Eight) can be replicated to tackle a thousand different projects simultaneously. It's not just about replicating star individuals, but entire configurations of complementary skills that are known to work well together. The unit of replication becomes whatever collection of talent has proven most effective.
Copying will transform management even more radically than labor. It will enable a level of micromanagement that makes founder mode look quaint. Human Sundar simply doesn't have the bandwidth to directly oversee 200,000 employees, hundreds of products, and millions of customers. But AI Sundar’s bandwidth is capped only by the number of TPUs you give him to run on. All of Google’s 30,000 middle managers can be replaced with AI Sundar copies. Copies of AI Sundar can craft every product’s strategy, review every pull request, answer every customer service message, and handle all negotiations - everything flowing from a single coherent vision.
There is no principal-agent problem wherein employees are optimizing for something other than Google’s bottom line, or simply lack the judgment needed to decide what matters most.1 A company of Google's scale can run much more as the product of a single mind—the articulation of one thesis—than is possible now.2
Merge
Think about how limited a CEO's knowledge is today. How much does Sundar Pichai really know about what's happening across Google's vast empire? He gets filtered reports and dashboards, attends key meetings, and reads strategic summaries. But he can't possibly absorb the full context of every product launch, every customer interaction, every technical decision made across hundreds of teams. His mental model of Google is necessarily incomplete.
Now imagine mega-Sundar – the central AI that will direct our future AI firm. Just as Tesla's Full Self-Driving model can learn from the driving records of millions of drivers, mega-Sundar might learn from everything seen by the distilled Sundars - every customer conversation, every engineering decision, every market response.
Unlike Tesla’s FSD, this doesn’t have to be a naive process of gradient updating and averaging. Mega-Sundar will absorb knowledge far more efficiently – through explicit summaries, shared latent representations, or even surgical modification of the weights to encode specific insights.
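To make one version of this concrete, here is a deliberately naive sketch of "reabsorption" as weight-space merging, in the spirit of today's model-soup / task-arithmetic methods. Everything in it (the `Weights` type, the `reabsorb` function, the uniform averaging, the `alpha` knob) is an illustrative assumption, not a claim about how it will actually be done:

```python
# One crude mechanical picture of "reabsorbing what the copies learned":
# average the weight deltas each specialized copy accumulated on the job and
# fold them back into the shared base model, in the spirit of today's
# model-soup / task-arithmetic methods. The essay imagines far less lossy
# mechanisms (summaries, shared latents, weight surgery); this is just the
# simplest version that exists now. All names here are hypothetical.

from typing import Dict, List
import numpy as np

Weights = Dict[str, np.ndarray]  # parameter name -> tensor

def reabsorb(base: Weights, copies: List[Weights], alpha: float = 1.0) -> Weights:
    """Fold the average of each copy's learned delta back into the base model."""
    merged = {}
    for name, w in base.items():
        deltas = [copy[name] - w for copy in copies]  # what each copy learned on the job
        merged[name] = w + alpha * np.mean(deltas, axis=0)
    return merged
```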
The boundary between different AI instances starts to blur. Mega-Sundar will constantly be spawning specialized distilled copies and reabsorbing what they’ve learned on their own. Models will communicate directly through latent representations, similar to how the hundreds of different layers in a neural network like GPT-4 already interact.3 So, approximately no miscommunication, ever again. The relationship between mega-Sundar and its specialized copies will mirror what we're already seeing with techniques like speculative decoding – where a smaller model makes initial predictions that a larger model verifies and refines.
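To make the speculative-decoding analogy concrete, here is a minimal sketch in Python. The `draft_model` and `mega_model` objects (and their `sample_next` / `agrees_with` methods) are hypothetical stand-ins, and the greedy acceptance rule is a simplification of the actual probabilistic algorithm:

```python
# Minimal sketch of the speculative-decoding pattern: a cheap "distilled"
# model drafts a few tokens ahead, and the expensive "mega" model checks
# them, keeping the prefix it agrees with. `context` is a list of tokens;
# the model objects and their methods are hypothetical stand-ins.

def speculative_step(mega_model, draft_model, context, k=4):
    """One decoding round: draft k tokens cheaply, then verify with the big model."""
    # 1. The small model proposes k tokens autoregressively (cheap).
    draft = []
    for _ in range(k):
        draft.append(draft_model.sample_next(context + draft))

    # 2. The big model checks the drafted continuation (a single expensive
    #    pass in practice), keeping tokens until the first disagreement.
    accepted = []
    for i, token in enumerate(draft):
        if mega_model.agrees_with(context + draft[:i], token):
            accepted.append(token)
        else:
            accepted.append(mega_model.sample_next(context + draft[:i]))
            break
    return accepted
```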
Merging will be a step change in how organizations can accumulate and apply knowledge. Humanity's great advantage has been social learning – our ability to pass knowledge across generations and build upon it. But human social learning has a terrible handicap: biological brains don't allow information to be copy-pasted. So you need to spend years (and in many cases decades) teaching people what they need to know in order to do their job. Look at how top achievers in field after field are getting older and older, maybe because it takes longer to reach the frontier of accumulated knowledge. Or consider how clustering talent in cities and top firms produces such outsized benefits, simply because it enables slightly better knowledge flow between smart people.
Future AI firms will accelerate this cultural evolution through two key advantages: massive population size and perfect knowledge transfer. With millions of AI workers, automated firms get many more opportunities to produce innovations and improvements, whether from lucky mistakes, deliberate experiments, de-novo inventions, or some combination.
As Joseph Henrich explains in The WEIRDest People in the World,
cumulative cultural evolution—including innovation—is fundamentally a social and cultural process that turns societies into collective brains. Human societies vary in their innovativeness due in large part to the differences in the fluidity with which information diffuses through a population of engaged minds and across generations
Historical data going back thousands of years suggest that population size is the key input for how fast your society comes up with more ideas. AI firms will have population sizes that are orders of magnitude larger than today's biggest companies - and each AI will be able to perfectly mind meld with every other, from the bottom to the top of the org chart.
AI firms will look from the outside like a unified intelligence that can instantly propagate ideas across the organization, preserving their full fidelity and context. Every bit of tacit knowledge from millions of copies gets perfectly preserved, shared, and given due consideration.
Scale
The cost to have an AI take a given role will become just the amount of compute the AI consumes. This will change our understanding of which roles are scarce.
Future AI firms won’t be constrained by what's scarce or abundant in human skill distributions – they can optimize for whatever abilities are most valuable. Want Jeff Dean-level engineering talent? Cool: once you’ve got one, the marginal copy costs pennies. Need a thousand world-class researchers? Just spin them up. The limiting factor isn't finding or training rare talent – it's just compute.
So what becomes expensive in this world? Roles which justify massive amounts of test-time compute. The CEO function is perhaps the clearest example. Would it be worth it for Google to spend $100 billion annually on inference compute for mega-Sundar? Sure! Just consider what this buys you: millions of subjective hours of strategic planning, Monte Carlo simulations of different five-year trajectories, deep analysis of every line of code and technical system, and exhaustive scenario planning.
Imagine mega-Sundar contemplating: "How would the FTC respond if we acquired eBay to challenge Amazon? Let me simulate the next three years of market dynamics... Ah, I see the likely outcome. I have five minutes of datacenter time left – let me evaluate 1,000 alternative strategies."
The more valuable the decisions, the more compute you'll want to throw at them. A single strategic insight from mega-Sundar could be worth billions. An overlooked risk could cost tens of billions. However many billions Google should optimally spend on inference for mega-Sundar, it's certainly more than one.
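For a sense of scale, here is a hedged back-of-envelope in Python. Every number in it - the inference price, the tokens-per-hour figure - is a made-up illustrative assumption, not a real price or measurement:

```python
# Back-of-envelope for "is $100B of inference worth it?": how much subjective
# thinking time might that budget buy? All figures are illustrative assumptions.

annual_budget_usd = 100e9             # hypothetical mega-Sundar inference budget
usd_per_million_tokens = 10.0         # assumed inference price
tokens_per_subjective_hour = 50_000   # assumed tokens for an hour of "thought"

tokens_per_year = annual_budget_usd / usd_per_million_tokens * 1e6
subjective_hours = tokens_per_year / tokens_per_subjective_hour
subjective_years = subjective_hours / (24 * 365)

print(f"{tokens_per_year:.1e} tokens per year")                  # 1.0e16
print(f"{subjective_hours:.1e} subjective hours of planning")    # 2.0e11
print(f"~{subjective_years:,.0f} subjective years per calendar year")
```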
Distillation
What might distilled copies of AI Sundar (or AI Jeff) be like? Obviously, it makes sense for them to be highly specialized, especially when you can amortize the cost of that domain-specific knowledge across all copies. You can give each distilled data center operator a deep technical understanding of every component in the cluster, for example.
I suspect you’ll see a lot of specialization in function, tacit knowledge, and complex skills, because they seem expensive to sustain in terms of parameter count. But I think the different models might share a lot more factual knowledge than you might expect. It’s true that plumber-GPT doesn’t need to know much about the standard model in physics, nor does physicist-GPT need to know why the drain is leaking. But the cost of storing raw information is so unbelievably cheap (and it’s only decreasing) that Llama-7B already knows more about the standard model and leaky drains than any non-expert. If human-level intelligence is more than 1 trillion parameters, is it so much of an imposition to keep around what will, at the limit, be much less than 7 billion parameters to have most known facts right in your model? (Another helpful data point here is that “Good and Featured” Wikitext is less than 5 MB. I don’t see why all future models—except the esoteric ones, the digital equivalent of tardigrades—wouldn’t at least have Wikitext down.)4
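The storage arithmetic behind that intuition, with illustrative sizes (the parameter counts and bytes-per-parameter figure are assumptions, not measurements):

```python
# Rough arithmetic behind "facts are cheap to keep around".
# Model sizes and precision are illustrative assumptions.

wikitext_subset_mb = 5          # "Good and Featured" Wikitext, from the essay
small_model_params = 7e9        # a Llama-7B-class model
frontier_params = 1e12          # assumed ~1T-parameter "human-level" model
bytes_per_param = 2             # fp16/bf16 weights

small_model_gb = small_model_params * bytes_per_param / 1e9   # ~14 GB
frontier_gb = frontier_params * bytes_per_param / 1e9         # ~2,000 GB

print(f"Wikitext subset: {wikitext_subset_mb} MB of raw text")
print(f"7B-class model: ~{small_model_gb:.0f} GB of weights")
print(f"A 7B 'facts module' is ~{100 * small_model_gb / frontier_gb:.1f}% "
      f"of a {frontier_gb:.0f} GB frontier model")
```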
Evolve
The most profound difference between AI firms and human firms will be their evolvability. As Gwern Branwen observes:
Why do we not see exceptional corporations clone themselves and take over all market segments? Why don’t corporations evolve such that all corporations or businesses are now the hyper-efficient descendants of a single ur-corporation 50 years ago, all other corporations having gone extinct in bankruptcy or been acquired? Why is it so hard for corporations to keep their “culture” intact and retain their youthful lean efficiency, or, if avoiding “aging” is impossible, why [not] copy themselves or otherwise reproduce to create new corporations like themselves?
His answer:
Corporations certainly undergo selection for kinds of fitness, and do vary a lot. The problem seems to be that corporations cannot replicate themselves … Corporations are made of people, not interchangeable, easily copied widgets or strands of DNA … The corporation may not even be able to “replicate” itself over time, leading to scleroticism and aging.
The scale of difference between currently existing human firms and fully automated firms will be like the gulf in complexity between prokaryotes and eukaryotes. Prokaryotes like bacteria are not only remarkably simple, but have barely changed over their 3-billion-year history, whereas eukaryotes rapidly scaled up in complexity and gave rise to all the other astonishing organisms with trillions of cells working together in tight coordination.5
This evolvability is also the key difference between AI and human firms. As Gwern points out, human firms simply cannot replicate themselves effectively - they're made of people, not code that can be copied. They can't clone their culture, their institutional knowledge, or their operational excellence. AI firms can.6
If you think human Elon is especially gifted at creating hardware companies, you simply can’t spin up 100 Elons, have them each take on a different vertical, and give them each $100 million in seed money. As much of a micromanager as Elon might be, he’s still limited by his single human form. But AI Elon can have copies of himself design the batteries, be the car mechanic at the dealership, and so on. And if Elon isn’t the best person for the job, the person who is can also be replicated, to create the template for a new descendant organization.
Takeover
So then the question becomes: If you can create Mr. Meeseeks for any task you need, why would you ever pay some markup to another firm when you can just replicate them internally instead? Why would there even be other firms? Will the first firm that figures out how to automate everything just form a conglomerate that takes over the entire economy?
Ronald Coase’s theory of the firm tells us that companies exist to reduce transaction costs (so that you don’t have to go rehire all your employees and rent a new office every morning on the free market). His theory states that the lower the intra-firm transaction costs, the larger the firms will grow. Five hundred years ago, it was practically impossible to coordinate knowledge work across thousands of people and dozens of offices. So you didn’t get very big firms. Now you can spin up an arbitrarily large Slack channel or HR database, so firms can get much bigger.
AI firms will have dramatically lower transaction costs than human firms. It’s hard to beat shooting lossless latent representations to an exact copy of you for communication efficiency! So firms will probably become much larger than they are now.
But it’s not inevitable that this ends with one gigafirm which consumes the entire economy. As Gwern explains in his essay, any internal planning system needs to be grounded in some kind of outer "loss function" - a ground truth measure of success. In a market economy, this comes from profits and losses.
Internal planning can be much more efficient than market competition in the short run, but it needs to be constrained by some slower but unbiased outer feedback loop. A company that grows too large risks having its internal optimization diverge from market realities.
That said, the balance may shift as AI systems improve. As corporations become more "software-like" - with perfect replication of successful components and faster feedback loops - we may see much larger and more efficient firms than were previously possible.
The market continues to serve as the grounding outer loop. How does the firm convert trillions of tokens of data from customers, markets, news, etc. every day into future plans, new products, and the like? Does the board make all the decisions politburo-style and use $10 billion of inference to run Monte Carlo tree search on different one-year plans? Or do you run some kind of evolutionary process on different departments, giving them more capital and compute/labor based on their performance?
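As a toy illustration of that second option, here is what a crude performance-proportional reallocation rule might look like. The function, the floor parameter, and all the numbers are hypothetical:

```python
# A toy version of the "evolutionary" option: periodically reallocate the
# firm's compute budget toward departments (each a cluster of AI copies) in
# proportion to the value they produced, with a small floor so no department
# is starved before it can prove itself. Purely illustrative.

def reallocate_compute(budget, performance, floor_share=0.02):
    """Split `budget` across departments in proportion to `performance` scores."""
    floor = floor_share * budget
    spendable = budget - floor * len(performance)
    total_score = sum(performance.values())
    return {
        dept: floor + spendable * score / total_score
        for dept, score in performance.items()
    }

allocation = reallocate_compute(
    budget=1e9,  # $1B of annual compute, illustrative
    performance={"search": 40.0, "cloud": 30.0, "ads": 25.0, "moonshots": 5.0},
)
print(allocation)
```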
These are all what we would today call “culture.” Markets facilitate an evolutionary process which selects not only goods and services, but the institutions that are best at turning the world into valuable goods and services. I think this will continue.
…except for the biggie: the intensified principal-agent problem between the CEO-workers (who suddenly know everything) and the shareholders on the outside (who know as much as they know now, i.e. roughly nothing).
Since labor is trivial to copy and spin up, the value of intellectual property will go up. The essence of the firm basically becomes its intellectual property. GM can poach as many Tesla engineers as it wants (or, in our hypothetical, clone someone with equivalent skills). But short of outright IP theft, they can’t get the FSD model or the millions of hours of driving data it was trained on. If firms no longer have a moat in labor, their moat will be this kind of industry-specific knowledge and data.
An intriguing thought, in response to things like Meta’s Continuous Chain of Thought models, is that actually we will want to keep communication happening in discrete tokens, since this is a kind of autoencoding - a bottleneck which makes you usefully compress information into actual insight when you communicate. (Leaders have to force people to keep things simple, since in general people want to sound smart instead.) But these tokens don’t need to relate to natural language necessarily.
There’s a biological analogy here, too. Every cell in your body stores each base pair of your 3 GB of DNA, despite the fact that only 10 to 20 percent of the protein-coding regions are expressed in any particular cell, and only 1 percent of your DNA is protein-coding in the first place, with much of the rest long thought of as “junk”. Information is apparently just so cheap to store that there’s been little selective pressure against this redundancy and waste.
Big changes in evolvability don’t have to involve much added complexity or mutation. Two reasons bacteria evolved so little compared to eukaryotes are that they perform energy production on their surface, which scales poorly with cell volume (eukaryotes solved this with mitochondria), and that their genome size is constrained (bacteria compete with each other by replicating faster, and replication is mainly bottlenecked on DNA length).
If I'm being frank, I think this is ridiculous, and scary. Efficiency is not everything. I'm probably going to sound like a Luddite here, but AI should be used for scientific and material advancement, not to create mega-Sundars. The Silicon Valley tech narrative of efficiency seems dystopian. It also ignores how much of the modern world is a result of consumer demand - so many innovations exist because there was consumer demand for them. When you replace a Google engineer, you might improve Google's bottom line, but you also remove a source of demand for the rest of the economy. And since Google depends on advertising revenue from other corporations, Google is indirectly hurting its own bottom line too. Sure, a world with AI firms might distribute the surplus back to human beings, but where does demand even come from in this world? And even if we can find a way, do we want this? Work provides meaning and lessons; it grounds us as human beings. And the world you are describing does not have a place for most humans. Maybe this happens, but I hope it doesn't. We need norms to deal with this.
As a shareholder of Alphabet, I welcome mega-Sundar and his copies. But it really raises the question: What are most of us going to do with our time?
Are governments going to be ready to deal with such transformative change? It's kind of insane how little discussion there is on creating frameworks for different AGI scenarios. I guess everyone is too busy building AI and/or implementing it in their day-to-day lives.