Diligence Questions Every AI Investor Should Ask


Cognition

Cognition has 15 engineers. Each one runs roughly five instances of Devin, the company's autonomous coding agent, simultaneously, assigning tasks through Slack, Linear, and GitHub the same way they'd hand work to a remote contractor. Scott Wu disclosed this on Lenny's Podcast in May 2025. It didn't come up in the $1 billion Series D announcement, which pushed Cognition's valuation to $26 billion and described, in the usual terms, market opportunity and product traction. The podcast host asked the right follow-up question. 

A standard term sheet discloses the round size, the valuation, the lead investor, and a sentence on what the company builds. None of those items records how many AI agents each engineer is supervising, or whether the company has even considered that question. Two companies can raise identical Series A rounds with identical headcounts, one with a clear agent-supervision model and one where agents are running informally with no accountability structure, and the heads of terms look the same for both.



Ratio

Bricklayers to Architects

Each Cognition engineer doesn't supervise one agent the way a manager might supervise one report. They run a small team of Devins in parallel. As of Wu's Lenny's Podcast interview in May 2025, those agents were producing about 25 percent of Cognition's pull requests, with the company targeting 50 percent by the end of that year. A more recent account, reported by Tech Times in late May 2026, put Devin's merge rate across Cognition's broader customer base at 67 percent of submitted pull requests, up from 34 percent the year before.

Image Source: Scalable

Devin's planning interface. Instead of assigning work to individual engineers, companies can now assign tasks directly to software agents that plan, execute and submit work for review.


Those two numbers, the ratio and the merge rate, are doing different jobs. The ratio tells you about org design: how many active processes a single human is responsible for supervising at once. The merge rate tells you about output quality: how much of what those processes produce is good enough to ship. A company could have a high ratio and a low merge rate, which would mean agents producing large volumes of work that rarely gets approved. Cognition's combination, a meaningful ratio alongside a rising merge rate, is the more interesting case, because it suggests the supervision model is working rather than just generating volume.
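To make the distinction concrete, here is a minimal Python sketch of how the two numbers could be computed from one snapshot of a team. The class name, field names, and pull request counts are hypothetical and chosen for illustration. Only the 15 engineers and roughly five agents each come from the figures above.

```python
from dataclasses import dataclass


@dataclass
class AgentSnapshot:
    """One point-in-time reading of a team's agent usage (hypothetical fields)."""
    engineers: int
    concurrent_agents: int  # agents running in parallel across the whole team
    prs_submitted: int      # pull requests opened by agents
    prs_merged: int         # of those, how many were approved and merged

    def agents_per_engineer(self) -> float:
        # Org-design signal: supervision load carried by each human.
        return self.concurrent_agents / self.engineers

    def merge_rate(self) -> float:
        # Output-quality signal: share of agent pull requests that shipped.
        if self.prs_submitted == 0:
            return 0.0
        return self.prs_merged / self.prs_submitted


# 15 engineers x 5 agents follows the article; the PR counts are invented
# to show the high-ratio, low-merge-rate case described above.
busy = AgentSnapshot(engineers=15, concurrent_agents=75,
                     prs_submitted=400, prs_merged=60)
print(busy.agents_per_engineer())  # 5.0
print(busy.merge_rate())           # 0.15
```

A team like this one would look impressive on the ratio alone. The second number is what shows that most of the volume never ships.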

Wu describes the shift as engineers moving from "bricklayers to architects," with humans focused on high-level design while agents handle implementation. At Cognition, that framing is already a hiring requirement. In practice, the job means managing a small group of systems that write code: evaluating which outputs are worth keeping, catching the ones that aren't, and deciding when to hand something back to a human rather than let the agent keep iterating. Many engineering interviews aren't screening for any of it, which is part of why Sierra rebuilt its entire interview process.

Each Devin session runs in an isolated sandbox, a fresh virtual machine with its own shell, browser, and code editor. Nothing reaches a live production environment without a human explicitly approving it first. Five agents running in parallel means five streams of proposed work to review before anything ships, each gated at the same point a colleague's draft would be, with higher volume but the same basic motion. Whether that holds at 1-to-20, or eventually 1-to-50, depends on how much the approval step can be compressed without losing the accountability it currently provides. Once enough companies are running agents at scale, that is probably a more consequential number to track than the ratio itself.

Image Source: Databricks

The emerging role many organisations have not yet formally defined: humans supervising, coordinating and validating the work of multiple AI agents rather than completing every task themselves.


Growth

The headcount curve

Legora went from 40 employees to roughly 400 in under twelve months, according to CEO Max Junestrand in a recorded interview with Pigment co-founder Eléonore Crespo, expanding into more than 40 legal markets along the way and hitting 300 percent net revenue retention in 2025. A separate Y Combinator interview puts the company closer to 500 people and past $100 million in annual recurring revenue by mid-2026, spread across offices in San Francisco, Chicago, New York, London, Stockholm, Germany, India, and Australia.

Junestrand has said the market expansion was the manageable part. Keeping the organisation consistent while headcount grew tenfold in a year was harder. The hiring filter is talented and passionate people only, with "no assholes" allowed in. In a separate interview with Artificial Lawyer, Junestrand described favouring "missionaries over mercenaries," a phrase borrowed from venture capitalist John Doerr, adding that every candidate is personally interviewed using what he calls "brutal questions" to test whether they want the demanding version of the job rather than an easier one elsewhere.

Image Source: McKinsey

The next generation of org charts will need to show more than people. As AI agents take on larger portions of operational work, investors will increasingly need to understand who supervises them, where decisions are made, and how accountability flows through the organisation.

According to Business Insider, staff have dinner in the office at 8pm as a matter of course, the company has closed deals on New Year's Eve, and at a recent Christmas dinner staff were served mulled wine next to a live sales dashboard that people kept checking throughout the evening. Legora competes directly with Harvey, a San Francisco-based rival valued at $8 billion, more than triple Legora's valuation at the time of the last raise. The pace Junestrand describes is a direct response to that.

Cognition and Legora are both described as AI-native companies. In the same eighteen-month window, Cognition's human headcount barely moved while output expanded through agents. Legora hired aggressively because its legal services still require a geographically distributed sales and delivery presence. For Cognition, knowing there are 15 engineers tells you almost nothing about the team’s range of capabilities.


Diligence deviation

Investors evaluating AI companies in 2026 are increasingly asking about AI governance, data residency, and how AI might disrupt the target's own market. Bain's diligence framework classifies AI's impact as a revolution, transformation, or augmentation risk. For AI-agent startups specifically, investors are beginning to ask for task completion rates and human-in-the-loop percentages. What none of these frameworks asks is how many agents each engineer supervises, and who is accountable when one gets something wrong.

Revenue-per-employee, the metric most acquirers reach for instinctively, starts to break down once a meaningful share of output isn't being produced by an employee at all. A 15-person engineering team running agents alongside their own work looks identical on a headcount table to a 15-person team that isn't.


Accountability

Wu’s framing of engineers moving from bricklayers to architects doesn't necessarily extend to what happens when one of the bricklayers makes a mistake and the architect was watching four others at the time.

In a traditional code review, there is a name on the pull request and a name on the approval. When something goes wrong, the postmortem has somewhere to start. At Cognition, the same engineer reviewing five parallel streams of agent output is making five simultaneous judgments about whether each output is trustworthy enough to ship. That's a different cognitive task than reviewing a colleague's work, and the failure mode is different too: when it goes wrong, the flaw is systemic rather than one reviewer's slip.

This isn't an argument against the model, but there is a strong case for refining what the supervision role really requires, and designing hiring, training, and incident review processes around that.


Diligence

A simple checklist

Agent-to-engineer ratio. 

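How many agents does each engineer supervise, and how has that changed over the last two quarters? Cognition's answer: 15 engineers, roughly five Devins each, with agents producing about 25 percent of the company's pull requests as of May 2025. The 67 percent figure reported in May 2026 is a different measure, Devin's merge rate across customers, so ask for both numbers separately.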

Accountability. 

When an agent produces a problem, who owns it? Ideally this will be a specific person or role rather than a department.

Headcount in context. 

What share of output is coming from agents versus people? A 15-person team running agents at scale can be producing as much as a 60-person team that isn't. The headcount figure alone won't tell you which you're looking at.
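The headcount comparison above can be worked through as simple arithmetic. The sketch below is illustrative only: it assumes output per human is unchanged by the presence of agents and that agent and human output can be counted in the same units, neither of which the source confirms. The function names and the 75 percent agent share are hypothetical; that share is used because, under these assumptions, it reproduces the 15-versus-60 comparison.

```python
def agent_share(agent_units: float, human_units: float) -> float:
    """Fraction of total output produced by agents."""
    total = agent_units + human_units
    if total == 0:
        raise ValueError("no output recorded")
    return agent_units / total


def headcount_equivalent(headcount: int, share: float) -> float:
    """Size of an agent-free team producing the same total output.

    Assumes each human produces the same amount with or without agents,
    so humans account for (1 - share) of the total.
    """
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    return headcount / (1 - share)


# Hypothetical: agents produce 75 of every 100 units of shipped work.
share = agent_share(agent_units=75, human_units=25)
print(share)                            # 0.75
print(headcount_equivalent(15, share))  # 60.0
```

The point of the exercise is the question it forces in diligence: a company that cannot state its agent share of output cannot say which kind of 15-person team it is.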


Next

Standardised metrics 

In May 2025, Wu described one engineer to five Devins. Just 10 months later, in March 2026, he said Cognition engineers had stopped writing code entirely. Standard M&A diligence is a snapshot taken at signing, but Skadden's January 2026 guidance on AI acquisitions notes that AI model performance can change significantly in a short span of time. Buyers are increasingly structuring earnouts tied to AI performance metrics rather than relying on point-in-time assessments. McKinsey's January 2026 M&A report goes further, predicting that within two years diligence will need to become a continuous and connected part of the deal cycle rather than a discrete phase. The agent-to-engineer ratio needs to be tracked across quarters, not assessed once in a data room.
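Tracking across quarters can be as simple as keeping a dated series and looking at the change between readings. The Python sketch below shows one way to do that. The function name is hypothetical, and apart from the starting ratio of five agents per engineer, the quarterly values are invented for illustration and are not reported figures.

```python
def quarter_over_quarter(readings: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Return (quarter, change versus the previous quarter) for each reading after the first."""
    changes = []
    for (_, previous), (label, current) in zip(readings, readings[1:]):
        changes.append((label, current - previous))
    return changes


# Agents per engineer by quarter. Only the first value reflects the
# one-to-five ratio described in May 2025; the rest are made up.
readings = [("2025-Q2", 5.0), ("2025-Q3", 6.0), ("2025-Q4", 8.0)]

for quarter, change in quarter_over_quarter(readings):
    print(quarter, change)
# 2025-Q3 1.0
# 2025-Q4 2.0
```

A single data-room figure would show only one of these rows. The series shows whether supervision load is climbing, and how quickly.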



Cara Eli

Cara is a London-based writer and qualified HR pro who has spent the last decade working with global brands like Amazon and Richemont. She now writes about the future of work.
