An AI-native agency ships a brand-install, not a team. Everything else is org chart.

May 22, 2026

by Ygor Fonseca, Founder & Systems Lead

Two services firms run the same kind of engagement for the same kind of brand. Both have a senior lead on point, a junior on production, an AI system in the loop. Both have a Slack channel, a shared workspace, a weekly call. Both publish to the brand's social and email queues. Both invoice the same amount.

Three months in, the first firm hands the brand a folder. Inside the folder is a plain text file — a couple of pages long, named for the brand — plus a directory of smaller files alongside it. The lead on the engagement opens that folder on Monday morning, the AI system reads it, and the work starts. Every meeting, every decision, every brand voice exception, every learning from a campaign that didn't land — all of it is in the folder. The second firm hands the brand a year-end summary deck. The deck describes what they did. The folder, if it ever existed, lived in the head of the senior person on the account and walked out the door when she did.

This post is about why that folder is the actual deliverable in an AI-native services engagement — and what's inside it when the engagement is built right. We call the method Codified Engagement: every decision, rule, constraint, and learning from the work gets written into a brand-install that the AI system reads on every task. The brand-install is the agency.

What goes in an ideal brand-install

A brand-install is not a brief and not a wiki. It's a small, opinionated, machine-readable bundle that says: here is what this brand is trying to do, here are the rules the AI must follow when it works on this brand, here are the tasks the AI knows how to handle, here is what we have learned that the AI needs to remember, and here are the things the AI is not allowed to do without a human stopping it first.

The structure has five components. They go in the same order every time.
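As a rough picture of the fixed ordering, here is a minimal Python sketch. The class name, field names, and the idea of holding the install in memory are all assumptions made for illustration; the install itself is a plain text file plus a directory of smaller files.

```python
from dataclasses import dataclass, field, fields


@dataclass
class BrandInstall:
    """Hypothetical in-memory view of one install; field order is read order."""
    working_hypothesis: str                                       # three dated sentences
    locked_rules: list[str] = field(default_factory=list)         # constants read on every task
    instructions_index: list[str] = field(default_factory=list)   # pointers to task files
    memory_index: list[str] = field(default_factory=list)         # pointers to memory files
    kill_switch: list[str] = field(default_factory=list)          # actions gated on a human

    def sections(self) -> list[tuple[str, object]]:
        """Return (name, content) pairs in the order the fields are declared."""
        return [(f.name, getattr(self, f.name)) for f in fields(self)]


install = BrandInstall(
    working_hypothesis="(three sentences, written by the lead, dated)",
    kill_switch=["publish", "overwrite the install"],
)
section_names = [name for name, _ in install.sections()]
```

Declaring the order once, in the type, is one way to keep "same order every time" from depending on whoever assembles the file that week.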

1. The working hypothesis. Three sentences at the top of the install. What this brand is trying to do, why now, and what the team thinks is blocking it. The hypothesis updates weekly. A hypothetical example: a direct-to-consumer supplement brand at low-eight-figure revenue is trying to break a higher revenue ceiling without raising what it costs to acquire each new customer on paid social; the lead's current read is that bottom-of-funnel paid is saturated and the retention loop is leaking. Three sentences. Specific to this brand. Written by a human after every weekly review and stamped with the date.

The hypothesis is the first thing the AI reads on every task. It is the answer to the question "what are we even trying to do for this brand right now?" Without it, every task starts cold and the AI's output drifts toward generic. With it, every task starts pointed at the same goal.

2. The locked rules. Brand voice constraints, off-limits topics, regulatory boundaries, the founder's standing preferences, the language conventions the brand uses for its own categories. These don't change daily. They are the constants the AI reads on every task before doing anything else.

The locked rules are the part nobody publishes. Architecturally, this is the brand voice in machine-readable form, accumulated over time, written precisely enough that the AI can apply it to a draft without a human re-explaining the brand on every brief. The shape is shareable: a section in the install, grouped by what each rule covers, with every rule traceable to the moment it was set. The catalog itself is earned over months of work, and it is exactly the part that walks out the door with the senior person on the account when the agency doesn't run on an install.

3. The instructions index. A list of the smaller files alongside the install, each of which tells the AI how to handle one kind of recurring task. We call these instructions files. One per recurring task — write a competitive brief, draft a launch email, propose a headline, run a pre-publish check on a long-form post, pull a weekly content performance pass. Each file has a clear trigger (when to use it), a scope (what it covers and what it doesn't), the structural moves the task uses, and examples of past instances that landed.

The index is just pointers — a few lines per entry, the file name, the trigger sentence, a one-sentence scope note. The full instructions live in their own files in the directory alongside. The index lets the AI route the task: this is a launch email, so load the launch-email instructions file; this is a competitive brief, so load the competitive-brief instructions file. The instructions compound. A new engagement adds a handful of new ones every month. After six months, the directory is the agency's living methodology, written for an AI to run and a human to review.
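The routing step above can be sketched in a few lines of Python. This is an illustration under assumptions: the file names, the `IndexEntry` type, and the substring match are all hypothetical, and in practice the AI system reads the trigger sentences and picks the file itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexEntry:
    file_name: str  # hypothetical file in the directory alongside the install
    trigger: str    # phrase that signals when to use the file
    scope: str      # one-sentence note on what the file covers


INDEX = [
    IndexEntry("launch-email.md", "launch email", "Drafting a launch email."),
    IndexEntry("competitive-brief.md", "competitive brief", "Writing a competitive brief."),
]


def route(task_description: str, index: list[IndexEntry]) -> Optional[IndexEntry]:
    """Return the first entry whose trigger phrase appears in the task, else None."""
    text = task_description.lower()
    for entry in index:
        if entry.trigger in text:
            return entry
    return None  # no matching instructions file; a human decides what happens next
```

Returning `None` instead of guessing keeps an unrecognized task visible, which is usually the signal that a new instructions file is due.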

Andrej Karpathy, the AI researcher, former Tesla AI Director, and OpenAI founding team member, has the cleanest engineering name for what this is. In his Software 3.0 framing, the building blocks of AI-native systems are prompts, context, tools, examples, memory, and instructions. The brand-install is, in his vocabulary, a Software 3.0 install: the working hypothesis is the context, the locked rules are the instructions, the instructions index is the tools list, the examples are the past instances kept inside each instructions file, the memory section (next) is the memory, and the prompts come from the senior lead's questions when she sits down to do the work. Each of the six primitives has a home in the install, though the mapping is not strictly one-to-one: the install has five components, and the prompts arrive from outside it.

Garry Tan, YC's CEO, has been publishing a related architecture at individual scale: a thin runtime he calls a harness that reads from a directory of detailed instructions files. His phrasing is "Fat skills. Fat code. Thin harness." The shape is the same as Codified Engagement; the difference is scope. Tan runs the architecture for his own work. We run the architecture per brand we serve.

4. The memory index. The memory section is for things the team has learned about this brand that aren't derivable from the brand's website, the brand's data, or the brand's product itself. The reason a particular kind of subject line keeps converting. The founder's standing position on a topic that comes up across pieces. The result of an experiment that nobody else on the team remembers. The reason a tone the brand once used got walked back after a customer complaint.

Like the instructions, memory entries live in their own small files and the index just lists them — name, one-sentence summary, date set. Each memory entry has a reason field. The reason is the part that matters. A rule without a reason gets followed past the point where it makes sense. A rule with a reason can be re-evaluated when the context shifts.

The memory compounds the way the instructions do. A new memory gets added every time the team learns something they don't want to relearn. Six months in, the memory section is the running ledger of human judgment about this brand. It's the part that turns the install from a brief into an asset.
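A memory entry with its index line and mandatory reason might look like the following Python sketch. The type name, the field names, and the pipe-separated index format are assumptions for illustration, not a format the install prescribes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MemoryEntry:
    name: str
    summary: str    # the one-sentence summary shown in the index
    date_set: date
    reason: str     # why the team believes this; lets the entry be re-evaluated later

    def __post_init__(self) -> None:
        # An entry without a reason gets followed past the point where it makes sense.
        if not self.reason.strip():
            raise ValueError(f"memory entry {self.name!r} needs a reason")

    def index_line(self) -> str:
        """One line for the memory index: name, summary, date set."""
        return f"{self.name} | {self.summary} | set {self.date_set.isoformat()}"
```

Rejecting an empty reason at creation time is a small design choice: it makes the reason field impossible to skip on a busy Friday.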

5. The kill switch. The last section of the install is the list of things the AI is not allowed to do without a human approving first. Never publish. Never send a campaign to more than a small list. Never spend over a specified threshold on paid. Never reply to a customer in the brand's voice without a human approving. Never change the locked rules. Never overwrite the install.

The kill switch is the discipline that makes the rest of the install deployable. Anthropic, in its published guidance on agent design, recommends using the simplest approach that solves the problem and reaching for systems in which the model dynamically directs its own process (what they call agents) only when the work structurally requires it. The kill switch is where that discipline lives inside the install itself, not in a policy doc that nobody loads. The AI can do plenty of work on its own. The decisions the brand pays the agency to get right are the ones that pass through a human first. (How the kill switch works once an agent can act on a brand's accounts, including scoped access, an audit log, approval gates, and the one-command revoke the brand holds, is its own deep dive.)
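The gate described above can be sketched as a single check that runs before any action. Everything named here is hypothetical: the action kinds, the two thresholds, and the choice to treat unknown actions as gated are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str            # e.g. "draft", "publish", "send_campaign", "paid_spend"
    amount: float = 0.0  # recipients for a send, currency units for spend


# Hypothetical limits; a real install would state its own.
MAX_LIST_SIZE = 50
MAX_PAID_SPEND = 100.0

AUTONOMOUS = {"draft", "research", "summarize"}  # work the AI may do alone
THRESHOLDS = {"send_campaign": MAX_LIST_SIZE, "paid_spend": MAX_PAID_SPEND}


def needs_human_approval(action: Action) -> bool:
    """True unless the action is explicitly autonomous or under its threshold."""
    if action.kind in AUTONOMOUS:
        return False
    if action.kind in THRESHOLDS:
        return action.amount > THRESHOLDS[action.kind]
    # Publishing, customer replies, rule edits, and anything unrecognized stop here.
    return True
```

Defaulting to approval for anything not listed means a new kind of action starts gated and has to be deliberately opened up, not the other way around.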

How the install gets sharper

Every client conversation updates the install. The Friday review pass reads it, the Monday brief loads it, the Wednesday sprint uses it as the source of truth. If an important decision happens outside the install (verbally, in Slack, in a meeting, in a thread that nobody writes back into the install), the install needs to be updated before the next task runs, so the next task starts from current context. If the update lags or something is missing, the kill switch is the backstop: the guardrails hold the line on what can ship without a human, so a stale install slows the work rather than letting a bad outcome through. The discipline is the same one a high-functioning engineering team applies to a repository: if the change isn't in the install, it isn't real yet.

The mechanism is what we covered yesterday in the install-order companion piece: the AI proposes, the human approves. After a senior review, the AI reads the review notes and surfaces proposed edits to the locked rules, proposed new memory entries, proposed updates to instructions files. The human approves, modifies, or rejects each one. The approved edits land in the install with a date and a reason. The rejected ones get a one-line note on why. The proposal-and-approval cycle is what turns scattered judgment into a compounding catalog.
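The proposal-and-approval cycle can be sketched as a small record-keeping function. The names (`Proposal`, `InstallLog`, `review`) and the three decision strings are hypothetical; the sketch only shows the bookkeeping, with the human's decision passed in as an argument.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    target: str  # "locked_rules", "memory", or "instructions"
    text: str    # the edit the AI proposes
    reason: str  # why the edit is being proposed


@dataclass
class InstallLog:
    applied: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def review(log: InstallLog, proposal: Proposal, decision: str, today: date,
           note: str = "", revised_text: Optional[str] = None) -> None:
    """Record a human decision: 'approve', 'modify', or 'reject'."""
    if decision == "reject":
        if not note:
            raise ValueError("a rejection needs a one-line note on why")
        log.rejected.append({"target": proposal.target, "text": proposal.text, "note": note})
        return
    if decision == "modify":
        if revised_text is None:
            raise ValueError("a modification needs the revised text")
        text = revised_text
    elif decision == "approve":
        text = proposal.text
    else:
        raise ValueError(f"unknown decision: {decision!r}")
    # Approved edits land with a date and a reason.
    log.applied.append({"target": proposal.target, "text": text,
                        "date": today.isoformat(), "reason": proposal.reason})
```

Nothing reaches `applied` without an explicit decision string, which mirrors the rule that the AI proposes and the human approves.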

The Friday pass is the weekly self-improvement loop. The AI proposes the edits that would have caught this cycle's corrections earlier; approved edits land before Monday; the next week starts from a sharper install. That is what self-improving looks like inside the install — not autonomous, just compounding.

Where the install breaks

Three failure modes show up most often.

The first is no write-back. The lead reviews work, makes corrections in the moment, and never updates the install. The same mistake recurs on the next task because the rules and the memory didn't change. The fix is procedural: no correction counts until the install is updated. If the team can't get to it in the moment, it goes on a list and gets handled on the Friday pass.

The second is the install becoming the deck. Teams under deadline reach for the install when a buyer asks for a sample of the work, copy a few sections into a slide, and then start writing for that slide instead of writing for the work. The install gets prettier, the instructions get vaguer, and eventually nobody can run an actual task from it. The fix is that the install is read by the AI, not by buyers. Buyers can see the shape — what the components are, how the install is structured, what it does — but the catalog stays where it works.

The third is the kill switch quietly getting smaller. Under pressure, the team raises a threshold, removes a constraint, lets the AI ship one thing without approval to save time, and then doesn't put the constraint back. Six weeks later the gate isn't doing what it was set up to do, and a piece of brand work goes out that shouldn't have. The fix is to log every kill-switch change with a reason and a date, and have the monthly review pass read those changes specifically.
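The change log in that fix can be sketched as follows. This is an illustration under assumptions: the record type and function name are hypothetical, and the sketch assumes every constraint is a numeric limit where a higher number means a looser gate.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class KillSwitchChange:
    constraint: str   # hypothetical constraint name, e.g. "paid_spend"
    old_limit: float
    new_limit: float
    reason: str
    changed_on: date


def loosened_since(changes: list[KillSwitchChange], since: date) -> list[KillSwitchChange]:
    """Changes on or after `since` that raised a limit, oldest first."""
    hits = [c for c in changes if c.changed_on >= since and c.new_limit > c.old_limit]
    return sorted(hits, key=lambda c: c.changed_on)
```

A review pass that reads only the loosened entries has a short list to work through, and each one carries the reason someone gave at the time.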

The argument

The argument is direct: in an AI-native services engagement, the brand-install is the agency. Everything else — the team, the Slack channel, the deck, the QBR — is the organization chart around it. The install is the part that compounds, the part the AI actually reads, the part that determines whether the work the brand pays for gets sharper over time or stays at whatever shape it had on day one.

The reason this is publishable as a method and not just an internal practice is that the shape is the marketing. A buyer reading this post can recognize what's missing from a vendor that pitches "AI-native delivery" but doesn't have an answer to "where does the AI read from?" If the answer is "we use AI inside our tools," the install is invisible and the catalog isn't compounding. If the answer is "every engagement has a brand-install with these five components — here are the parts we can show you," the install is auditable and the catalog is doing the work.

The five components are the part we publish. The contents of any specific brand-install — the rules, the memories, the instructions — stay where they were earned. The shape is shareable; the catalog is per-brand. That is the version of the moat we will defend.

The brand-install is the agency. The team is the part that fills it.

Companion pieces: the install-order view covers the workflow-layer axis (sensor / policy / tool / quality gate / learning). Birkett's five operating layers cover the content-surface axis (Strategy / Enablement / Execution / Feedback / Repurposing). Together with this piece's per-engagement artifact view, the three axes describe what an AI-native services engagement actually ships. "The instructions library is the agency's growing methodology" picks up the component this post flagged as the part that compounds, the index of small task files, and shows how a growing library becomes a method that outlives any one engagement. "What you own when an AI engagement ends" takes the fourth component, the memory index, and shows why the memory layer is the one part of the install the brand keeps when the engagement is over. Where the brand-install is the artifact the AI reads to run the work, "What AI says your company is" is its outward twin: the artifact the market and the machine read to know you.
