The agentic AI starter kit: minimum viable setup for software teams

Written by Ionut Lomer | May 20, 2026 7:52:25 AM

Part 2 of 2. This article follows "Claude is not a chatbot: how to use it on real software projects".

Agentic AI is like a new machine. A powerful one. But nobody shipped a user manual with it, and every company in the room is currently trying to figure out which button does what.

That is the honest state of things in 2026. Anthropic is shipping new features faster than most teams can absorb them. Reading the documentation feels like walking into a store where every shelf has something new and there is no map. The instinct is to explore everything. That instinct is the problem.

The teams getting real value from agentic AI right now are not the ones chasing every new capability. They are the ones who picked one small thing, understood it completely, and built from there. This article is the practical entry point: how to start small, how to brief agents like new hires, how to structure the work on disk, and where the safe high-value territory actually sits today.

1. The mindset: start with what you already understand

The best advice I ever got about learning to code was: build a to-do list. Not because a to-do list is interesting. Because it contains every fundamental concept you need to understand before building anything else. State, input, output, persistence. Master those in a simple context and the complex ones become extensions, not new subjects.

Agentic AI has a to-do list equivalent, and most people skip it because it does not look impressive enough. That is the mistake.

Before thinking about autonomous agents, parallel execution, or swarm workflows, ask yourself a simpler question: what does my team actually look like? A PM who oversees delivery, tracks blockers, and communicates progress. A developer who writes code against a defined spec. A tester who validates output before it ships. That structure is not new. You already understand it. Start there.

One agent. One role. One task at a time. That is the to-do list of agentic AI. It is also where you learn everything that matters.

A single agent working sequentially on a defined task teaches you more about how agentic systems actually behave than any demo of a fully autonomous swarm. You learn what the agent needs before it starts. Where it gets confused. How to brief it so the output is useful on the first pass. You cannot skip this stage and build something reliable. Nobody can.

2. The foundation: before the agents start, the vision has to exist

Every project I build with agentic AI starts the same way, and it does not start with an agent. It starts with a conversation.

I take the idea, sit with Claude, and work through it out loud. Who is this for? What problem does it actually solve? What does done look like? That back and forth is not setup. It is the most important work in the entire project, because what comes out of it is the product vision document.

The product vision is the living document that feeds everything. Before any agent touches a task, it reads the product vision. The document tells the agent what it is building, why it exists, who it is for, and what success looks like. Without it, agents optimize for task completion with no understanding of what the task is actually in service of.

With it, every agent in the system is aligned before the first line of code is written. The vision is not documentation. It is the shared brain.

From the product vision, the workflow unfolds in a sequence that mirrors exactly how a real product team operates:

2.1 Idea to product vision

Back and forth conversation with Claude to define the problem, the audience, and what done looks like. The output is a single document that everything else feeds from.

2.2 PM agent slices the work

The PM agent reads the vision and breaks it into epics, features, and tasks. The backlog is structured before a developer touches anything.

2.3 Developer agent picks up tasks

The developer agent works from the backlog. It knows the stack, the conventions, and the definition of done because those were defined before it started.

2.4 Tester agent validates output

The tester agent checks the work against the spec, not against a generic checklist. Because the spec came from the vision, the validation is calibrated to the actual product.

2.5 Human review before anything ships

Always. A senior engineer, an architect, a tester who runs penetration tests. The agents build. The humans decide what ships.
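
The sequence above can be sketched as a small state machine. This is a minimal illustration in Python, not a real orchestration tool: the stage names, the role strings, and the `Task` class are all assumptions made for the example. The one rule it encodes is the one that matters: only a human can move work to done.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names, chosen for this sketch.
    BACKLOG = "backlog"
    IN_PROGRESS = "in-progress"
    IN_TESTING = "in-testing"
    AWAITING_REVIEW = "awaiting-human-review"
    DONE = "done"


# For each stage: the next stage, and the only role allowed to make that move.
TRANSITIONS = {
    Stage.BACKLOG: (Stage.IN_PROGRESS, "developer"),
    Stage.IN_PROGRESS: (Stage.IN_TESTING, "developer"),
    Stage.IN_TESTING: (Stage.AWAITING_REVIEW, "tester"),
    Stage.AWAITING_REVIEW: (Stage.DONE, "human"),
}


@dataclass
class Task:
    title: str
    stage: Stage = Stage.BACKLOG
    history: list = field(default_factory=list)

    def advance(self, actor: str) -> None:
        """Move the task one stage forward, if this actor is allowed to."""
        if self.stage not in TRANSITIONS:
            raise ValueError(f"{self.title!r} is already done")
        next_stage, required_actor = TRANSITIONS[self.stage]
        if actor != required_actor:
            raise PermissionError(
                f"{actor!r} cannot move a task out of {self.stage.value}; "
                f"only {required_actor!r} can"
            )
        self.history.append((self.stage, actor))
        self.stage = next_stage


if __name__ == "__main__":
    task = Task("Add login form")
    task.advance("developer")   # picked up from the backlog
    task.advance("developer")   # handed to testing
    task.advance("tester")      # validated against the spec
    try:
        task.advance("developer")  # an agent tries to ship it
    except PermissionError as error:
        print(error)
    task.advance("human")       # the gate that matters
    print(task.stage.value)
```

Keeping the gate in the transition table, rather than in a prompt, means an agent cannot talk its way past it.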

3. The job description layer: an agent is only as good as its briefing

Here is the practical question most agentic setups do not answer: how does the agent know how to behave? What conventions to follow? What to read before starting? What the team's definition of done actually is?

The answer is a Markdown (.md) file. One per role. Written the same way you would write a job description for a new hire: what this role is responsible for, what they need to read before starting, what rules they follow, what output format they produce. Claude reads Markdown natively. The agent reads its role file before every session. It never starts cold.

PM.md covers objectives, how to track blockers, what a sprint update looks like, how to communicate with the team, when to escalate.

Developer.md covers stack conventions, folder structure, what documents to read before writing a line of code, definition of done, coding standards.

Tester.md covers what to test, edge case priorities, how to document findings, what never ships without test coverage.

Product-vision.md is the shared brain. Every agent reads this first. The problem, the audience, the success criteria, the constraints.

Each file is a contract between the human team and the agent. Written once, loaded every session, never re-briefed. The methodology is not in someone's head or in a Confluence page nobody reads. It is in the file the agent opens before it does anything.
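
If the role file is a contract, it helps to check that the contract is complete before an agent loads it. Below is a minimal sketch in Python. The four section headings are my assumption, taken from the job description analogy above (responsibilities, required reading, rules, output format); nothing in Claude requires these names.

```python
import re

# Hypothetical section names; rename them to match your own role files.
REQUIRED_SECTIONS = (
    "Responsibilities",
    "Read before starting",
    "Rules",
    "Output format",
)


def missing_sections(markdown_text: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required section names that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown_text, flags=re.MULTILINE
        )
    }
    return [name for name in required if name.lower() not in headings]


if __name__ == "__main__":
    example = """# Developer agent

## Responsibilities
Write code against the spec.

## Read before starting
- ai/context/product-vision.md

## Rules
Follow the definition of done.
"""
    print(missing_sections(example))  # ['Output format']
```

A check like this can run before a session starts, so an incomplete briefing is caught by a script instead of by a confused agent.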

Write the agent's briefing like a job description.

4. The structure: what this actually looks like on disk

Concepts are useful. Folder trees are actionable. Here is what the methodology layer looks like when it exists as real files in a real project.

Start here. This is the to-do list version: the minimum viable agentic structure that gives every agent what it needs before it starts, without overwhelming anyone who is still learning how the machine works.

your-project/
├── .claude/
│   └── agents/
│       ├── pm-agent.md          → PM role, objectives, how to track blockers
│       ├── developer-agent.md   → Stack conventions, definition of done
│       └── tester-agent.md      → What to test, what never ships without coverage
├── ai/
│   └── context/
│       └── product-vision.md    → The shared brain. Every agent reads this first.
├── src/
└── CLAUDE.md                    → The front door. Claude reads this before anything else.

CLAUDE.md deserves a specific mention. It sits at the root of every project and is the first file Claude reads when a session starts. It tells Claude what this project is, which agent files to load, which context documents to read, and what rules apply before any task begins. It is the front door of the entire system. Without it, every session starts cold. With it, every session starts briefed.
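
A minimal sketch of a script that creates the starter structure above, assuming Python 3 and the file names from the tree. The placeholder text inside each file is invented for illustration; the real content is yours to write. Existing files are left untouched, so it is safe to run on a project that already has some of them.

```python
from pathlib import Path

# Paths come from the starter tree; the placeholder bodies are assumptions.
STARTER_FILES = {
    "CLAUDE.md": "# Project front door\n\nTODO: what this project is and which files to read first.\n",
    ".claude/agents/pm-agent.md": "# PM agent\n\nTODO: objectives, how to track blockers.\n",
    ".claude/agents/developer-agent.md": "# Developer agent\n\nTODO: stack conventions, definition of done.\n",
    ".claude/agents/tester-agent.md": "# Tester agent\n\nTODO: what to test, what never ships without coverage.\n",
    "ai/context/product-vision.md": "# Product vision\n\nTODO: problem, audience, success criteria, constraints.\n",
}


def scaffold(root: Path) -> list[Path]:
    """Create any missing starter files under root and return the ones created."""
    created = []
    for relative, placeholder in STARTER_FILES.items():
        path = root / relative
        if path.exists():
            continue  # never overwrite a briefing someone already wrote
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(placeholder, encoding="utf-8")
        created.append(path)
    (root / "src").mkdir(parents=True, exist_ok=True)
    return created


if __name__ == "__main__":
    for path in scaffold(Path("your-project")):
        print(f"created {path}")
```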

For reference, the full swarm-mode structure (40+ files):

your-project/
├── .claude/
│   ├── agents/
│   │   ├── pm-agent.md                  → PM role and delivery oversight
│   │   ├── developer-agent.md           → Stack, conventions, DoD
│   │   ├── tester-agent.md              → Test strategy and rubric
│   │   └── community-manager-agent.md   → Comms, tone, release notes
│   ├── agent-memory/
│   │   ├── meto-developer/              → Per-agent persistent memory
│   │   │   └── MEMORY.md
│   │   └── meto-pm/
│   │       └── MEMORY.md
│   └── rules/
│       ├── agent-developer.md           → NEVER DO / ALWAYS rules card
│       └── agent-tester.md              → Testing boundaries and limits
├── ai/
│   ├── context/
│   │   ├── product-vision.md            → The shared brain
│   │   ├── tech-stack.md                → Stack decisions and rationale
│   │   ├── decisions.md                 → Architectural decisions log
│   │   └── test-log.md                  → Running test history
│   ├── tasks/
│   │   ├── tasks-backlog.md             → Full backlog, all epics
│   │   ├── tasks-in-progress.md         → Active work across agents
│   │   ├── tasks-in-testing.md          → Awaiting tester validation
│   │   └── tasks-done.md                → Completed and approved work
│   ├── swarm/
│   │   ├── SWARM_AWARENESS.md           → Live swarm state, who is doing what
│   │   └── domain-map.md                → File ownership per agent
│   ├── handoff/
│   │   └── current.md                   → Written at end of session, read at start of next
│   └── workflows/
│       ├── code-guidelines.md           → Coding standards and patterns
│       ├── definition-of-done.md        → What done actually means
│       └

One file worth calling out specifically: ai/handoff/current.md. Claude has no memory between sessions by default. Every session starts cold unless you solve for it. The handoff file is that solution. Each agent writes a structured summary at the end of its session: what was completed, what is in progress, what the next agent needs to know before starting. The next session reads it first. Continuity without re-briefing. That single file solves one of the most common failure points in agentic workflows.
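
The handoff described above can be sketched as a small write-and-read pair. This is an illustration in Python under my own assumptions: the three section headings mirror the summary in the previous paragraph, and the `Handoff` class and function names are hypothetical. In practice the agent writes the file itself; the sketch only shows the shape that makes it easy to read back.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Handoff:
    completed: list[str] = field(default_factory=list)
    in_progress: list[str] = field(default_factory=list)
    next_agent_notes: list[str] = field(default_factory=list)


# Markdown heading -> Handoff attribute. The heading names are assumptions.
SECTIONS = {
    "Completed": "completed",
    "In progress": "in_progress",
    "Notes for the next agent": "next_agent_notes",
}


def write_handoff(path: Path, handoff: Handoff) -> None:
    """Write the end-of-session summary as a small Markdown file."""
    lines = ["# Handoff", ""]
    for heading, attribute in SECTIONS.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {item}" for item in getattr(handoff, attribute))
        lines.append("")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines), encoding="utf-8")


def read_handoff(path: Path) -> Handoff:
    """Read the previous session's summary; empty if there was no session."""
    handoff = Handoff()
    if not path.exists():
        return handoff  # first session: nothing to carry over
    current = None
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("## "):
            current = SECTIONS.get(line[3:].strip())
        elif line.startswith("- ") and current:
            getattr(handoff, current).append(line[2:].strip())
    return handoff


if __name__ == "__main__":
    handoff_file = Path("ai/handoff/current.md")
    write_handoff(
        handoff_file,
        Handoff(
            completed=["Login form markup"],
            in_progress=["Form validation"],
            next_agent_notes=["Password rules are still undecided"],
        ),
    )
    print(read_handoff(handoff_file))
```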

Once the sequential setup is working reliably, the same principles scale to multiple agents coordinating in parallel. That is swarm mode, which builds on this foundation rather than replacing it. It deserves a separate piece, for when there is more to say about it.

5. The honest reality: nobody is shipping agentic AI to production yet, and that is the right call

There are real horror stories. Agents that deleted three months of database records. Autonomous systems that made decisions nobody intended at 3am with no one watching. The fear around agentic AI shipping directly to production is not irrational. It is based on things that actually happened.

The answer is not to avoid agentic AI. It is to understand where the safe and high-value territory actually is right now. And that territory is internal.

Think about the handoff between design and engineering on a real product build. The design rationale lives in Figma comments and someone's memory. The architectural decisions are in a Slack thread from three sprints ago. The new engineer spends two weeks getting lost before they can contribute. That is the problem agentic AI solves today, not autonomous production deployment.

Agents that document the codebase as it is being built. Agents that generate onboarding playbooks calibrated to the actual project. Agents that keep the gap between design intent and engineering output visible in real time. Safe, contained, high value, and almost entirely unexplored by most teams.

The senior engineer still reviews everything before it ships. The architect still owns the structural decisions. The tester still runs penetration tests before anything goes user-facing. The agents do not replace that layer. They make the layer below it faster, more consistent, and less dependent on institutional memory that walks out the door when someone leaves.

6. The manual does not exist yet. You are writing it.

Every company experimenting with agentic AI right now is doing the same thing: trying to figure out which combination of roles, briefings, workflows, and review gates produces something reliable. Nobody has the definitive answer. The teams that will have it in twelve months are the ones starting with the small, sequential, well-briefed version today.

Build the to-do list. Understand the machine. Write the job descriptions. Load the vision before the agents start. Keep the human in the loop at the gate that matters. Then scale when you know what you are scaling.

The manual does not exist yet. The teams writing it from real experience are the ones who will know how to run the machine when everyone else is still reading the error messages. The question for the next twelve months is not whether to start, but how small to start, and how honest to be about what you learn along the way.

Ionut Lomer is a Project Manager and AI Product Builder at Thinslices. This is the second in a two-part series on AI-native delivery, following "Claude is not a chatbot: how to use it on real software projects".
