The Six Components of an AI-First Engineering Methodology

A reference for the core components that underpin modern AI-first engineering methodologies (such as the publicly documented Gen-e2 framework). Written for someone new to AI engineering.

What an AI-first methodology actually is

It's a methodology — a way of working — not a tool you install or a service you call. It's a coordinated approach to using an agent runtime (GitHub Copilot, Claude Code, and increasingly multi-agent systems) across an entire engineering team.

Strip away any particular branding and the methodology consists of six components:

| Component | What it is | Public learning path |
|---|---|---|
| 1. Custom Copilot Instructions | Markdown files that frame Copilot's behavior per repo | GitHub's official docs |
| 2. Reusable prompt libraries | Folders of saved prompts for recurring tasks | Anthropic Cookbook, OpenAI Cookbook |
| 3. Agentic workflows in markdown | Markdown files in `.github/workflows/` that run agents on triggers | github.com/githubnext/agentics |
| 4. Multi-agent orchestration | Specialized agents working in sequence | Anthropic's "Building effective agents" essay |
| 5. Project context management | Discipline of deciding what context the agent needs | Eugene Yan, Hamel Husain, Simon Willison blogs |
| 6. Human-in-the-loop checkpoints | Designed approval gates for agent output | Standard agent design practice |

Each component is described in detail below.

Component 1 — Custom Copilot Instructions

What it is

A markdown file at .github/copilot-instructions.md (or CLAUDE.md for Claude Code) that tells the agent what it needs to know about a repo: conventions, style, domain context, common pitfalls, output format expectations. The tool loads it automatically whenever the agent works in that repo.

Why it matters

Without this file, Copilot defaults to generic behavior — works for any repo, fits no repo well. With this file, Copilot becomes specifically tuned for the codebase. Same model, dramatically better output.

This is the single highest-leverage component of the methodology. Most of its value compresses into "write good instructions files."

Structure that works

# [Project Name]

## Role

[What persona should the agent adopt? Senior engineer? Reviewer? Junior pair?]

## Domain context

- [What does the codebase do?]
- [Key entities and concepts]
- [Database / framework / language specifics]
- [Anything unusual about this codebase]

## Style

- [Naming conventions]
- [Code formatting preferences]
- [Common patterns the codebase uses]
- [Patterns the codebase avoids]

## Workflow

[For complex tasks, the steps to follow]

## Constraints

- [What the agent should never do]
- [What the agent should always check]

## Output format

[How responses should be structured]
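
A small check can keep instructions files from drifting away from this structure. The following is a minimal sketch, not a feature of Copilot or Claude Code: the function name `missing_sections` and the idea of enforcing these headings (for example in CI) are assumptions made for illustration.

```python
import re

# Section headings taken from the template above; adjust to your own template.
REQUIRED_SECTIONS = [
    "Role",
    "Domain context",
    "Style",
    "Workflow",
    "Constraints",
    "Output format",
]


def missing_sections(markdown_text: str) -> list[str]:
    """Return the required sections that have no matching '## ' heading."""
    # Collect level-2 headings, lowercased so the comparison ignores case.
    found = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^##[ \t]+(.+?)[ \t]*$", markdown_text, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


# Example: a file that only has Role and Style so far.
sample = "# Billing service\n\n## Role\n\nSenior engineer.\n\n## Style\n\n- snake_case\n"
print(missing_sections(sample))
```

An empty list means every section from the template is present. The check says nothing about whether the content under each heading is specific enough, which is the harder part.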

What "good" looks like in practice

A bad instructions file says: "Write good Python code."

A good instructions file says:

## Style

- Python 3.11 only — no Python 2 patterns
- We use Pydantic v2 (not v1); note: `field_validator` and `model_validator`, not `validator` and `root_validator`
- Type hints are required on all function signatures
- Async functions throughout — never mix sync/async in the same module
- We use snake_case for variables and functions, PascalCase for classes
- Test files live alongside source files, suffixed with `_test.py`
- We use pytest, not unittest

## Common pitfalls

- Don't call `.dict()` on Pydantic models (deprecated in v2); use `.model_dump()`
- Don't import from internal modules across package boundaries; use the public `__init__.py`
- Async fixtures in tests must be declared with `@pytest_asyncio.fixture`, not `@pytest.fixture`

The difference is specificity. Generic advice is useless to an agent (and to a human). Specific conventions are gold.

Public learning path

GitHub's docs: search "GitHub Copilot custom instructions" — official documentation
Anthropic's Claude Code docs: the CLAUDE.md file pattern is documented in the Claude Code documentation on memory files
Real examples: browse .github/copilot-instructions.md files in popular open-source repos that use Copilot

Component 2 — Reusable prompt libraries

What it is

A folder (commonly named prompts/) of saved prompt files for recurring tasks. Version-controlled. Composable. Each file is a focused instruction for one type of work.

Why it matters

Engineers tend to retype prompts every time. The good prompts are forgotten; the bad prompts are reinvented; quality drifts. A library captures what works and lets you reuse it.

The payoff is the same as code reuse, applied to prompts.

Examples of useful prompts in a library

For a data-processing project:

draft-script.md — the standard drafting prompt
review-script.md — review a draft for safety
extract-intent.md — for ambiguous tickets
rollback-strategy.md — generate the rollback section

For a general engineering team:

explain-this-code.md — explain a function or module
propose-refactor.md — suggest a refactoring with rationale
code-review.md — review a PR by a teammate
write-unit-tests.md — generate tests for a function
migrate-deprecated-api.md — find and migrate uses of a deprecated API
summarize-pr.md — generate a PR description from a diff

What "good" looks like

Each prompt file has:

A clear single purpose
Context about how to behave (role, persona)
Step-by-step workflow
Output format
A note on when to use it
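
One way to make such a library usable from scripts is a small loader that reads a prompt file and fills in its variables. This is a sketch under assumptions: the `render_prompt` helper, the `$placeholder` syntax (Python's `string.Template`), and the one-file-per-prompt layout are illustrative choices, not a convention of any particular tool.

```python
import tempfile
from pathlib import Path
from string import Template


def render_prompt(library_dir: Path, name: str, **variables: str) -> str:
    """Load `<library_dir>/<name>.md` and fill in its $placeholders."""
    prompt_path = library_dir / f"{name}.md"
    template = Template(prompt_path.read_text(encoding="utf-8"))
    # substitute() raises KeyError when a placeholder has no value, so a
    # half-filled prompt never reaches the model silently.
    return template.substitute(**variables)


# Demo with a throwaway library directory standing in for prompts/.
with tempfile.TemporaryDirectory() as tmp:
    library = Path(tmp)
    (library / "summarize-pr.md").write_text(
        "Summarize this diff for reviewers:\n\n$diff\n", encoding="utf-8"
    )
    print(render_prompt(library, "summarize-pr", diff="- old line\n+ new line"))
```

Because `string.Template` treats every `$` as a placeholder marker, prompts that contain literal dollar signs (shell snippets, for instance) would need `$$` escaping or a different templating choice.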

Public learning path

Anthropic Cookbook: github.com/anthropics/anthropic-cookbook — real, working prompts for common tasks
OpenAI Cookbook: github.com/openai/openai-cookbook
Awesome Copilot: github.com/github/awesome-copilot (community-contributed Copilot prompts and instructions)
Practitioner blogs: Hamel Husain and Simon Willison post specific prompts they use

Component 3 — Agentic workflows in markdown

What it is

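GitHub Agentic Workflows, a project from GitHub Next: markdown files placed in .github/workflows/ that describe agent tasks in natural language. The `gh aw` CLI extension compiles each file into a regular GitHub Actions workflow, which then runs on triggers (a new issue, a new PR, a scheduled time, etc.).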

Format roughly:

---
on:
  issues:
    types: [opened]
permissions:
  contents: read
  issues: write
engine: copilot
---

# Triage New Issues

When a new issue is opened:

1. Read the issue title and body
2. Determine the appropriate label from the existing label list
3. Add the label to the issue
4. If the issue is a bug report, add a comment asking for reproduction steps if missing

The agent reads this, runs the steps, takes the actions.
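
The file has two parts: YAML frontmatter between the `---` lines (triggers, permissions, engine) and a natural-language body (the task). The sketch below only illustrates that split; it is not how the real tooling parses the file, and `split_workflow` is a hypothetical helper name.

```python
def split_workflow(markdown_text: str) -> tuple[str, str]:
    """Split a workflow file into (frontmatter, instructions)."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No frontmatter block: treat the whole file as instructions.
        return "", markdown_text.strip()
    # Find the closing '---' that ends the frontmatter.
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).strip()
            return frontmatter, body
    raise ValueError("frontmatter opened with '---' but never closed")


workflow = (
    "---\n"
    "on:\n"
    "  issues:\n"
    "    types: [opened]\n"
    "engine: copilot\n"
    "---\n"
    "\n"
    "# Triage New Issues\n"
    "\n"
    "1. Read the issue title and body\n"
)
frontmatter, instructions = split_workflow(workflow)
print(frontmatter)
print(instructions)
```

Keeping the two parts separate in your head helps when reviewing these files: the frontmatter decides what the agent is allowed to touch, while the body decides what it tries to do.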

Why it matters

This is how agents become background workers instead of just interactive chat tools. They run when something happens — without you having to invoke them.

For a team, agentic workflows can:

Auto-triage incoming issues
Auto-summarize merged PRs at end of day
Auto-generate weekly status updates from repo activity
Auto-flag PRs that change critical files
Auto-draft responses to common bug reports

Public learning path

GitHub's official docs on Agentic Workflows — search "GitHub Agentic Workflows"
Awesome Copilot's agentic workflows learning hub — examples and templates
github.com/githubnext/agentics — community-maintained sample workflows you can install

When to use it

Agentic workflows shine once a project has moved past being a purely interactive tool. Good candidates: "when a new ticket is created, auto-draft a starting point and post it as a comment," or "every Friday, summarize this week's tickets." Don't force it — use agentic workflows where they genuinely add value, not just to use the feature.

Component 4 — Multi-agent orchestration

What it is

Using multiple specialized agents in coordination instead of one general agent. Each agent has a focused role, focused context, focused output.

A common split for a drafting pipeline:

Analyzer agent — reads the input, extracts intent, identifies ambiguities
Drafter agent — takes the intent + past examples, drafts a result
Reviewer agent — checks the draft for safety and correctness
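
The split above can be sketched as three sequential model calls, each with its own narrow prompt. Everything here is an assumption for illustration: `call_model` stands in for whatever client your agent runtime provides, the prompt wording is invented, and the APPROVE/REJECT convention is just one simple way to get a machine-readable verdict.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that sends a prompt to a model and returns text.
ModelCall = Callable[[str], str]


@dataclass
class PipelineResult:
    intent: str
    draft: str
    review: str
    approved: bool


def run_pipeline(ticket: str, examples: list[str], call_model: ModelCall) -> PipelineResult:
    """Run analyzer -> drafter -> reviewer, each with its own focused context."""
    # Analyzer: sees only the raw input.
    intent = call_model(f"Extract the intent and list ambiguities:\n{ticket}")
    # Drafter: sees the extracted intent plus past examples, not the raw ticket.
    joined = "\n---\n".join(examples)
    draft = call_model(
        f"Draft a result for this intent:\n{intent}\n\nPast examples:\n{joined}"
    )
    # Reviewer: sees only the draft and must lead with a verdict.
    review = call_model(
        f"Review for safety and correctness. Start with APPROVE or REJECT:\n{draft}"
    )
    approved = review.strip().upper().startswith("APPROVE")
    return PipelineResult(intent, draft, review, approved)
```

Passing `call_model` in as a parameter keeps the pipeline testable with a fake model, and it makes the per-agent evals mentioned below easier to wire in.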

Why it matters (and when it doesn't)

The benefit of multi-agent: each agent has focused context, which means more accurate output. Bugs are easier to localize ("the analyzer got it wrong" vs "the giant prompt got something wrong"). You can apply evals per agent.

The cost: more orchestration complexity, more tokens used, more chances for the chain to break.
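
A rough way to see the "more chances to break" cost: if each agent in a sequential chain succeeds independently with some probability, the chain succeeds only when all of them do. The 95% figure and the independence assumption below are illustrative, not measurements.

```python
def chain_success_rate(per_agent_success: float, agents: int) -> float:
    """Probability that every agent in a sequential chain succeeds.

    Assumes each agent succeeds or fails independently of the others.
    """
    return per_agent_success ** agents


for count in (1, 3, 5):
    print(count, round(chain_success_rate(0.95, count), 3))
# 1 agent: 0.95, 3 agents: 0.857, 5 agents: 0.774
```

In practice a reviewer agent can catch upstream mistakes, so real chains are not purely multiplicative, but the calculation shows why each added stage needs to earn its place.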

Important nuance: there's an active debate in the field about whether multi-agent systems are usually better than a single well-prompted agent. Anthropic's "Building effective agents" essay argues for them when warranted. Cognition's "Don't build multi-agents" essay argues most teams overcomplicate. Read both — the disagreement is the lesson.

Start single-agent. Move to multi-agent only when you have evidence that a single well-prompted agent can't reach the quality you need. Don't add complexity speculatively.

Public learning path

Anthropic's "Building effective agents": anthropic.com/engineering/building-effective-agents — the foundational essay
Cognition's "Don't build multi-agents": the counter-argument
Microsoft's Semantic Kernel cookbook: examples of multi-agent in production

Component 5 — Project context management

What it is

The discipline of deliberately deciding what context an agent needs to do its job — and what context it doesn't need.

This sounds simple. It isn't. The natural tendency is either to dump everything ("just give the agent the whole repo") or starve it ("just send the immediate question"). Both fail. The skill is curating the right context for the task.

Why it matters

Models perform dramatically better with the right context. Wrong context = hallucinations, irrelevant outputs, or just confused responses. Models also have token limits — too much context costs money and slows responses; too little misses critical information.

Most "AI doesn't work" experiences are actually "I gave the AI the wrong context" experiences.

What this looks like in practice

For a drafting agent over past examples:

The agent needs access to past examples (otherwise it has no patterns to follow)
The agent needs the new input (otherwise it has no task)
The agent doesn't need the entire codebase (would be overwhelming and confuse it)
The agent needs style conventions (in copilot-instructions.md)
The agent might benefit from the schema of affected data (consider adding to instructions)
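
The list above amounts to a selection rule: some context is required, some is optional, and some should stay out. A minimal sketch of that rule follows. The `ContextItem` type, the `build_context` helper, and the four-characters-per-token estimate are all assumptions; a real setup would use the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    label: str
    text: str
    required: bool  # required items are always included


def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def build_context(items: list[ContextItem], budget_tokens: int) -> list[ContextItem]:
    """Keep all required items, then add optional ones in order while they fit."""
    selected = [item for item in items if item.required]
    used = sum(estimate_tokens(item.text) for item in selected)
    if used > budget_tokens:
        raise ValueError("required context alone exceeds the token budget")
    for item in items:
        if item.required:
            continue
        cost = estimate_tokens(item.text)
        if used + cost <= budget_tokens:
            selected.append(item)
            used += cost
    return selected
```

With a small budget, a large optional item such as the entire codebase is skipped while smaller optional items such as a schema still fit, which matches the intent of the list above.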

Public learning path

Eugene Yan: eugeneyan.com/writing — read his posts on patterns for LLM systems
Hamel Husain: hamel.dev — read his series on building LLM products
Simon Willison: simonwillison.net — practitioner blog with frequent context-management posts
The "context engineering" emerging field: search "context engineering LLM" — this is becoming a recognized skill area

Component 6 — Human-in-the-loop checkpoints

What it is

Designing where a human approves agent output, and what the agent is allowed to do without approval.

A simple rule for a drafting agent: the agent drafts, but never merges. The merge is the human checkpoint.

For other systems, checkpoints might be:

Agent proposes a code change → human reviews PR → human merges
Agent drafts an email → human reads and clicks send
Agent identifies a problem → human approves the fix
Agent runs in low-stakes environment freely → escalates to human for high-stakes decisions
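
In code, a checkpoint is often just a gate in front of the step that has side effects. The sketch below assumes a hypothetical `ProposedAction` type with a `high_stakes` flag set by your own policy, and an `ask_human` callback that could be a PR review, a chat prompt, or a CLI confirmation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    description: str
    high_stakes: bool  # decided by your policy, not by the agent itself


def execute_with_checkpoint(
    action: ProposedAction,
    run: Callable[[ProposedAction], str],
    ask_human: Callable[[ProposedAction], bool],
) -> str:
    """Run low-stakes actions directly; require approval for high-stakes ones."""
    if action.high_stakes and not ask_human(action):
        # The human declined, so the side-effecting step never runs.
        return f"blocked: {action.description}"
    return run(action)
```

The important design choice is that `run` is unreachable for a high-stakes action unless `ask_human` returns true, so the gate cannot be skipped by a confident but wrong agent.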

Why it matters

Two reasons:

Trust. Even when agents are right 95% of the time, that 5% can be catastrophic in production. A human checkpoint catches the 5%.
Adoption. Engineers won't use a tool they don't trust. Visible approval gates build trust over time.

What "good" looks like

A well-designed human-in-the-loop system:

Has the human reviewing the right things (not everything; not nothing)
Makes the review fast and ergonomic (a 30-second yes/no, not a 30-minute deep dive)
Gives the reviewer context (why did the agent make this choice? what did it consider?)
Has clear escalation paths when the human isn't sure

Public learning path

This is mostly absorbed from practice and from blog posts on agent design. There's no canonical text. Look for:

Cognition's blog on Devin's design choices
Posts on PR-review-as-checkpoint patterns
Discussions of "agentic workflows with approval gates"

Putting it together

These six components compose. A mature AI-first setup typically has:

Custom instructions per repo (Component 1)
A reusable prompt library (Component 2)
Deliberate context management (Component 5)
Human checkpoints at the right places (Component 6)

…with agentic workflows (Component 3) and multi-agent orchestration (Component 4) added later, only where they earn their complexity.

The branding around any particular methodology is unimportant. The underlying skills — specificity, reuse, context discipline, and well-placed human review — are what matter. Learn them gradually, one component at a time, by applying each to a real project.
