
Agents Best Practices – AI Agent Design Patterns

A public knowledge base of 150+ agent design patterns drawn from dozens of production deployments. It covers planning loops, tool design, and prompt engineering.


TL;DR

Agents Best Practices is a community knowledge base with 150+ patterns for building AI coding agents. Each entry is a documented pattern with a problem statement, a solution, a code example, and evidence from production deployments or research. It covers planning, tool design, prompts, evals, and guardrails.


What Is Agents Best Practices?

Agents Best Practices is a structured knowledge base for anyone building AI coding agents. Unlike typical “awesome list” repos that just link to resources, each entry follows a consistent format:

  • Problem: What challenge does this pattern solve?
  • Solution: How the pattern works, with concrete examples
  • When to use: The conditions where this pattern applies
  • Evidence: Links to production deployments or research that validate the approach
  • Anti-patterns: What not to do

The patterns are organized into categories: agent architecture, tool design, prompt engineering, planning and reasoning, evaluation and testing, and safety and guardrails.

Repo-Specific Setup Workflow

Step 1: Browse the Knowledge Base

git clone https://github.com/denissergeevitch/agents-best-practices.git
cd agents-best-practices

The patterns live in the patterns/ directory, organized by category. Open any .md file to read a full pattern.

Step 2: Search by Category

The categories/ directory provides indexed overviews. Start with categories/agent-architecture.md for high-level design patterns, then drill into specific topics.

Step 3: Contribute a Pattern

Follow the template in TEMPLATE.md:

# Pattern Name

## Problem
[Describe the challenge]

## Solution
[Explain how the pattern works]

## When to Use
[List conditions]

## Evidence
[Link to production deployments or research]

## Anti-patterns
[What to avoid]

Open a PR with your new pattern file in the appropriate category folder.
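
Before opening the PR, a quick local check can confirm the draft has every section from the template. The following is a minimal sketch, not a tool shipped with the repo: the function name `missing_sections` and the sample draft are hypothetical, and the required headings are copied from the template above.

```python
import re

# Section headings taken from the template shown above.
REQUIRED_SECTIONS = ["Problem", "Solution", "When to Use", "Evidence", "Anti-patterns"]


def missing_sections(markdown_text: str) -> list[str]:
    """Return the required '## ' headings that are absent from a pattern file."""
    found = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^##\s+(.+)$", markdown_text, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


# Hypothetical draft that forgot two sections.
draft = """# Retry With Backoff

## Problem
Tool calls fail transiently.

## Solution
Retry with exponential backoff.

## Evidence
[link]
"""

print(missing_sections(draft))  # ['When to Use', 'Anti-patterns']
```

The check only looks at second-level headings, so it says nothing about whether the sections are actually filled in.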

Deeper Analysis

The knowledge base distinguishes itself from raw documentation by including anti-patterns: documented mistakes that practitioners have made and learned from. For example, the “over-fragmented tools” anti-pattern warns against splitting tools into too many small functions, which confuses the LLM’s tool selection, and the “missing idempotency keys” anti-pattern covers retry safety for agent tool calls.

Planning patterns are particularly rich. The “plan-then-execute” pattern shows how to separate strategy from implementation, and the “replan-on-failure” pattern demonstrates recovery loops when tool calls fail. Each includes a working code snippet in Python or TypeScript.
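
To make the two planning ideas concrete, here is a minimal sketch of plan-then-execute combined with replan-on-failure. It is an illustration written for this article, not the repo's snippet: `run_with_replanning`, `toy_planner`, and `toy_executor` are hypothetical names, and the planner is a plain function standing in for an LLM call.

```python
from typing import Callable

Step = str
Planner = Callable[[str, list[str]], list[Step]]
Executor = Callable[[Step], str]


def run_with_replanning(goal: str, plan: Planner, execute: Executor, max_replans: int = 2) -> list[str]:
    """Plan first, execute step by step, and ask for a new plan when a step fails."""
    failures: list[str] = []
    for _attempt in range(max_replans + 1):
        steps = plan(goal, failures)  # strategy: decided before any tool runs
        results: list[str] = []
        try:
            for step in steps:  # implementation: run the plan as written
                results.append(execute(step))
            return results
        except RuntimeError as error:
            failures.append(str(error))  # feed the failure back into planning
    raise RuntimeError(f"gave up after {max_replans} replans: {failures}")


def toy_planner(goal: str, failures: list[str]) -> list[Step]:
    # Hypothetical stand-in for an LLM call: avoid the tool that failed before.
    if any("fast_search" in failure for failure in failures):
        return ["slow_search", "summarize"]
    return ["fast_search", "summarize"]


def toy_executor(step: Step) -> str:
    if step == "fast_search":
        raise RuntimeError("fast_search timed out")
    return f"{step}: ok"


print(run_with_replanning("find flaky tests", toy_planner, toy_executor))
# ['slow_search: ok', 'summarize: ok']
```

This sketch restarts from the first step after every replan and discards partial results; an agent whose steps have side effects would need to track which steps already completed.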

The evaluation section covers both automated evals (assertion-based, LLM-as-judge) and human-in-the-loop patterns. Practical advice includes how to write eval cases that catch regressions without overfitting to specific model behaviors.
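
One way to catch regressions without overfitting is to assert on properties of the output instead of exact wording. The sketch below assumes that approach; `EvalCase`, `run_evals`, and `fake_agent` are hypothetical names, and the agent is a stub in place of a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # property the output must satisfy


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case and record pass/fail by case name."""
    return {case.name: case.check(agent(case.prompt)) for case in cases}


cases = [
    # Checks that the right identifier is mentioned, not an exact sentence.
    EvalCase("mentions_function", "Which function parses the config?",
             lambda out: "parse_config" in out),
    # Checks a structural property: the answer fits on one line.
    EvalCase("stays_short", "Summarize the diff in one line.",
             lambda out: len(out.splitlines()) == 1),
]


def fake_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real agent call.
    return "The parse_config function handles it."


print(run_evals(fake_agent, cases))
# {'mentions_function': True, 'stays_short': True}
```

Because the checks do not depend on phrasing, swapping the underlying model should only fail a case when the behavior itself changes.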

Practical Evaluation Checklist

  • [ ] 150+ patterns across 6 categories
  • [ ] Every pattern includes problem, solution, evidence, and anti-patterns
  • [ ] Production evidence linked for core patterns
  • [ ] Templates available for contributing new patterns
  • [ ] No dependencies — plain Markdown files

Security Notes

The knowledge base itself is static Markdown, so no executable code runs from it. However, the patterns covering tool design and guardrails are directly relevant to security: review the “permission-hardened tools” and “input sanitization” patterns before deploying agents to production. The “prompt injection defense” patterns cover common attack vectors and mitigation strategies.
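
As a rough illustration of what hardening a tool can mean in practice, the sketch below confines a file tool to a sandbox directory by normalizing the model-supplied path before use. This is an assumption about the general technique, not the repo's own code; `ALLOWED_ROOT` and `resolve_in_sandbox` are hypothetical names.

```python
import posixpath

ALLOWED_ROOT = "/workspace"  # hypothetical sandbox root for the agent's file tool


def resolve_in_sandbox(requested: str) -> str:
    """Normalize a model-supplied path and refuse anything outside the sandbox."""
    # join() lets an absolute path override the root, and normpath() collapses "..",
    # so both escape routes end up visible in the candidate path.
    candidate = posixpath.normpath(posixpath.join(ALLOWED_ROOT, requested))
    if candidate != ALLOWED_ROOT and not candidate.startswith(ALLOWED_ROOT + "/"):
        raise PermissionError(f"path escapes sandbox: {requested!r}")
    return candidate


print(resolve_in_sandbox("src/main.py"))  # /workspace/src/main.py

for attempt in ["../etc/passwd", "/etc/passwd", "src/../../secrets"]:
    try:
        resolve_in_sandbox(attempt)
    except PermissionError as error:
        print(error)
```

The check is purely lexical, so it does not follow symlinks; a real tool would also need to resolve the path on disk before trusting it.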

FAQ

Q: How is this different from LangChain or CrewAI docs? A: This is framework-agnostic. Patterns apply whether you’re using raw API calls, LangChain, or a custom orchestrator. It focuses on design principles, not library-specific APIs.

Q: Are all patterns backed by production evidence? A: No — some are theoretical or experimental. Each pattern labels its evidence level: production, experimental, or theoretical. Production-tagged patterns link to deployed systems.

Q: Can I use these patterns for non-coding agents? A: Yes — the architecture, tool design, and evaluation patterns apply to any LLM agent, not just coding agents.

Q: How often is the knowledge base updated? A: Community-driven via PRs. Check the commit history for recent additions.

Beyond the core pattern catalog, the knowledge base includes detailed case studies from production agent deployments. These aren’t just pattern descriptions — they include the original problem context, the attempted solutions, what failed, what worked, and the final architecture with code. For example, the “staged agent pipeline” case study documents how a code review agent evolved from a single monolithic prompt to a three-stage pipeline (triage → deep review → summary) with measurable improvements in review quality and latency.
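
The shape of such a three-stage pipeline can be sketched in a few lines. The code below only illustrates the staging idea and is not taken from the case study: every function body is a hypothetical stub, and the triage rule (review only `.py` files) is an assumption chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    file: str
    note: str


def triage(changed_files: list[str]) -> list[str]:
    """Stage 1: cheaply pick the files worth a deep review."""
    return [name for name in changed_files if name.endswith(".py")]


def deep_review(files: list[str]) -> list[Finding]:
    """Stage 2: hypothetical stand-in for an expensive per-file LLM review."""
    return [Finding(name, "check error handling") for name in files]


def summarize(findings: list[Finding]) -> str:
    """Stage 3: collapse findings into one reviewer-facing summary."""
    if not findings:
        return "No issues found."
    return "\n".join(f"{finding.file}: {finding.note}" for finding in findings)


def review(changed_files: list[str]) -> str:
    # Each stage sees only the previous stage's output, so each can be tested alone.
    return summarize(deep_review(triage(changed_files)))


print(review(["app.py", "README.md", "utils.py"]))
```

Splitting the work this way means the expensive stage runs only on files that survive triage, which is where a latency gain would plausibly come from.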

The tool design section covers practical concerns that most framework docs skip. A recurring theme is tool granularity: too fine (100 single-purpose tools) and the LLM struggles with tool selection; too coarse (one “do everything” tool) and the LLM can’t reason about specific actions. The patterns illustrate a workable middle ground, with examples and failure modes at both extremes. Another pattern covers tool idempotency: why every tool should accept an idempotency key, and how to implement it with a simple request deduplication layer.
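
A request deduplication layer of this kind can be very small. The sketch below is one possible shape under stated assumptions, not the repo's implementation: `IdempotentTool` and `send_email` are hypothetical names, and the result store is an in-memory dict where a real system would use persistent storage.

```python
from typing import Any, Callable


class IdempotentTool:
    """Wrap a tool so repeated calls with the same key run the side effect once."""

    def __init__(self, tool: Callable[..., Any]) -> None:
        self._tool = tool
        self._results: dict[str, Any] = {}  # in-memory only; lost on restart

    def call(self, idempotency_key: str, **kwargs: Any) -> Any:
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # retry: replay the stored result
        result = self._tool(**kwargs)  # an exception here stores nothing
        self._results[idempotency_key] = result
        return result


sent: list[str] = []


def send_email(to: str) -> str:
    sent.append(to)  # the side effect that must not repeat
    return f"sent to {to}"


tool = IdempotentTool(send_email)
first = tool.call("req-1", to="a@example.com")
second = tool.call("req-1", to="a@example.com")  # agent retried with the same key

print(len(sent))        # 1
print(first == second)  # True
```

Only successful results are stored, so a call that raised can still be retried under the same key. The wrapper is not thread-safe as written.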

The evaluation section goes beyond simple pass/fail assertions. It covers LLM-as-judge with pairwise comparison, reference-free evaluation using rubric scoring, and regression detection using embedding similarity. Each pattern includes a concrete eval function — copy, paste, and run with your own test cases. The anti-patterns in this section are particularly valuable: “testing against a moving target” (evaluating against live LLM output without pinned snapshots) and “overfitting to a specific model’s quirks” are mistakes that teams make repeatedly.
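
The snapshot idea behind regression detection can be sketched without any external service. In the code below, a bag-of-words vector stands in for a real embedding model, which is a deliberate simplification; `embed`, `cosine`, `regressed`, and the 0.8 threshold are all hypothetical choices for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def regressed(snapshot: str, new_output: str, threshold: float = 0.8) -> bool:
    """Flag a regression when the new output drifts too far from the pinned snapshot."""
    return cosine(embed(snapshot), embed(new_output)) < threshold


# The snapshot is pinned once and stored with the test, not regenerated per run.
pinned = "the function returns a sorted list"

print(regressed(pinned, "the function returns a sorted list"))  # False
print(regressed(pinned, "error: request timed out"))            # True
```

Comparing against a stored snapshot, not a fresh model response, is what keeps the test from chasing a moving target.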

Q: What is the recommended reading order for the patterns? A: Start with agent-architecture for foundations, then tool-design for practical application, then evaluation for validation. The safety and guardrails patterns are standalone and can be read independently.

Q: Are there patterns specific to Claude Code, Codex, or Cursor agents? A: The patterns are framework-agnostic and apply universally. Some categories like prompt engineering include agent-specific tips in the notes, but the core design principles work regardless of which coding agent you are building on.

Q: How often should I revisit the patterns as agent technology evolves? A: The repo is community-maintained with regular PRs. Check the commit history monthly. Key patterns around architecture and safety tend to be stable, while tool design and prompt engineering patterns evolve more rapidly with new model capabilities.

Conclusion

Agents Best Practices fills a gap in the AI agent ecosystem: a reference manual organized around patterns, not any one framework. The anti-pattern documentation alone can save teams from repeating known mistakes, and the evidence links lend credibility to the recommendations. For anyone building agents beyond toy examples, browsing this repo is a faster path to production-readiness than trial and error.