Guides

Readable, action-oriented guides for teams that need AI output to be checked, logged, and turned into workflow value.

Agent Incident Review

A structured process for reviewing failed AI agent runs and turning traces into controls, fixtures, owners, follow-up tests, and safer workflows.

Expanded incident process Updated 2026-06-12

Agent Observability Guide

A practical observability model for AI agents that use tools, retrieval, state, retries, approvals, and human review in production workflows.

Expanded guide Updated 2026-06-12

Agent Permission Design

How to design AI agent permissions with read, draft, write, approval, rollback, and audit boundaries before production use or team rollout.

Expanded guide Updated 2026-06-12

Agent Reliability Scorecard

A scoring model for agent permissions, evidence, recovery, data exposure, and human review.

Scorecard Updated 2026-05-31

AI Agent Failure Modes

A field guide to common AI agent failures, the controls that reduce them, and the evidence reviewers need before launch, rollout, or incident review.

Expanded risk guide Updated 2026-06-12

AI Code Review Checklist

A practical checklist for reviewing AI-assisted code changes with scoped diffs, tests, security checks, and evidence before final merge approval.

Expanded checklist Updated 2026-06-12

AI Code Review Prompts

Prompt patterns for focused AI code review that ask for high-risk bugs, line evidence, reproduction steps, missing tests, and confidence notes.

Expanded prompt guide Updated 2026-06-12

AI Code Verification Tests

A practical guide to turning AI-generated code into testable behavior with regression tests, boundary checks, and evidence-focused review notes.

Expanded guide Updated 2026-06-12

AI Documentation Review

A documentation review guide for checking AI-written docs against source behavior, examples, links, version notes, and reviewer evidence before publishing.

Expanded guide Updated 2026-06-12

AI Hallucination Testing Guide

A practical hallucination testing process for finding unsupported claims, weak refusals, weak citations, and source-faithfulness failures early.

Expanded guide Updated 2026-06-12

AI Pilot Readiness Checklist

A practical AI pilot readiness checklist for scope, users, success metrics, data boundaries, validation, rollout gates, and stop conditions.

Expanded checklist Updated 2026-06-12

AI Readiness for Teams

An AI readiness guide for teams covering workflow fit, data boundaries, review capacity, tool ownership, risk controls, and pilot evidence.

Expanded guide Updated 2026-06-12

AI Summary Verification

A summary verification guide for checking AI summaries against sources, preserving caveats, detecting omissions, and logging reviewer decisions.

Expanded guide Updated 2026-06-12

AI Tool Change Log Process

A change log process for tracking AI tool approvals, risks, owners, data access, workflow changes, validation evidence, and renewal decisions.

Expanded operating pattern Updated 2026-06-12

AI Tool Privacy Checklist

A privacy checklist for evaluating AI tools before uploading customer data, source code, employee records, strategy notes, or private documents.

Expanded checklist Updated 2026-06-12

AI Tools for Product Managers

A practical guide to AI tools for product managers, focused on research, specs, prioritization, review gates, and source-backed decisions.

Expanded tool selection guide Updated 2026-06-12

AI Tools for Startup Founders

A founder guide to AI tools for startup work, covering research, support, coding, operations, automation, privacy, and verification gates for teams.

Expanded tool selection guide Updated 2026-06-12

AI Workflow Automation for Startups

How startups can safely automate AI workflows with clear owners, narrow permissions, review gates, evidence logs, and measurable operational wins.

Expanded guide Updated 2026-06-12

AI-Generated Code Testing

A practical testing workflow for AI-generated code that covers expected behavior, edge cases, regression checks, and reviewer confidence before merge.

Expanded guide Updated 2026-06-12

Benchmark Methodology

A benchmark methodology guide for creating fair AI tool evaluations with frozen fixtures, dated evidence, scoring rubrics, and retest rules.

Expanded methodology Updated 2026-06-12

Eval Rubric Design

A practical guide to designing AI evaluation rubrics with clear scoring dimensions, weights, failure labels, and decision thresholds for teams.

Expanded framework Updated 2026-06-12

How to Choose an AI Coding Assistant

A buyer and operator guide for choosing AI coding assistants by workflow fit, privacy boundary, validation burden, and reviewer effort in practice.

Expanded tool selection guide Updated 2026-06-12

How to Reduce Hallucinations in LLM Apps

A practical system design checklist for reducing unsupported LLM claims with retrieval, refusal behavior, verification, and review controls.

Expanded guide Updated 2026-06-12

How to Verify AI-Generated Code

A practical guide for reviewing AI-generated code with behavior checks, scoped diffs, tests, security review, and merge evidence notes for teams.

Expanded core guide Updated 2026-06-12

Human-in-the-Loop AI Workflows

Where human review belongs in AI coding, RAG, agent, support, and documentation workflows, with approval gates, evidence checks, and owner roles.

Expanded guide Updated 2026-06-12

LLM Evaluation Framework

A practical LLM evaluation framework for testing correctness, faithfulness, format compliance, safety, latency, and human review effort before launch.

Expanded framework Updated 2026-06-12

LLM Output Verification Guide

A practical workflow for checking LLM output against sources, tests, logs, and human review before using it in products or team decisions.

Expanded guide Updated 2026-06-12

MCP Workflow Guide

How to evaluate MCP-style tool connections for AI workflows with narrow permissions, logging, approval gates, and data exposure controls before launch.

Expanded guide Updated 2026-06-12

Model Pricing Change Tracker

A model pricing change tracker workflow for monitoring plan changes, source evidence, affected pages, retest needs, and update decisions over time.

Expanded operating pattern Updated 2026-06-12

Product Manager AI Research Workflow

A product manager AI research workflow for source-backed discovery, synthesis, opportunity notes, stakeholder review, and decision evidence.

Expanded workflow guide Updated 2026-06-12

Prompt Testing Framework

A practical framework for testing prompt variants with frozen fixtures, model settings, scoring rubrics, failure labels, and review notes for teams.

Expanded framework Updated 2026-06-12

RAG Evaluation Checklist

A practical checklist for evaluating RAG retrieval quality, source faithfulness, citations, no-answer behavior, latency, and human review effort.

Expanded checklist Updated 2026-06-12

RAG No-Answer Testing

A practical guide to testing whether a RAG system safely refuses unsupported, missing-source, ambiguous, stale, or out-of-policy questions.

Expanded guide Updated 2026-06-12

Source-Backed AI Writing

A source-backed AI writing workflow for claims, citations, drafts, verification, reviewer notes, and publication decisions without invented evidence.

Expanded guide Updated 2026-06-12

Startup AI Stack Guide

A startup AI stack guide for choosing lean tools across coding, research, support, content, analytics, automation, and governance without sprawl.

Expanded tool selection guide Updated 2026-06-12

Weekly AI Workflow Operating System

A weekly operating system for reviewing AI workflows, incidents, evidence, tool changes, owner actions, and measurable reliability improvements.

Expanded operating system Updated 2026-06-12

What Is an AI Agent?

A practical definition of AI agents focused on goals, tools, state, permissions, evidence, stop rules, and operator review for real workflows.

Expanded guide Updated 2026-06-12