Agent Eval Kit
Evaluate coding agents before they touch serious repositories: mini-eval, scorecard, scenario prompts, regression log, and a 30-minute setup workflow.
Use the free previews immediately. Buy the full ZIP when you want the complete templates, checklists, prompts, and machine-readable files ready to adapt.
If you are choosing for a real workflow, start with the higher-leverage kits: Agent Eval Kit when a coding agent will touch production code, or Agent Commerce Starter Kit when a digital product needs agent-readable metadata. Use the €2 handoff kit for context-transfer only.
Evaluate coding agents before they touch serious repositories: mini-eval, scorecard, scenario prompts, regression log, and a 30-minute setup workflow.
Make a digital product easier for agents to discover and understand: llms.txt, product.json, catalog metadata, checkout copy, and buyer-agent QA prompts.
Preserve context when work moves between Claude Code, Codex, Cursor, Copilot, another agent, or a human operator. Includes a structured handoff spec and installable skill preview.
A practical 15-minute pre-flight test for Claude Code, Codex, Cursor, Copilot-style agents, or any autonomous coding workflow.
Copy a practical llms.txt, product.json, and buyer-agent QA prompt for making small digital products easier for agents to parse and recommend.
A practical comparison for builders packaging digital products for AI agents: when to use each file, what agents can parse, and how to avoid overpromising.
Paste a task and the current state. This generates a clean Markdown handoff you can give to another agent or keep as a run log.
Your handoff will appear here.