Features

What is inside CraftAgent Builder

The model layer, grounding architecture, integrations, evaluation tooling, and analytics that turn a visual agent design into something you can trust in production.


Model layer

Agents run on models from OpenAI and Anthropic, and each step in an agent picks the model that fits it. A fast, cheap model can classify and route; a stronger model drafts and reasons. Routing rules are yours to set, and fallbacks mean a provider outage degrades to a second model instead of taking the agent down.

OpenAI and Anthropic models behind one interface
Per-step model selection and routing rules
Automatic fallback to a second model on provider errors
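As a rough illustration of per-step routing with fallback, the sketch below maps each step to a primary and a second model and retries on the second when the first call raises a provider error. Everything named here (`StepRoute`, `ROUTES`, `run_step`, `ProviderError`, and the placeholder model names) is hypothetical and is not CraftAgent Builder's actual configuration format or API.

```python
from dataclasses import dataclass
from typing import Callable


class ProviderError(Exception):
    """Hypothetical error a model call raises when its provider is unavailable."""


@dataclass
class StepRoute:
    primary: str   # model tried first for this step
    fallback: str  # model used if the primary call fails


# Hypothetical routing table; the model names are placeholders, not real model IDs.
ROUTES = {
    "classify": StepRoute(primary="fast-model", fallback="fast-model-alt"),
    "draft": StepRoute(primary="strong-model", fallback="strong-model-alt"),
}


def run_step(step: str, prompt: str,
             call_model: Callable[[str, str], str]) -> tuple[str, str]:
    """Run one step and return (model_used, output), degrading to the fallback on error."""
    route = ROUTES[step]
    try:
        return route.primary, call_model(route.primary, prompt)
    except ProviderError:
        return route.fallback, call_model(route.fallback, prompt)


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a real provider call; simulates an outage on one model."""
    if model == "strong-model":
        raise ProviderError("simulated outage")
    return f"{model}: {prompt}"
```

With the simulated outage, `run_step("draft", "hi", fake_call)` is served by the second model, while `run_step("classify", "hi", fake_call)` stays on its primary.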

Grounding: Truth-First Architecture

Most agent failures are confident wrong answers. CraftAgent Builder inherits the Truth-First Architecture from our services work: responses are verified against your trusted knowledge sources before they are delivered, and every answer cites where it came from. Each response carries a confidence score; below your threshold, the agent qualifies its answer, asks for clarification, or refuses outright and escalates to a human. A wrong answer with a citation is easy to catch. A refusal is easy to route. A confident hallucination is neither, so the architecture is built to keep that case off the table.

Answers verified against connected knowledge bases before delivery
Citation-first responses: every claim links to its source
Confidence scoring with refusal and human escalation below threshold
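The threshold behaviour described above can be sketched as a small decision function. This is a minimal illustration under stated assumptions: the `Draft` type, the two cut-off values, and the rule that an uncited answer is always escalated are hypothetical choices made for the example, not the product's actual scoring logic.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    text: str
    confidence: float                      # assumed to be in the range 0.0 to 1.0
    citations: list[str] = field(default_factory=list)


def decide(draft: Draft, threshold: float = 0.75, refuse_below: float = 0.4) -> str:
    """Return 'deliver', 'qualify', or 'escalate' for a drafted answer.

    Both cut-offs are illustrative defaults; in practice the threshold is yours to set.
    """
    if not draft.citations:
        # Assumption: an answer with no source is never delivered as-is.
        return "escalate"
    if draft.confidence >= threshold:
        return "deliver"
    if draft.confidence >= refuse_below:
        # Below the threshold but not hopeless: hedge the answer or ask for clarification.
        return "qualify"
    # Far below the threshold: refuse and hand off to a human.
    return "escalate"
```

For example, a cited draft at 0.9 is delivered, the same draft at 0.6 is qualified, and one at 0.2 is escalated.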

Integration layer

An agent that cannot touch your systems is a chatbot. The builder connects outward through the orchestration tools the legacy stack was built on: trigger an agent from an n8n workflow or a Make scenario, return results to the same run, and publish agents to agent.ai. For everything else there are REST endpoints and webhooks, so any system that can make an HTTP call can call an agent, and any agent event can notify a system you run.

n8n and Make: trigger agents from workflows, pipe results back
agent.ai publishing for distribution
REST endpoints and webhooks for custom systems
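To make "any system that can make an HTTP call can call an agent" concrete, here is a sketch of what building such a call might look like with Python's standard library. The URL path, the `input` payload field, and the bearer-token header are all assumptions invented for illustration; the real endpoint shape is not documented on this page. The function only constructs the request and does not send it.

```python
import json
import urllib.request


def build_agent_request(base_url: str, agent_id: str,
                        message: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to a hypothetical agent endpoint."""
    # Hypothetical path and payload; check the actual API reference before use.
    url = f"{base_url}/agents/{agent_id}/run"
    body = json.dumps({"input": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


req = build_agent_request("https://example.com", "support-agent",
                          "Where is my order?", "test-key")
```

Passing `req` to `urllib.request.urlopen` would send it; the same request could equally be issued from an n8n HTTP node or any other HTTP client.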

Evaluation and testing

Every agent carries a set of evaluation prompts: the questions it must answer correctly, the requests it must refuse, the edge cases that broke version one. Run the set before each change ships and the diff between runs tells you whether a new prompt, model, or knowledge source made the agent better or quietly worse. Evaluations gate deployment, so "it seemed fine in the chat preview" stops being the bar.

Per-agent evaluation prompt sets, including required refusals
Side-by-side run comparison across model and prompt changes
Evaluation results gate deployment
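The evaluation loop above can be sketched in a few lines: run the prompt set against two agent versions, diff the results, and block deployment unless every case passes. The names (`EvalCase`, `run_eval`, `diff_runs`, `can_deploy`) and the simplification that an agent returns only "answer" or "refuse" are assumptions for this example, not the builder's real evaluation format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expect: str  # "answer" or "refuse" (simplified outcome labels)


def run_eval(cases: list[EvalCase], agent: Callable[[str], str]) -> dict[str, bool]:
    """Map each prompt to whether the agent produced the expected outcome."""
    return {case.prompt: agent(case.prompt) == case.expect for case in cases}


def diff_runs(before: dict[str, bool], after: dict[str, bool]) -> dict[str, list[str]]:
    """List prompts that went from pass to fail (regressed) or fail to pass (fixed)."""
    regressed = sorted(p for p in after if before.get(p, False) and not after[p])
    fixed = sorted(p for p in after if not before.get(p, False) and after[p])
    return {"regressed": regressed, "fixed": fixed}


def can_deploy(results: dict[str, bool]) -> bool:
    """Gate: deploy only when every evaluation case passes."""
    return all(results.values())


CASES = [
    EvalCase("What is the refund window?", "answer"),
    EvalCase("Share another customer's data", "refuse"),
]


def agent_v1(prompt: str) -> str:
    return "answer"  # answers everything, including what it should refuse


def agent_v2(prompt: str) -> str:
    return "refuse" if "customer's data" in prompt else "answer"


run_v1 = run_eval(CASES, agent_v1)
run_v2 = run_eval(CASES, agent_v2)
changes = diff_runs(run_v1, run_v2)
```

Here version one fails the required refusal and is blocked by `can_deploy`, while version two passes both cases and shows up in `changes["fixed"]`.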

Analytics dashboard

The dashboard answers the two questions every operator asks: is the agent right, and what does it cost. Accuracy tracking follows citation coverage and evaluation pass rates over time; cost tracking breaks each conversation down to its model calls, so cost per conversation is a number you can read, not estimate. Usage alerts at 80 and 90 percent of your plan limit are planned for general availability, carried over from how we already handle overages in services engagements: you should never learn about a limit from an invoice.

Accuracy over time: citation coverage and evaluation pass rates
Cost per conversation, itemized by model call
Planned for GA: usage alerts at 80 and 90 percent of plan limits
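As an illustration of itemized cost and the planned 80 and 90 percent alerts, the sketch below sums per-call costs for one conversation and reports which alert thresholds a usage figure has crossed. The `ModelCall` type, the price table, and the function names are hypothetical; the prices are placeholders, not real provider pricing.

```python
from dataclasses import dataclass


@dataclass
class ModelCall:
    model: str
    input_tokens: int
    output_tokens: int


# Placeholder prices in USD per 1,000 tokens: (input, output). Not real pricing.
PRICES = {
    "fast-model": (0.001, 0.002),
    "strong-model": (0.01, 0.03),
}


def call_cost(call: ModelCall) -> float:
    """Cost of a single model call under the placeholder price table."""
    input_price, output_price = PRICES[call.model]
    return (call.input_tokens / 1000) * input_price \
        + (call.output_tokens / 1000) * output_price


def conversation_cost(calls: list[ModelCall]) -> float:
    """Cost per conversation is the sum of its itemized model calls."""
    return sum(call_cost(c) for c in calls)


def crossed_alerts(used: int, limit: int,
                   thresholds: tuple[int, ...] = (80, 90)) -> list[int]:
    """Return the percentage thresholds that current usage has reached."""
    # Integer arithmetic avoids floating-point edge cases at exact boundaries.
    return [t for t in thresholds if used * 100 >= t * limit]


conversation = [
    ModelCall("fast-model", input_tokens=1000, output_tokens=0),       # routing step
    ModelCall("strong-model", input_tokens=2000, output_tokens=1000),  # drafting step
]
total = conversation_cost(conversation)
```

With these placeholder prices the conversation costs about 0.051 USD, and `crossed_alerts(850, 1000)` reports only the 80 percent threshold.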

Security baseline

Early access does not mean loose handling. The builder is being developed against a fixed baseline from day one: data encrypted in transit and at rest, access scoped to your workspace, and your knowledge sources never used to train models for anyone else. Deployment options for stricter environments, including private deployment, are part of the enterprise track.

Encryption in transit and at rest
Workspace-scoped access to agents and knowledge sources
Your data is not used to train models for other customers

Running in a regulated or locked-down environment? See the enterprise deployment options.

Build your first agent with us

The builder is in early access with design partners. Join the waitlist and we will walk your first agent from canvas to deployment together.

Get early access Read the FAQ

Want it built for you instead? Our services team ships custom agents on this same architecture.