Features
What is inside CraftAgent Builder
The model layer, grounding architecture, integrations, evaluation tooling, and analytics that turn a visual agent design into something you can trust in production.
01
Model layer
Agents run on models from OpenAI and Anthropic, and each step in an agent picks the model that fits it. A fast, cheap model can classify and route; a stronger model drafts and reasons. Routing rules are yours to set, and fallbacks mean a provider outage degrades to a second model instead of taking the agent down.
- OpenAI and Anthropic models behind one interface
- Per-step model selection and routing rules
- Automatic fallback to a second model on provider errors
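To make per-step routing and fallback concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the builder's actual interface: the `Route` type, the `ROUTES` table, the `ProviderError` exception, the model names, and the `call_model` callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


class ProviderError(Exception):
    """Hypothetical error a model call raises when the provider is unavailable."""


@dataclass(frozen=True)
class Route:
    primary: str   # model used for the step under normal conditions
    fallback: str  # model tried if the primary raises ProviderError


# Hypothetical routing table: step name -> models. The names are placeholders.
ROUTES = {
    "classify": Route(primary="fast-model-a", fallback="fast-model-b"),
    "draft": Route(primary="strong-model-a", fallback="strong-model-b"),
}


def run_step(step: str, prompt: str,
             call_model: Callable[[str, str], str]) -> tuple[str, str]:
    """Run one agent step and return (model_used, output).

    The primary model is tried first; a ProviderError degrades the step
    to the fallback model instead of failing the whole agent.
    """
    route = ROUTES[step]
    try:
        return route.primary, call_model(route.primary, prompt)
    except ProviderError:
        return route.fallback, call_model(route.fallback, prompt)
```

In this sketch the fallback is tried once; if it also fails, the error propagates to the caller, which is one reasonable place to alert a human.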
02
Grounding: Truth-First Architecture
Most agent failures are confident wrong answers. CraftAgent Builder inherits the Truth-First Architecture from our services work: responses are verified against your trusted knowledge sources before they are delivered, and every answer cites where it came from. Each response carries a confidence score; below your threshold, the agent qualifies its answer, asks for clarification, or refuses outright and escalates to a human. A wrong answer with a citation is easy to catch. A refusal is easy to route. A confident hallucination is neither, so the architecture is built to keep that case off the table.
- Answers verified against connected knowledge bases before delivery
- Citation-first responses: every claim links to its source
- Confidence scoring with refusal and human escalation below threshold
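The threshold behavior can be sketched as a small decision function. This is an illustration only: the `Draft` type, the `Action` names, the default `threshold` of 0.8, and the second `floor` cutoff that separates a qualified answer from an escalation are assumptions, not documented settings.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"    # send the answer with its citations
    QUALIFY = "qualify"    # send the answer with an explicit caveat
    ESCALATE = "escalate"  # refuse and hand off to a human


@dataclass
class Draft:
    text: str
    confidence: float  # assumed to be in the range 0.0 to 1.0
    citations: list[str] = field(default_factory=list)


def decide(draft: Draft, threshold: float = 0.8, floor: float = 0.5) -> Action:
    """Pick an action for a drafted answer.

    Assumption: an answer with no citation is never delivered, whatever
    its confidence, because it cannot be checked against a source.
    """
    if not draft.citations:
        return Action.ESCALATE
    if draft.confidence >= threshold:
        return Action.DELIVER
    if draft.confidence >= floor:
        return Action.QUALIFY
    return Action.ESCALATE
```

Under these assumed cutoffs, a cited draft at 0.9 confidence is delivered, one at 0.6 is qualified, and one at 0.3 is escalated.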
03
Integration layer
An agent that cannot touch your systems is a chatbot. The builder connects outward through the orchestration tools the legacy stack was built on: trigger an agent from an n8n workflow or a Make scenario, return results to the same run, and publish agents to agent.ai. For everything else there are REST endpoints and webhooks, so any system that can make an HTTP call can call an agent, and any agent event can notify a system you run.
- n8n and Make: trigger agents from workflows, pipe results back
- agent.ai publishing for distribution
- REST endpoints and webhooks for custom systems
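As a rough picture of what "any system that can make an HTTP call" means, the sketch below builds a request with Python's standard library. The URL path (`/agents/{agent_id}/runs`), the JSON body shape, and the bearer-token header are assumptions made for illustration; the real endpoint contract is not described on this page.

```python
import json
import urllib.request


def build_agent_request(base_url: str, agent_id: str,
                        user_input: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would trigger an agent.

    Hypothetical contract: POST {base_url}/agents/{agent_id}/runs with a
    JSON body of {"input": ...} and a bearer token for authentication.
    """
    body = json.dumps({"input": user_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/agents/{agent_id}/runs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately from sending it keeps the sketch testable without a network.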
04
Evaluation and testing
Every agent carries a set of evaluation prompts: the questions it must answer correctly, the requests it must refuse, the edge cases that broke version one. Run the set before each change ships and the diff between runs tells you whether a new prompt, model, or knowledge source made the agent better or quietly worse. Evaluations gate deployment, so "it seemed fine in the chat preview" stops being the bar.
- Per-agent evaluation prompt sets, including required refusals
- Side-by-side run comparison across model and prompt changes
- Evaluation results gate deployment
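The gate described above can be sketched in a few functions. Everything named here is hypothetical: the `EvalCase` fields, the `REFUSAL` sentinel an agent returns when it declines, and the substring check used as a stand-in for "answered correctly".

```python
from dataclasses import dataclass
from typing import Callable

REFUSAL = "REFUSED"  # hypothetical sentinel returned when the agent declines


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    must_refuse: bool = False  # True for requests the agent must decline
    must_contain: str = ""     # text a correct answer is expected to include


def passes(case: EvalCase, answer: str) -> bool:
    """Check one answer against its case."""
    if case.must_refuse:
        return answer == REFUSAL
    return answer != REFUSAL and case.must_contain.lower() in answer.lower()


def run_suite(cases: list[EvalCase], agent: Callable[[str], str]) -> dict[str, bool]:
    """Run every case through the agent and record pass/fail per prompt."""
    return {case.prompt: passes(case, agent(case.prompt)) for case in cases}


def regressions(before: dict[str, bool], after: dict[str, bool]) -> list[str]:
    """Prompts that passed in the earlier run and fail in the later one."""
    return sorted(p for p, ok in before.items() if ok and not after.get(p, False))


def may_deploy(results: dict[str, bool]) -> bool:
    """Assumed gate: deploy only when every case passes."""
    return all(results.values())
```

Comparing `run_suite` output from the current version and a candidate with `regressions` surfaces the "quietly worse" cases, and `may_deploy` blocks the release while any case fails. A stricter or looser gate, such as a minimum pass rate, would be an equally valid choice.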
05
Analytics dashboard
The dashboard answers the two questions every operator asks: is the agent right, and what does it cost. Accuracy tracking follows citation coverage and evaluation pass rates over time; cost tracking breaks each conversation down to its model calls, so cost per conversation is a number you can read, not estimate. Usage alerts at 80 and 90 percent of your plan limit are planned for general availability, carried over from how we already handle overages in services engagements: you should never learn about a limit from an invoice.
- Accuracy over time: citation coverage and evaluation pass rates
- Cost per conversation, itemized by model call
- Planned for GA: usage alerts at 80 and 90 percent of plan limits
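The arithmetic behind cost per conversation and the 80 and 90 percent alerts is simple enough to sketch. The model names, the per-token prices, and the token-based usage unit below are placeholders chosen for illustration, not real rates or plan terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCall:
    model: str
    input_tokens: int
    output_tokens: int


# Hypothetical prices in USD per 1,000 tokens: (input, output).
PRICES = {
    "fast-model": (0.001, 0.002),
    "strong-model": (0.01, 0.03),
}

ALERT_PERCENTS = (80, 90)  # alert levels named in the text above


def call_cost(call: ModelCall) -> float:
    """Cost of one model call at the assumed prices."""
    price_in, price_out = PRICES[call.model]
    return call.input_tokens / 1000 * price_in + call.output_tokens / 1000 * price_out


def conversation_cost(calls: list[ModelCall]) -> float:
    """Cost of a conversation is the sum of its model calls."""
    return sum(call_cost(call) for call in calls)


def crossed_alerts(used_before: int, used_after: int, limit: int) -> list[int]:
    """Alert levels (in percent) crossed when usage moves from before to after.

    Integer comparison avoids floating-point edge cases at the boundary.
    """
    return [pct for pct in ALERT_PERCENTS
            if used_before * 100 < pct * limit <= used_after * 100]
```

With a plan limit of 1,000 units, moving from 790 to 810 units used crosses only the 80 percent level, while a jump from 790 to 950 crosses both levels at once.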
06
Security baseline
Early access does not mean loose handling. The builder is being developed against a fixed baseline from day one: data encrypted in transit and at rest, access scoped to your workspace, and your knowledge sources never used to train models for anyone else. Deployment options for stricter environments, including private deployment, are part of the enterprise track.
- Encryption in transit and at rest
- Workspace-scoped access to agents and knowledge sources
- Your data is not used to train models for other customers
Running in a regulated or locked-down environment? See the enterprise deployment options.
Build your first agent with us
The builder is in early access with design partners. Join the waitlist and we will walk your first agent from canvas to deployment together.
Want it built for you instead? Our services team ships custom agents on this same architecture.