Behavior Trees: The Engine Behind Goal-Directed Agents
Under every TIATON agent is a behavior tree. Not a flowchart. Not a state machine. A behavior tree — the same data structure that game developers use to make NPCs decide whether to attack, flee, or patrol.
It turns out this model is remarkably well-suited for business process orchestration.
What Is a Behavior Tree?
A behavior tree is a rooted tree (a restricted form of directed acyclic graph) in which each node, when ticked, returns one of three statuses:
- SUCCESS — the action completed
- FAILURE — the action failed
- RUNNING — the action is in progress (async)
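Nodes compose through two main composite types:
Sequence: runs children left to right and stops at the first child that returns FAILURE (a RUNNING child also pauses it). It returns SUCCESS only when every child succeeds.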
Sequence
├── ValidateInput → SUCCESS ✓
├── CheckCredit → SUCCESS ✓
├── EvaluateRisk → FAILURE ✗ ← stops here
└── MakeDecision (not reached)
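Fallback: runs children left to right and stops at the first child that returns SUCCESS (a RUNNING child also pauses it). It returns FAILURE only when every child fails.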
Fallback
├── TryPrimaryRoute → FAILURE ✗
├── TryBackupRoute → SUCCESS ✓ ← stops here
└── ManualEscalation (not reached)
These two composites give you if/else, try/catch, and sequential execution — all in a composable, declarative structure.
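To make the two composites concrete, here is a minimal sketch of their tick logic. The `Node` interface, the `Leaf` type, and the `Tick` method are hypothetical names chosen for illustration; they are not TIATON's API, which takes a context and a view.

```go
package main

import "fmt"

// Status mirrors the three statuses listed above.
type Status int

const (
	Success Status = iota
	Failure
	Running
)

// Node is a hypothetical minimal interface for this sketch.
type Node interface {
	Tick() Status
}

// Leaf is a hypothetical action that returns a fixed status and records that it ran.
type Leaf struct {
	Name   string
	Result Status
	Ran    bool
}

func (l *Leaf) Tick() Status {
	l.Ran = true
	return l.Result
}

// Sequence returns the first non-SUCCESS child status, or SUCCESS if all succeed.
type Sequence struct{ Children []Node }

func (s Sequence) Tick() Status {
	for _, c := range s.Children {
		if st := c.Tick(); st != Success {
			return st
		}
	}
	return Success
}

// Fallback returns the first non-FAILURE child status, or FAILURE if all fail.
type Fallback struct{ Children []Node }

func (f Fallback) Tick() Status {
	for _, c := range f.Children {
		if st := c.Tick(); st != Failure {
			return st
		}
	}
	return Failure
}

func main() {
	decide := &Leaf{Name: "MakeDecision", Result: Success}
	seq := Sequence{Children: []Node{
		&Leaf{Name: "ValidateInput", Result: Success},
		&Leaf{Name: "CheckCredit", Result: Success},
		&Leaf{Name: "EvaluateRisk", Result: Failure},
		decide,
	}}
	seqStatus := seq.Tick()
	fmt.Println(seqStatus == Failure, decide.Ran) // true false

	escalate := &Leaf{Name: "ManualEscalation", Result: Success}
	fb := Fallback{Children: []Node{
		&Leaf{Name: "TryPrimaryRoute", Result: Failure},
		&Leaf{Name: "TryBackupRoute", Result: Success},
		escalate,
	}}
	fbStatus := fb.Tick()
	fmt.Println(fbStatus == Success, escalate.Ran) // true false
}
```

The two loops differ only in which status lets them continue, which is why the pair covers both "do all of these" and "try these in order".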
Why Not State Machines?
State machines are the traditional choice for workflow engines. But they have a scaling problem:
States: [idle, validating, credit_check, risk_eval, deciding, approved, rejected, error]
Transitions: idle→validating, validating→credit_check, validating→error,
credit_check→risk_eval, credit_check→error, risk_eval→deciding,
risk_eval→error, deciding→approved, deciding→rejected, deciding→error
That's 10 transitions for 8 states. Now add retry logic, compensation, parallel execution, and timeout handling:
States: [idle, validating, credit_check, credit_check_retry,
risk_eval, deciding, approved, rejected, error,
compensating, compensating_credit, compensating_inventory,
waiting_approval, approval_timeout, ...]
Transitions: 47 transitions and counting...
In the worst case, the number of transitions grows quadratically with the number of states: n states allow up to n(n−1) directed transitions. State machines become hard to manage for complex workflows.
Behavior trees grow linearly. Adding a new step means adding one node. The composition handles the control flow.
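As an illustration of "one new node", the sketch below adds retry logic to a single step by wrapping it in a decorator. `Retry`, `Action`, and the slice-based `Sequence` are hypothetical types written for this example, and retrying within a single tick is an assumption; TIATON may handle retries differently.

```go
package main

import "fmt"

type Status int

const (
	Success Status = iota
	Failure
	Running
)

// Node is a hypothetical minimal interface for this sketch.
type Node interface{ Tick() Status }

// Action adapts a plain function to Node.
type Action func() Status

func (a Action) Tick() Status { return a() }

// Sequence stops at the first child that does not succeed.
type Sequence []Node

func (s Sequence) Tick() Status {
	for _, c := range s {
		if st := c.Tick(); st != Success {
			return st
		}
	}
	return Success
}

// Retry is a hypothetical decorator: it re-ticks its child up to Max times
// while the child keeps failing, and returns the last status it saw.
type Retry struct {
	Child Node
	Max   int
}

func (r Retry) Tick() Status {
	st := Failure
	for i := 0; i < r.Max; i++ {
		if st = r.Child.Tick(); st != Failure {
			return st
		}
	}
	return st
}

func main() {
	calls := 0
	flakyCredit := Action(func() Status {
		calls++
		if calls < 3 {
			return Failure // fails twice, then succeeds
		}
		return Success
	})
	ok := Action(func() Status { return Success })

	// Without retry this would be Sequence{ok, flakyCredit, ok}.
	// Only the credit step changes; the rest of the tree is untouched.
	tree := Sequence{ok, Retry{Child: flakyCredit, Max: 3}, ok}
	st := tree.Tick()
	fmt.Println(st == Success, calls) // true 3
}
```

The equivalent change in the state machine above needed a new `credit_check_retry` state plus transitions into and out of it.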
TIATON's BT Implementation
The internal/bt package implements behavior trees in Go with extensions for persistence:
// Core types
type Status int
const (
	Success Status = iota
	Failure
	Running
)
// Node interface
type Executor interface {
	Execute(ctx context.Context, view *View) Status
}
// Composite nodes
// (Node, View, and Policy are declared elsewhere in the package and are not shown here.)
func Sequence(id string, children ...Node) Node
func Fallback(id string, children ...Node) Node
func ParallelWithMemory(id string, policy Policy, children ...Node) Node
The key extension is ParallelWithMemory, a parallel node that remembers which children have already completed across ticks (a tick is one evaluation pass over the tree). This is what enables async skill execution:
ParallelWithMemory (RequireAll)
├── Skill: check_credit → SUCCESS (completed tick 2)
├── Skill: verify_identity → RUNNING (waiting for external service)
└── Skill: check_sanctions → SUCCESS (completed tick 1)
Tree returns: RUNNING (one child still in progress)
Next tick, when the external service responds:
ParallelWithMemory (RequireAll)
├── Skill: check_credit → (remembered SUCCESS)
├── Skill: verify_identity → SUCCESS (event received)
└── Skill: check_sanctions → (remembered SUCCESS)
Tree returns: SUCCESS (all children done)
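The two ticks above can be sketched as follows. `ParallelAll` and `Skill` are hypothetical stand-ins for `ParallelWithMemory` with a RequireAll policy, and failing fast on the first FAILURE is an assumption rather than documented behavior.

```go
package main

import "fmt"

type Status int

const (
	Success Status = iota
	Failure
	Running
)

// Node is a hypothetical minimal interface for this sketch.
type Node interface{ Tick() Status }

// Skill is a hypothetical leaf whose State is updated when an external event arrives.
type Skill struct {
	Name  string
	State Status
	Ticks int // how many times the skill was actually ticked
}

func (s *Skill) Tick() Status {
	s.Ticks++
	return s.State
}

// ParallelAll sketches a RequireAll parallel node with memory.
type ParallelAll struct {
	Children []Node
	done     map[int]bool // indexes of children that already returned SUCCESS
}

func (p *ParallelAll) Tick() Status {
	if p.done == nil {
		p.done = make(map[int]bool)
	}
	result := Success
	for i, c := range p.Children {
		if p.done[i] {
			continue // remembered SUCCESS: do not run the child again
		}
		switch c.Tick() {
		case Success:
			p.done[i] = true
		case Failure:
			return Failure // assumption: RequireAll fails fast
		case Running:
			result = Running
		}
	}
	return result
}

func main() {
	credit := &Skill{Name: "check_credit", State: Running}
	identity := &Skill{Name: "verify_identity", State: Running}
	sanctions := &Skill{Name: "check_sanctions", State: Success}
	p := &ParallelAll{Children: []Node{credit, identity, sanctions}}

	first := p.Tick() // check_sanctions completes
	credit.State = Success
	second := p.Tick() // check_credit completes
	identity.State = Success
	third := p.Tick() // verify_identity completes

	fmt.Println(first == Running, second == Running, third == Success) // true true true
	fmt.Println(sanctions.Ticks, credit.Ticks, identity.Ticks)         // 1 2 3
}
```

The tick counters show the point of the memory: a completed skill is never executed a second time, so an external call is not repeated just because a sibling is still waiting.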
Session Persistence
The entire tree state serializes to JSON:
{
  "indexes": {
    "n1": { "status": "success" },
    "n2": { "status": "success" },
    "n3": { "status": "running", "composite_index": 1 },
    "n4": { "status": "success" },
    "n5": { "status": "running" }
  },
  "jobs": [
    {
      "job_id": "job_44f1",
      "node_id": "n5",
      "job_type": "identity.v1.VerificationService/Verify"
    }
  ]
}
Node IDs are deterministic — the same tree definition always produces the same IDs (n1, n2, n3...). This means you can:
- Serialize the session on server A
- Rebuild the tree on server B
- Map the saved state onto the rebuilt nodes, because the node IDs match
- Resume execution seamlessly
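A round trip of that snapshot might look like the sketch below. The `Session`, `NodeState`, and `Job` structs are inferred from the JSON above, and `Restore` with its ID check is a hypothetical helper, not a function from the TIATON codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NodeState, Job, and Session mirror the JSON snapshot shown above.
type NodeState struct {
	Status         string `json:"status"`
	CompositeIndex *int   `json:"composite_index,omitempty"`
}

type Job struct {
	JobID   string `json:"job_id"`
	NodeID  string `json:"node_id"`
	JobType string `json:"job_type"`
}

type Session struct {
	Indexes map[string]NodeState `json:"indexes"`
	Jobs    []Job                `json:"jobs"`
}

// Restore decodes a snapshot and checks that every stored node ID
// exists in the tree that was rebuilt on this server.
func Restore(data []byte, rebuiltIDs []string) (Session, error) {
	var s Session
	if err := json.Unmarshal(data, &s); err != nil {
		return Session{}, err
	}
	known := make(map[string]bool, len(rebuiltIDs))
	for _, id := range rebuiltIDs {
		known[id] = true
	}
	for id := range s.Indexes {
		if !known[id] {
			return Session{}, fmt.Errorf("snapshot node %q not found in rebuilt tree", id)
		}
	}
	return s, nil
}

func main() {
	// "Server A": serialize the session.
	one := 1
	snapshot := Session{
		Indexes: map[string]NodeState{
			"n3": {Status: "running", CompositeIndex: &one},
			"n5": {Status: "running"},
		},
		Jobs: []Job{{JobID: "job_44f1", NodeID: "n5", JobType: "identity.v1.VerificationService/Verify"}},
	}
	data, err := json.Marshal(snapshot)
	if err != nil {
		panic(err)
	}

	// "Server B": rebuild the tree, then map the state back onto it.
	restored, err := Restore(data, []string{"n1", "n2", "n3", "n4", "n5"})
	if err != nil {
		panic(err)
	}
	fmt.Println(restored.Indexes["n5"].Status, restored.Jobs[0].NodeID) // running n5
}
```

Checking the IDs on restore is a cheap guard: if the tree definition changed between the two servers, the mismatch surfaces as an error instead of state being silently attached to the wrong node.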
The Agent Layer
The internal/bt/agent package builds goal-directed behavior on top of the behavior tree:
Agent Tick:
1. Evaluate predicates (DMN + Starlark)
2. Check which goals are unmet
3. Find eligible skills (requires met, ensures not met)
4. Calculate goal affinity (breadth-first search backward from the goals)
5. Select best skill (or parallel set)
6. Build behavior tree nodes
7. Execute one tick of the tree
8. Process commands (job submissions, timer schedules)
9. Record audit data
The agent doesn't hardcode the execution order. It re-evaluates the situation each tick and selects the best action based on current state.
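Steps 2 to 5 can be sketched with plain boolean predicates. Everything here is hypothetical: the `Skill` struct, the skill and predicate names, and the choice to score affinity as the backward breadth-first distance to a goal. The real agent evaluates predicates with DMN and Starlark and may rank skills differently.

```go
package main

import "fmt"

// Skill is a hypothetical skill description: it may run when all Requires
// predicates hold, and it makes its Ensures predicates true.
type Skill struct {
	Name     string
	Requires []string
	Ensures  []string
}

func contains(list []string, x string) bool {
	for _, v := range list {
		if v == x {
			return true
		}
	}
	return false
}

func allTrue(preds []string, facts map[string]bool) bool {
	for _, p := range preds {
		if !facts[p] {
			return false
		}
	}
	return true
}

// goalDistance runs a breadth-first search backward from the goals:
// a predicate's distance is the number of skills between it and a goal.
func goalDistance(skills []Skill, goals []string) map[string]int {
	dist := map[string]int{}
	var queue []string
	for _, g := range goals {
		if _, seen := dist[g]; !seen {
			dist[g] = 0
			queue = append(queue, g)
		}
	}
	for len(queue) > 0 {
		p := queue[0]
		queue = queue[1:]
		for _, s := range skills {
			if !contains(s.Ensures, p) {
				continue
			}
			for _, r := range s.Requires {
				if _, seen := dist[r]; !seen {
					dist[r] = dist[p] + 1
					queue = append(queue, r)
				}
			}
		}
	}
	return dist
}

// selectSkill picks the eligible skill whose effects are closest to a goal.
func selectSkill(skills []Skill, facts map[string]bool, goals []string) (Skill, bool) {
	dist := goalDistance(skills, goals)
	var best Skill
	bestDist, found := 0, false
	for _, s := range skills {
		// Eligible: requires met, ensures not yet met.
		if !allTrue(s.Requires, facts) || allTrue(s.Ensures, facts) {
			continue
		}
		for _, e := range s.Ensures {
			if d, ok := dist[e]; ok && (!found || d < bestDist) {
				best, bestDist, found = s, d, true
			}
		}
	}
	return best, found
}

func main() {
	skills := []Skill{
		{Name: "send_newsletter", Requires: []string{"input_present"}, Ensures: []string{"newsletter_sent"}},
		{Name: "validate", Requires: []string{"input_present"}, Ensures: []string{"input_valid"}},
		{Name: "check_credit", Requires: []string{"input_valid"}, Ensures: []string{"credit_checked"}},
		{Name: "decide", Requires: []string{"credit_checked"}, Ensures: []string{"decision_made"}},
	}
	facts := map[string]bool{"input_present": true}
	goals := []string{"decision_made"}
	for !allTrue(goals, facts) {
		s, ok := selectSkill(skills, facts, goals)
		if !ok {
			break
		}
		fmt.Println("run:", s.Name)
		for _, e := range s.Ensures {
			facts[e] = true // pretend the skill succeeded
		}
	}
}
```

Run as written, the loop selects `validate`, then `check_credit`, then `decide`. The `send_newsletter` skill is eligible throughout but never chosen, because nothing it ensures lies on a path to the goal.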
Deterministic Node IDs
This is subtle but critical. When you serialize a session and later rebuild the tree, the node IDs must match:
type Controller struct {
	nodeSeq int // deterministic counter
}
func (c *Controller) nextNodeID() string {
	c.nodeSeq++
	return fmt.Sprintf("n%d", c.nodeSeq)
}
The same Build() call order yields the same IDs every time, provided both processes build from the same tree definition. This enables cross-process session resume without storing the tree structure: only the state snapshot is persisted.
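A small sketch shows why call order is enough. The `Controller` counter is copied from the snippet above; the `Def` type and this recursive `Build` method are hypothetical, written only to walk a definition in a fixed order.

```go
package main

import (
	"fmt"
	"strings"
)

// Controller reproduces the counter from the snippet above.
type Controller struct {
	nodeSeq int
}

func (c *Controller) nextNodeID() string {
	c.nodeSeq++
	return fmt.Sprintf("n%d", c.nodeSeq)
}

// Def is a hypothetical tree definition used only for this sketch.
type Def struct {
	Name     string
	Children []Def
}

// Build walks the definition depth-first and assigns IDs in visit order.
// It returns "id=name" pairs so that two builds can be compared.
func (c *Controller) Build(d Def) []string {
	out := []string{c.nextNodeID() + "=" + d.Name}
	for _, child := range d.Children {
		out = append(out, c.Build(child)...)
	}
	return out
}

func main() {
	def := Def{Name: "Sequence", Children: []Def{
		{Name: "ValidateInput"},
		{Name: "CheckCredit"},
	}}
	a := (&Controller{}).Build(def) // process A
	b := (&Controller{}).Build(def) // process B

	fmt.Println(strings.Join(a, " "))                         // n1=Sequence n2=ValidateInput n3=CheckCredit
	fmt.Println(strings.Join(a, " ") == strings.Join(b, " ")) // true
}
```

The flip side is that inserting a node into the definition shifts every ID after it, so a snapshot taken against the old definition would no longer line up with a tree built from the new one.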
Why This Matters
| Property | State Machine | Behavior Tree |
|---|---|---|
| Complexity growth | Up to O(n²) transitions | O(n) nodes |
| Parallel execution | Explicit fork/join | ParallelWithMemory |
| Error handling | Per-state handlers | Fallback composition |
| Async resume | Checkpoint + restore | Native RUNNING state |
| Reusability | States are unique | Subtrees are reusable |
| Determinism | Depends on implementation | Deterministic node IDs by construction |
Behavior trees give TIATON a foundation that scales with workflow complexity while remaining predictable and debuggable. Every decision the agent makes traces back to the tree structure, the current state, and the predicate evaluations — all visible in the session trace.