AI Under Rule Control: The 80/20 Problem
Every AI pilot starts the same way. The demo is impressive. The prototype works. Leadership is excited. Then someone asks: "What's the accuracy?"
"About 80%."
"And the other 20%?"
Silence.
The Production Gap
80% accuracy sounds good until you do the math. If your system processes 10,000 decisions per day:
- 8,000 correct decisions
- 2,000 wrong decisions every single day
For loan approvals, that's 2,000 people getting wrong answers. For compliance checks, that's 2,000 potential violations. For fraud detection, that's 2,000 missed or false alerts.
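A quick helper makes the scale concrete (plain arithmetic, nothing TIATON-specific):

def daily_errors(accuracy, volume):
    """Expected wrong decisions per day at a given accuracy."""
    return round(volume * (1 - accuracy))

daily_errors(0.80, 10000)  # 2,000 wrong decisions per day
daily_errors(0.95, 10000)  # still 500 per day
daily_errors(0.99, 10000)  # still 100 per day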
No amount of prompt engineering fixes this. LLMs are probabilistic. They're brilliant at understanding context, extracting information, and generating text. They're terrible at consistent, deterministic decisions.
The Split: AI Extracts, Rules Decide
The solution isn't to remove AI. It's to use AI where it excels and rules where consistency matters.
┌─────────────────────────────────────────────┐
│               Input Document                │
│  "The applicant, John Doe, age 34, works    │
│   at Acme Corp earning $85,000/year..."     │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│             AI Extraction Layer             │
│  LLM extracts structured data:              │
│  { name: "John Doe", age: 34,               │
│    employer: "Acme Corp", income: 85000 }   │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│          Rule-Based Decision Layer          │
│  DMN tables evaluate:                       │
│  - Eligibility: age ≥ 18 ✓, income > 30k ✓  │
│  - Risk scoring: income/loan ratio = 1.7    │
│  - Decision: APPROVE (rule 5 matched)       │
└─────────────────────────────────────────────┘
AI handles the fuzzy part (understanding unstructured text). Rules handle the precise part (making the decision). The decision is always explainable because it maps to a specific rule in a versioned table.
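What does such a table look like? Here is a hypothetical sketch of the loan_eligibility table from the diagram, written out as plain data. The rule numbers, thresholds, and FIRST hit policy are invented for illustration, not taken from a real ruleset:

# Hypothetical loan_eligibility decision table, hit policy FIRST:
# the first matching rule wins, and its number is recorded in the trace.
LOAN_ELIGIBILITY = [
    # rule   age       income       income/loan   decision
    (1,     "< 18",    "-",         "-",          "REJECT"),
    (2,     ">= 18",   "<= 30000",  "-",          "REJECT"),
    (3,     ">= 18",   "> 30000",   "< 1.0",      "REJECT"),
    (4,     ">= 18",   "> 30000",   "[1.0..1.5)", "MANUAL_REVIEW"),
    (5,     ">= 18",   "> 30000",   ">= 1.5",     "APPROVE"),
]

With inputs age = 34, income = 85000, and ratio = 1.7, rules 1 through 4 fail and rule 5 matches: exactly the "APPROVE (rule 5 matched)" result shown in the diagram.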
Implementation Pattern
In TIATON, this looks like:
load("//proto", "new")
def extract_application_data(ctx, state):
"""Use LLM to extract structured data from document."""
payload = new("ai.v1.ExtractionRequest", {
"document": state["raw_document"],
"schema": "lending.v1.ApplicationData",
"instructions": "Extract applicant details from this loan application.",
})
return RUNNING, submit_job("ai.v1.ExtractionService/Extract", payload)
def on_extraction_complete(ctx, state):
    """Store extracted data, proceed to rule-based evaluation."""
    result = ctx.event.result
    state["applicant_name"] = result.data.name
    state["age"] = result.data.age
    state["income"] = result.data.income
    state["employer"] = result.data.employer
    state["extraction_confidence"] = result.confidence
    return SUCCESS
def evaluate_with_rules(ctx, state):
    """DMN tables take over — deterministic, auditable decision."""
    # This handler triggers DMN evaluation automatically.
    # The predicate_domain evaluates all tables in the domain;
    # results are in ctx.facts after DMN evaluation.
    decision = ctx.facts["loan_eligibility"].decision
    state["decision"] = decision
    state["decision_rule"] = ctx.facts["loan_eligibility"].matched_rule
    return SUCCESS
The LLM extraction might be 95% accurate. But the 5% it gets wrong is caught by validation rules before any decision is made. And the decision itself is always made by rules, not by the LLM.
Confidence Scoring
For fields where AI extraction is uncertain, you add validation:
def validate_extraction(ctx, state):
    if state["extraction_confidence"] < 0.85:
        state["needs_human_review"] = True
        state["review_reason"] = "Low extraction confidence"
        return SUCCESS  # Don't fail — route to manual review

    # Validate extracted values against rules
    if state["age"] < 18 or state["age"] > 120:
        state["needs_human_review"] = True
        state["review_reason"] = "Age out of range: " + str(state["age"])
        return SUCCESS

    return SUCCESS
Low confidence? Route to human review. Impossible values? Route to human review. Everything else goes through the rule engine.
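The routing itself is just another deterministic handler. A minimal sketch, assuming a human-task service exists; the task.v1 names, the case_id field, and the handler name are illustrative, not TIATON's actual API:

def route_after_validation(ctx, state):
    """Send flagged cases to a reviewer; let clean cases reach the rules."""
    if state.get("needs_human_review"):
        # Hypothetical human-task service; these names are invented.
        payload = new("task.v1.ReviewRequest", {
            "case_id": state["case_id"],
            "reason": state["review_reason"],
        })
        return RUNNING, submit_job("task.v1.ReviewService/Assign", payload)
    return SUCCESS  # clean data proceeds to DMN evaluation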
The Compliance Conversation
When a regulator asks how you make decisions:
"We use AI to extract information from documents. The extracted data is validated against business rules. If extraction confidence is below 85%, the case goes to human review. All decisions are made by versioned decision tables. Every decision is traced and auditable."
Compare this to:
"We use an AI model that we fine-tuned on historical data. It usually gets it right."
The first answer gets you approved. The second gets you a follow-up investigation.
Architecture Summary
| Layer | Technology | Purpose | Guarantee |
|---|---|---|---|
| Extraction | LLM | Unstructured → structured | ~90-95% accurate (probabilistic) |
| Validation | Rules | Catch extraction errors | Deterministic |
| Decision | DMN Tables | Business logic | Deterministic |
| Routing | Agent | Orchestrate the flow | Deterministic |
| Audit | Decision Trace | Full explainability | Complete |
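The audit row is what makes the compliance conversation above possible. A decision trace might look something like this; a sketch with invented field names and values, not TIATON's actual schema:

# Hypothetical decision trace for the example application above.
{
    "decision": "APPROVE",
    "table": "loan_eligibility",
    "table_version": "v12",         # invented version tag
    "matched_rule": 5,
    "inputs": {"age": 34, "income": 85000, "ratio": 1.7},
    "extraction_confidence": 0.97,  # invented confidence value
    "extracted_by": "ai.v1.ExtractionService/Extract",
}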
AI does what AI does best. Rules do what rules do best. The result: a system where AI accelerates the process, but rules guarantee the outcome.