Why Starlark? Safe Scripting for Business Rules

When you let users write code that runs on your servers, you need guarantees. Not "best practices" — actual, enforced guarantees.

Starlark gives you exactly that.

What Is Starlark?

Starlark is a dialect of Python created by Google for the Bazel build system. It looks and feels like Python, but it has deliberate restrictions that make it safe to embed:

def validate_application(ctx, state):
    """Check that all required fields are present and valid."""
    errors = []

    if not state["applicant_name"]:
        errors.append("Applicant name is required")

    if state["loan_amount"] <= 0:
        errors.append("Loan amount must be positive")

    if state["loan_amount"] > 1000000:
        errors.append("Loan amount exceeds maximum")

    if errors:
        state["validation_errors"] = errors
        return FAILURE

    state["application_validated"] = True
    return SUCCESS

If you know Python, you can write Starlark. The syntax is nearly identical for the subset that matters: functions, loops, conditionals, string operations, list/dict manipulation.

What Starlark Cannot Do

This is where it gets interesting. Starlark is intentionally limited:

Feature        | Python | Starlark | Why
File I/O       | Yes    | No       | Handlers shouldn't read your filesystem
Network calls  | Yes    | No       | Use submit_job() for controlled async
Import os/sys  | Yes    | No       | No access to system internals
Infinite loops | Yes    | No       | Execution is guaranteed to terminate
Threads        | Yes    | No       | No concurrency bugs in business logic
Global state   | Yes    | No       | Each execution is isolated
eval()/exec()  | Yes    | No       | No code injection
Exceptions     | Yes    | No       | Errors are return values, not control flow

These aren't limitations — they're features. When a compliance officer asks "can this script access our database directly?", the answer is no, by design.
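
The termination guarantee comes from how loops are written: standard Starlark has no while statement and no recursion, so every loop walks a finite sequence. As a minimal sketch (the function name and the MAX_MONTHS cap are made up for illustration), an "iterate until done" calculation becomes a for loop over range() with an early return. The code stays inside the subset shared by Starlark and Python, so it also runs under a Python interpreter:

```python
# Hypothetical cap on the number of iterations; not a TIATON setting.
MAX_MONTHS = 360

def months_to_repay(balance, payment):
    """Return how many payments clear the balance, or -1 if the cap is hit."""
    # range() yields a finite sequence, so this loop always terminates.
    for month in range(1, MAX_MONTHS + 1):
        balance = balance - payment
        if balance <= 0:
            return month
    # The cap was reached: report it as a value instead of looping forever.
    return -1
```

The explicit cap is the design point. A rule that cannot finish within its bound says so through its return value.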

The Module System

TIATON provides controlled capabilities through a module system:

load("//proto", "new")

def check_credit_score(ctx, state):
    payload = new("lending.v1.CreditCheckRequest", {
        "ssn": state["applicant_ssn"],
        "consent_token": state["consent_token"],
    })
    return RUNNING, submit_job("lending.v1.CreditService/Check", payload)

def on_credit_result(ctx, state):
    result = ctx.event.result  # typed lending.v1.CreditCheckResponse
    state["credit_score"] = result.score
    state["credit_report_id"] = result.report_id
    return SUCCESS

In these handlers:

  • new() creates typed protobuf messages (validated at creation time)
  • submit_job() submits async work through the runtime (validated against declared async_ops)
  • ctx provides read-only context (skill name, input facts, event data)
  • state is the shared state dict (typed, validated against schema)

Everything goes through the runtime. Every external interaction is declared, tracked, and auditable.
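
To make "declared" concrete, here is a toy model of the kind of check a runtime could run before accepting a job. This is not TIATON's implementation. The article does not show how async_ops are declared, so the DECLARED_ASYNC_OPS list and the check_async_op() helper are assumptions made for illustration:

```python
# Hypothetical declaration; the real async_ops format is not shown in this article.
DECLARED_ASYNC_OPS = [
    "lending.v1.CreditService/Check",
]

def check_async_op(op, declared_ops):
    """Return "" if op is declared, otherwise an error message."""
    if op in declared_ops:
        return ""
    return "async op not declared: " + op
```

With a check like this in front of submit_job(), a handler that tries to call an undeclared service is rejected before any request leaves the process.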

Performance

Starlark source is compiled to bytecode before execution (the Go implementation works this way), so handlers run quickly. TIATON adds shared module caching across executions:

Benchmark (engine.star, 7 handlers):
  Cold start:     39μs / 886 allocs
  Warm (cached):   3μs /  71 allocs
  Speedup:        12.5x

For a handler that validates inputs and returns a decision, execution time is measured in microseconds. In practice the bottleneck is the external services you call, not the scripting layer.
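
The gap between cold and warm runs is what you would expect from any compile-once cache. The sketch below illustrates the general idea in host-side Python and is not TIATON's code: Python's built-in compile() stands in for the Starlark compiler, and load_module() and the compile_calls counter are hypothetical names.

```python
import hashlib

_compiled = {}        # cache: source digest -> compiled code object
compile_calls = [0]   # counter kept only to make the cache effect visible

def load_module(name, source):
    """Return compiled code for source, compiling only the first time it is seen."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    code = _compiled.get(key)
    if code is None:
        # Cold path: parse and compile, then remember the result.
        compile_calls[0] += 1
        code = compile(source, name, "exec")
        _compiled[key] = code
    # Warm path: the cached object is returned without recompiling.
    return code

src = "def handler(ctx, state):\n    return 'SUCCESS'\n"
first = load_module("engine.star", src)
second = load_module("engine.star", src)
```

Keying on a digest of the source rather than on the file name means an edited script is recompiled automatically, while unchanged scripts share one compiled module.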

Error Handling

Starlark has no try/except or raise, so a handler cannot throw or catch exceptions. Errors are explicit return values:

def process_payment(ctx, state):
    if state["balance"] < state["amount"]:
        state["error"] = "Insufficient balance"
        return FAILURE

    state["balance"] = state["balance"] - state["amount"]
    state["transaction_id"] = generate_id()  # generate_id() is not defined here; presumably supplied by the runtime
    return SUCCESS

FAILURE triggers the compensation cascade, SUCCESS moves to the next skill, and RUNNING paired with submit_job() pauses the skill until the async job completes. There are no hidden control flow paths.
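
The same style extends to helper functions. Without try/except, a helper has to check its input before converting it and hand any problem back as a value. In this sketch, parse_positive_int(), set_amount() and the raw_amount key are hypothetical, and SUCCESS and FAILURE are defined locally as stand-ins for the constants the runtime provides:

```python
# Stand-ins for the runtime-provided status constants.
SUCCESS = "SUCCESS"
FAILURE = "FAILURE"

def parse_positive_int(text):
    """Return (value, error); error is "" when parsing succeeded."""
    # Check first: int() on a non-numeric string would abort the execution.
    if not text.isdigit():
        return (0, "not a whole number: " + text)
    value = int(text)
    if value <= 0:
        return (0, "must be positive")
    return (value, "")

def set_amount(ctx, state):
    """Handler that turns a parse error into a FAILURE status."""
    amount, err = parse_positive_int(state["raw_amount"])
    if err:
        state["error"] = err
        return FAILURE
    state["amount"] = amount
    return SUCCESS
```

Every failure path is visible in the function body, which is the point of the "errors are return values" rule.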

Real Example: Signal Processing

Here are the complete handlers from a trading signal processor:

load("//proto", "new")

def process_signal(ctx, state):
    """Normalize and validate incoming trading signal."""
    signal = ctx.facts

    # Validate signal
    if signal.symbol == "":
        state["error"] = "Empty symbol"
        return FAILURE

    if signal.leverage < 1 or signal.leverage > 100:
        state["error"] = "Invalid leverage: " + str(signal.leverage)
        return FAILURE

    # Normalize direction
    direction = signal.direction.lower()
    if direction not in ("long", "short"):
        state["error"] = "Unknown direction: " + direction
        return FAILURE

    # Store normalized data
    state["symbol"] = signal.symbol.upper()
    state["direction"] = direction
    state["leverage"] = signal.leverage
    state["source"] = signal.source or "manual"

    return SUCCESS

def open_position(ctx, state):
    """Submit async job to open trading position."""
    payload = new("trading.v1.OpenPositionPayload", {
        "symbol": state["symbol"],
        "direction": state["direction"],
        "leverage": state["leverage"],
    })
    return RUNNING, submit_job("trading.v1.TradingJobs/OpenPosition", payload)

def on_position_opened(ctx, state):
    """Handle async result from position opening."""
    result = ctx.event.result
    state["position_id"] = result.position_id
    state["opened_at"] = result.opened_at
    return SUCCESS

Three functions. Clear contracts. No hidden dependencies. A new team member reads this and understands the signal processing flow in minutes.
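
Because a handler only touches ctx and state, it is also easy to exercise in isolation. The sketch below is a plain-Python harness, not part of TIATON. check_direction() is a trimmed-down validator in the style of process_signal, run() is a hypothetical helper, and SimpleNamespace stands in for the real ctx object:

```python
from types import SimpleNamespace

# Stand-ins for the runtime-provided status constants.
SUCCESS = "SUCCESS"
FAILURE = "FAILURE"

def check_direction(ctx, state):
    """Normalize the signal direction and reject unknown values."""
    direction = ctx.facts.direction.lower()
    if direction not in ("long", "short"):
        state["error"] = "Unknown direction: " + direction
        return FAILURE
    state["direction"] = direction
    return SUCCESS

def run(handler, facts):
    """Call a handler with a fake ctx and a fresh state dict."""
    ctx = SimpleNamespace(facts=SimpleNamespace(**facts))
    state = {}
    status = handler(ctx, state)
    return status, state

ok_status, ok_state = run(check_direction, {"direction": "LONG"})
bad_status, bad_state = run(check_direction, {"direction": "sideways"})
```

Each call starts from an empty state dict, so the two cases cannot influence each other, mirroring the isolation the runtime gives each execution.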