Operator Patterns

These patterns govern how the Agentic OS interacts with external tools, APIs, and services.


Tool as Operator

Intent

Wrap every external tool as a typed, permissioned, observable operator.

Context

Raw tool access — “give the model a function and let it call it” — lacks governance, typing, and observability. Wrapping tools as operators provides the control surface needed for production systems.

Forces

  • Direct tool access is fast and simple but ungoverned
  • Governance and observability add overhead that must be justified
  • Different tools have wildly different interfaces, reliability, and risk profiles
  • The system must treat all tools uniformly while respecting their differences

Structure

An operator wraps a tool with:

  • Type signature — Inputs and outputs are explicitly typed
  • Permission requirements — What capabilities are needed to invoke it
  • Risk classification — Low, medium, high
  • Observability — Invocations are logged with inputs, outputs, latency, and errors
  • Error handling — Failures are captured and returned as structured results

flowchart LR
  W[Worker] -->|invoke| Op
  subgraph Op[Operator Wrapper]
    Auth[Check Permissions]
    Log1[Log Request]
    Tool[External Tool]
    Log2[Log Response]
    Err[Error Handler]
  end
  Auth --> Log1
  Log1 --> Tool
  Tool --> Log2
  Tool -->|error| Err
  Op -->|result| W

Dynamics

When a worker invokes an operator, the wrapper first validates permissions against the worker’s capability set. If authorized, it logs the request, invokes the underlying tool, logs the response (or error), and returns a typed result. The invocation is fully observable — latency, success/failure, input/output hashes — without exposing sensitive content.
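The invocation flow above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `Operator`, `OperatorResult`, and `digest` are hypothetical, capabilities are modeled as plain strings, and the request and response logs are folded into a single journal entry per call for brevity.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


def digest(payload: Any) -> str:
    """Short, stable hash so logs can correlate calls without storing content."""
    raw = json.dumps(payload, sort_keys=True, default=str).encode()
    return hashlib.sha256(raw).hexdigest()[:12]


@dataclass
class OperatorResult:
    ok: bool
    value: Any = None
    error: Optional[str] = None


@dataclass
class Operator:
    name: str
    tool: Callable[..., Any]      # the wrapped external tool
    required: frozenset           # capabilities needed to invoke it
    risk: str = "low"             # "low" | "medium" | "high"
    log: list = field(default_factory=list)

    def invoke(self, capabilities: set, **inputs: Any) -> OperatorResult:
        entry = {"operator": self.name, "risk": self.risk,
                 "input_hash": digest(inputs)}
        # 1. Validate permissions against the worker's capability set.
        missing = self.required - capabilities
        if missing:
            result = OperatorResult(False, error=f"permission_denied: {sorted(missing)}")
        else:
            # 2. Invoke the tool; failures become structured results.
            start = time.perf_counter()
            try:
                result = OperatorResult(True, value=self.tool(**inputs))
            except Exception as exc:
                result = OperatorResult(False, error=f"{type(exc).__name__}: {exc}")
            entry["latency_s"] = time.perf_counter() - start
            entry["output_hash"] = digest(result.value)
        # 3. Log the outcome with hashes rather than raw inputs or outputs.
        entry["ok"], entry["error"] = result.ok, result.error
        self.log.append(entry)
        return result
```

A worker holding the `"math"` capability could call `Operator("add", lambda a, b: a + b, frozenset({"math"})).invoke({"math"}, a=1, b=2)`. Note that the exception message is still logged verbatim here, so a real wrapper would need to scrub it if tool errors can carry sensitive content.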

Benefits

Governed, observable, reliable tool access. Consistent interface regardless of underlying tool implementation.

Tradeoffs

Wrapper overhead adds latency to every tool invocation. Maintaining operator wrappers requires effort when underlying tools change their interfaces.

Failure Modes

The wrapper obscures tool-specific error details behind a generic error type, making diagnosis difficult. Permission checks are either too coarse, blocking legitimate use, or too permissive, allowing unauthorized access. Observability logging captures sensitive data that should not be persisted.

Related patterns: Operator Registry, Operator Isolation, Skill over Operators


Operator Registry

Intent

Maintain a central catalog of all available operators with their metadata, permissions, and status.

Context

As the number of available tools grows, the system needs a way to discover, select, and manage them. Without a registry, tool selection is ad-hoc and ungoverned.

Forces

  • Tool sprawl — dozens of tools with overlapping capabilities
  • Workers need to discover available tools dynamically, not through hardcoded references
  • The registry must be the single source of truth for tool availability and governance

Structure

A registry that stores for each operator:

  • Name and description
  • Type signature
  • Permission requirements
  • Risk classification
  • Status (active, deprecated, disabled)
  • Usage metrics

The kernel consults the registry when deciding which operators to make available to a worker.

Dynamics

At worker spawn time, the kernel queries the registry for operators that match the worker’s task requirements and capability set. The registry returns only active, authorized operators. At runtime, operators may be added, deprecated, or disabled without restarting the system. Usage metrics (invocation count, error rate, latency) are updated after each invocation and inform future selection decisions.
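As a rough sketch of these dynamics, the registry can be modeled as an in-memory catalog. Everything here is an assumption made for illustration: the class names `OperatorRegistry` and `OperatorEntry`, the string-based capability model, and the choice to filter only on status and permissions (matching operators to task requirements is left out).

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    DISABLED = "disabled"


@dataclass
class OperatorEntry:
    name: str
    description: str
    signature: str              # human-readable type signature
    required: frozenset         # permission requirements
    risk: str                   # "low" | "medium" | "high"
    status: Status = Status.ACTIVE
    invocations: int = 0
    errors: int = 0
    total_latency_s: float = 0.0

    @property
    def error_rate(self) -> float:
        return self.errors / self.invocations if self.invocations else 0.0


class OperatorRegistry:
    def __init__(self) -> None:
        self._entries: dict = {}

    def register(self, entry: OperatorEntry) -> None:
        self._entries[entry.name] = entry

    def set_status(self, name: str, status: Status) -> None:
        # Status can change at runtime without restarting anything.
        self._entries[name].status = status

    def available_for(self, capabilities: set) -> list:
        # Only active operators whose requirements the worker satisfies.
        return [e for e in self._entries.values()
                if e.status is Status.ACTIVE and e.required <= capabilities]

    def record(self, name: str, latency_s: float, ok: bool) -> None:
        # Usage metrics are updated after each invocation.
        entry = self._entries[name]
        entry.invocations += 1
        entry.total_latency_s += latency_s
        entry.errors += 0 if ok else 1
```

Filtering at query time means that disabling an operator takes effect for the next worker spawned, which is one plausible reading of "without restarting the system".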

Benefits

Central governance point. Dynamic capability management. Clear documentation.

Tradeoffs

The registry becomes a single point of failure for tool discovery. Registry maintenance requires discipline — stale entries mislead workers.

Failure Modes

The registry contains stale entries for tools that no longer exist, causing invocation failures. Deprecated operators are still selected because the replacement was not registered. The registry grows to include hundreds of operators, making tool selection noisy and imprecise.

Related patterns: Tool as Operator, Capability-Based Access


Skill over Operators

Intent

Compose multiple operators into a higher-level reusable recipe called a skill.

Context

Many real-world tasks require a specific sequence of tool invocations with logic connecting them. Rather than having the model improvise this sequence every time, encode it as a skill.

Structure

A skill is a named, tested recipe that:

  • Combines specific operators in a defined sequence or graph
  • Includes logic for handling intermediate results
  • Has its own input/output contract
  • Is registered and versioned

Example:

flowchart LR
  subgraph Skill["Skill: Code Review"]
    direction TB
    S1["1. git.diff\nGet changes"]
    S2["2. file.read\nRead changed files"]
    S3["3. analyze\nAssess quality & risks"]
    S4["4. comment.create\nPost review feedback"]
    S1 --> S2 --> S3 --> S4
  end
  W[Worker] -->|invoke| Skill
  Skill -->|result| W

Dynamics

When a worker invokes a skill, the skill engine steps through the defined sequence, invoking each operator and passing results forward. At each step, the engine evaluates success criteria before proceeding. If a step fails, the skill’s error handling logic determines the recovery strategy (retry, skip, abort). Skills are versioned: updating a skill creates a new version while preserving the previous one. Workers invoke either a pinned skill version or the latest one.
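One way to picture the skill engine is as a loop over steps, each with its own success criterion and recovery strategy. The sketch below assumes a strictly linear skill; `Step`, `run_skill`, and `SkillCatalog` are hypothetical names, and each step's `run` callable stands in for a real operator invocation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


def not_none(output: Any) -> bool:
    """Default success criterion: the step produced something."""
    return output is not None


@dataclass
class Step:
    name: str
    run: Callable[[Any], Any]                    # stand-in for an operator call
    succeeded: Callable[[Any], bool] = not_none  # checked before proceeding
    on_failure: str = "abort"                    # "retry" | "skip" | "abort"
    max_retries: int = 1


@dataclass
class SkillResult:
    ok: bool
    output: Any
    journal: list = field(default_factory=list)


def run_skill(steps: list, payload: Any) -> SkillResult:
    journal = []
    for step in steps:
        attempts = 1 + (step.max_retries if step.on_failure == "retry" else 0)
        ok, out = False, None
        for _ in range(attempts):
            try:
                out = step.run(payload)
                ok = step.succeeded(out)
            except Exception:
                ok = False
            if ok:
                break
        if ok:
            payload = out                        # pass the result forward
            journal.append((step.name, "ok"))
        elif step.on_failure == "skip":
            journal.append((step.name, "skipped"))
        else:                                    # abort, or retries exhausted
            journal.append((step.name, "failed"))
            return SkillResult(False, payload, journal)
    return SkillResult(True, payload, journal)


class SkillCatalog:
    def __init__(self) -> None:
        self._versions: dict = {}                # name -> {version: steps}

    def publish(self, name: str, steps: list) -> int:
        versions = self._versions.setdefault(name, {})
        version = max(versions, default=0) + 1   # earlier versions are kept
        versions[version] = steps
        return version

    def resolve(self, name: str, version: Optional[int] = None) -> list:
        versions = self._versions[name]
        return versions[version if version is not None else max(versions)]
```

In this sketch a skipped step leaves the payload unchanged, so the next step sees the last successful output. Whether that is the right behavior depends on the skill's contract.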

Benefits

Consistency and reliability. Tested workflows. Reusable across contexts.

Tradeoffs

Skills are less flexible than improvised workflows. They must be maintained as tools and APIs evolve.

Failure Modes

A skill encodes an operator sequence that worked at creation time but breaks after an operator’s interface changes. Skills are too rigid, forcing workers through unnecessary steps. Skills are too numerous and overlapping, making it unclear which skill to use for a given task.

Related patterns: Composable Operator Chain, Patternized Skills


Composable Operator Chain

Intent

Allow operators to be chained into pipelines where the output of one becomes the input of the next.

Context

Many tasks are naturally pipelines: fetch → transform → validate → store. Expressing these as chains makes them composable and reusable.

Forces

  • Multi-step operations need clear data flow between stages
  • Tight coupling between stages prevents reuse
  • Each stage in a chain may fail independently

Structure

Operators expose typed inputs and outputs. The system matches output types to input types, forming a pipeline. Each step in the chain is independently observable and governable.

flowchart LR
  A[Fetch] -->|data| B[Transform]
  B -->|transformed| C[Validate]
  C -->|valid| D[Store]
  C -->|invalid| E[Error Handler]

Dynamics

The chain executes sequentially. Each operator receives the previous operator’s output as its input. Type checking at chain boundaries catches mismatches before invocation. If any operator fails, the chain stops and reports which stage failed, with the partial results collected so far. Chains can be defined declaratively and stored as reusable recipes.
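A small sketch can make the boundary check and the partial-result reporting concrete. It assumes each stage declares plain Python classes as its input and output types and uses `issubclass` for matching. `Stage`, `Chain`, and `ChainResult` are illustrative names, and a real system would likely need a richer type model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Stage:
    name: str
    input_type: type
    output_type: type
    fn: Callable[[Any], Any]


@dataclass
class ChainResult:
    ok: bool
    output: Any = None
    failed_stage: Optional[str] = None
    error: Optional[str] = None
    partial: list = field(default_factory=list)  # (stage name, output) so far


class Chain:
    def __init__(self, *stages: Stage) -> None:
        # Check every boundary when the chain is defined, before any invocation.
        for prev, nxt in zip(stages, stages[1:]):
            if not issubclass(prev.output_type, nxt.input_type):
                raise TypeError(
                    f"{prev.name} outputs {prev.output_type.__name__}, "
                    f"but {nxt.name} expects {nxt.input_type.__name__}")
        self.stages = stages

    def run(self, value: Any) -> ChainResult:
        partial = []
        for stage in self.stages:
            try:
                value = stage.fn(value)
            except Exception as exc:
                # Stop at the first failure and keep what earlier stages produced.
                return ChainResult(False, failed_stage=stage.name,
                                   error=f"{type(exc).__name__}: {exc}",
                                   partial=partial)
            partial.append((stage.name, value))
        return ChainResult(True, output=value, partial=partial)
```

Because the check runs in the constructor, a mismatched chain such as a `str`-producing stage feeding a `dict`-consuming one fails at definition time rather than mid-run.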

Benefits

Clean data flow. Each operator is testable in isolation. Chains are composable — new pipelines from existing operators.

Tradeoffs

Chains are rigid — branching logic requires escaping the pipeline model. Long chains amplify latency from sequential execution.

Failure Modes

A type mismatch between stages causes a runtime error that should have been caught at chain definition time. A mid-chain failure loses the partial results from earlier stages. The chain abstraction is applied to operations that are not truly pipelines, forcing awkward data transformations between stages.

Related patterns: Skill over Operators, Tool as Operator


Operator Isolation

Intent

Ensure that a failure in one operator does not crash the system or corrupt other operators.

Context

External tools fail. APIs time out. Services return errors. These failures must be contained.

Forces

  • External tools are outside the system’s control — they can fail in unexpected ways
  • A single tool failure should not propagate to other tools or workers
  • Isolation must not add so much overhead that tool invocation becomes impractical

Structure

Each operator invocation runs in its own error boundary. Failures are captured as structured results (not exceptions) and returned to the caller. The caller (worker or kernel) decides how to handle the failure.

Dynamics

The operator wrapper intercepts all failure modes: exceptions, timeouts, malformed responses, and resource exhaustion. Each failure is converted to a structured error result with a failure category, message, and the partial output (if any). The wrapper enforces a timeout: if the underlying tool does not respond within the configured window, the invocation is terminated and a timeout error is returned. The worker receives the error result and decides: retry, fall back, or escalate.
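The error boundary and timeout can be approximated with a worker thread, as in the sketch below. The names `invoke_isolated`, `Outcome`, and `slow_tool` are hypothetical. One important caveat: Python threads cannot be forcibly stopped, so on timeout this sketch only stops waiting, and the tool keeps running in the background. Actually terminating the invocation, or containing memory use, would require a separate process or sandbox.

```python
import concurrent.futures as cf
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Outcome:
    ok: bool
    value: Any = None
    category: Optional[str] = None   # "timeout" | "exception" | "malformed"
    message: Optional[str] = None
    partial: Any = None


def invoke_isolated(tool: Callable[..., Any], *args: Any,
                    timeout_s: float = 1.0,
                    validate: Optional[Callable[[Any], bool]] = None) -> Outcome:
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool, *args)
    try:
        value = future.result(timeout=timeout_s)
    except cf.TimeoutError:
        return Outcome(False, category="timeout",
                       message=f"no response within {timeout_s}s")
    except Exception as exc:
        # Any exception from the tool becomes a structured result.
        return Outcome(False, category="exception",
                       message=f"{type(exc).__name__}: {exc}")
    finally:
        pool.shutdown(wait=False)    # do not block the caller on a stuck tool
    if validate is not None and not validate(value):
        return Outcome(False, category="malformed",
                       message="response failed validation", partial=value)
    return Outcome(True, value=value)


def slow_tool() -> str:
    """Hypothetical tool that responds too slowly for a short timeout."""
    time.sleep(0.2)
    return "late"
```

Calling `invoke_isolated(slow_tool, timeout_s=0.05)` yields an `Outcome` with `category="timeout"`, which the worker can inspect to decide whether to retry, fall back, or escalate.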

Benefits

System stability. Graceful degradation. Clear error handling paths.

Tradeoffs

Isolation overhead adds latency. Structured error conversion may lose tool-specific diagnostic details.

Failure Modes

The isolation boundary leaks — a tool that consumes excessive memory affects other operators sharing the same process. Timeout values are set too aggressively, killing tools that are slow but would eventually succeed. The structured error result lacks enough detail for the worker to choose the right recovery strategy.

Related patterns: Operator Fallback, Failure Containment


Operator Fallback

Intent

When a primary operator fails, automatically attempt an alternative operator that can fulfill the same need.

Context

External services are unreliable. Having a fallback operator reduces the impact of failures.

Structure

The operator registry associates fallback operators with primary operators:

flowchart LR
  W[Worker] -->|invoke| P["Primary: search_web\n(API A)"]
  P -->|success| R[Result]
  P -->|timeout / 5xx| F["Fallback: search_web_alt\n(API B)"]
  F -->|success| R
  F -->|failure| Err[Combined Error]

Dynamics

When the primary operator returns a failure matching the fallback condition, the system automatically invokes the fallback operator with the same inputs. The fallback result is returned to the caller transparently. If the fallback also fails, the combined failure is reported. Fallback invocations are flagged in the execution journal so that persistent primary failures trigger operational alerts.
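The fallback logic is small enough to sketch directly. The example below assumes operators return a structured `Outcome` whose `error` field carries a failure category, and that the fallback condition is a fixed set of categories (`FALLBACK_ON`). Both are illustrative choices, as is the plain list used for the execution journal.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Outcome:
    ok: bool
    value: Any = None
    error: Optional[str] = None      # failure category, e.g. "timeout", "5xx", "4xx"


# Assumed fallback condition: only transient, provider-side failures qualify.
FALLBACK_ON = frozenset({"timeout", "5xx"})


def invoke_with_fallback(primary: Callable[..., Outcome],
                         fallback: Callable[..., Outcome],
                         journal: list, **inputs: Any) -> Outcome:
    first = primary(**inputs)
    if first.ok or first.error not in FALLBACK_ON:
        return first                 # success, or a failure the fallback cannot fix
    # Flag the fallback so repeated primary failures can raise an alert.
    journal.append({"event": "fallback", "primary_error": first.error})
    second = fallback(**inputs)      # same inputs as the primary
    if second.ok:
        return second
    return Outcome(False, error=f"primary: {first.error}; fallback: {second.error}")
```

Keeping both categories in the combined error gives the caller something more specific to act on than a generic "all providers failed" message.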

Benefits

Higher reliability. Transparent to the caller.

Tradeoffs

Fallback operators may have different characteristics (latency, result quality). Managing fallback chains adds complexity.

Failure Modes

The fallback operator has subtly different semantics than the primary, producing results that appear correct but differ in important ways. Both primary and fallback fail, but the combined error message is confusing. Fallback masks a systemic primary failure, delaying investigation.

Related patterns: Operator Isolation, Tool as Operator


Resource-Aware Invocation

Intent

Consider resource costs (latency, tokens, API limits) when deciding which operator to invoke and how.

Context

Operators have costs: API rate limits, token consumption, latency, monetary cost. Ignoring these leads to budget exhaustion, throttling, or excessive latency.

Forces

  • Different operators have vastly different cost profiles
  • Budget constraints are real — rate limits, token budgets, monetary budgets
  • A cheaper operator may produce acceptable results for low-stakes tasks
  • Cost information must be available at decision time, not discovered after invocation

Structure

The kernel tracks resource budgets and operator costs. Before invoking an operator, it checks:

  • Is the budget sufficient?
  • Is the operator within rate limits?
  • Is a cheaper alternative available?
  • Should this invocation be batched or deferred?

Dynamics

The kernel maintains a real-time resource ledger: tokens consumed, API calls made, cost incurred. Before each operator invocation, the scheduler consults the ledger and the operator’s cost profile. If the budget is sufficient, the invocation proceeds and the ledger is updated. If the budget is low, the scheduler may select a cheaper alternative, batch the invocation with others, or defer it to a lower-priority queue. Rate-limited operators include a backoff window in their cost profile.
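A minimal ledger might look like the following. It is a sketch under simplifying assumptions: a single abstract cost unit instead of separate token, call, and monetary budgets; no rate-limit or batching logic; and hypothetical names (`ResourceLedger`, `CostProfile`, `choose_operator`). The check and the deduction happen under one lock so that parallel workers cannot jointly overspend.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CostProfile:
    name: str
    cost: int                        # abstract cost units per invocation


class ResourceLedger:
    def __init__(self, budget: int) -> None:
        self._remaining = budget
        self._lock = threading.Lock()

    def try_reserve(self, cost: int) -> bool:
        # Check and deduct atomically.
        with self._lock:
            if cost > self._remaining:
                return False
            self._remaining -= cost
            return True

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._remaining


def choose_operator(ledger: ResourceLedger,
                    candidates: list) -> Optional[CostProfile]:
    """Try candidates in preference order and reserve budget for the first fit.

    Returns None when nothing is affordable, signalling the caller to defer.
    """
    for profile in candidates:
        if ledger.try_reserve(profile.cost):
            return profile
    return None
```

With a budget of 10 and candidates costing 8 and 1, the first call selects the expensive operator, later calls degrade to the cheaper one, and a `None` return indicates that the invocation should be deferred.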

Benefits

Sustainable execution. Cost control. Graceful behavior under resource pressure.

Tradeoffs

Cost tracking adds overhead. Cost estimates may be inaccurate, leading to either over-cautious scheduling or budget overruns.

Failure Modes

Cost profiles are stale — the operator’s actual cost has changed but the registry has not been updated. The system defers critical invocations to save budget, degrading result quality. Resource accounting is not thread-safe, allowing parallel workers to collectively exceed the budget.

Related patterns: Resource Envelope, Context Budget Enforcement


Applicability Guide

Operator patterns structure how the system interacts with the external world. Start with the minimum tooling surface and expand deliberately.

Decision Matrix

| Pattern | Apply When | Do Not Apply When |
| --- | --- | --- |
| Tool as Operator | The system needs to interact with external services, files, or APIs through a uniform interface | The system is purely reasoning-based with no external side effects |
| Operator Registry | You have 5+ tools; workers need to discover tools dynamically; governance must scope tool access | You have 1-2 hardcoded tools that every worker always uses |
| Skill over Operators | Recurring multi-step workflows benefit from packaged instructions, strategies, and tool selections | Every task is novel; no workflow repeats enough to justify packaging |
| Composable Operator Chain | Multi-stage operations where one tool’s output feeds the next (e.g., search → fetch → extract) | Each tool invocation is independent; composition adds indirection without value |
| Operator Isolation | Tool failures should not crash the worker; untrusted tools need sandboxing | All tools are trusted, well-tested, and fast; isolation overhead is not justified |
| Operator Fallback | Primary tools have reliability issues; alternative providers exist | Each tool is unique with no equivalent alternative; or reliability is already sufficient |
| Resource-Aware Invocation | Tools have rate limits, costs, or latency constraints that require budgeting | Tools are free, unlimited, and fast; cost tracking adds overhead without benefit |

Start Here

Every system needs Tool as Operator (a structured interface to external capabilities). Add the Operator Registry once you have more than a handful of tools. Add Skill over Operators when you notice teams repeatedly assembling the same tool combinations for similar tasks. The other patterns respond to operational pressures — add them when you observe the specific problem they solve.