# Horizontal Scaling
ANIP services scale horizontally by running multiple stateless replicas behind a load balancer, sharing a PostgreSQL database. No cluster-wide reconfiguration is needed to add or remove replicas.
## Architecture
Any replica can handle any request. Coordination happens through lease tables in PostgreSQL:
- Exclusive invocation locks prevent duplicate execution of the same capability for the same principal across replicas
- Leader election ensures only one replica generates checkpoints at a time
- Shared audit logging: all replicas write to the same audit table
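The exclusive invocation locks above can be sketched as a single atomic upsert against a lease table. This is an illustrative sketch, not ANIP's actual schema or SQL: SQLite stands in for PostgreSQL so the example is self-contained, and the table and column names are invented for the example.

```python
import sqlite3
import time

# SQLite stands in for PostgreSQL in this sketch; the real runtime
# would issue equivalent statements against the shared Postgres database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invocation_locks ("
    "key TEXT PRIMARY KEY, holder TEXT, expires_at REAL)"
)

def acquire_invocation_lock(conn, key, holder, ttl=60.0):
    """Atomically claim the lock for `key` unless another replica
    already holds an unexpired lease. Returns True on success."""
    now = time.time()
    cur = conn.execute(
        """
        INSERT INTO invocation_locks (key, holder, expires_at)
        VALUES (?, ?, ?)
        ON CONFLICT (key) DO UPDATE
            SET holder = excluded.holder, expires_at = excluded.expires_at
            WHERE invocation_locks.expires_at < ?
        """,
        (key, holder, now + ttl, now),
    )
    conn.commit()
    # 1 row touched means we inserted a fresh lease or took over an
    # expired one; 0 means a live lease blocked us.
    return cur.rowcount == 1

# Replica 1 wins the lock; replica 2 is blocked until the lease expires.
assert acquire_invocation_lock(conn, "invoke:alice:send_email", "replica-1")
assert not acquire_invocation_lock(conn, "invoke:alice:send_email", "replica-2")
```

Because the whole claim is one statement, two replicas racing for the same invocation cannot both see "no live lease" and both win; the database serializes the upserts.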
## Setup
The only change from single-instance to cluster is the storage DSN:
**Python**

```python
service = ANIPService(
    service_id="my-service",
    capabilities=[...],
    storage="postgres://user:pass@host:5432/anip",
    trust="signed",
    key_path="/etc/anip-keys",
    checkpoint_policy=CheckpointPolicy(interval_seconds=60),
    authenticate=...,
)
```

**TypeScript**

```typescript
const service = createANIPService({
  serviceId: "my-service",
  capabilities: [...],
  storage: "postgres://user:pass@host:5432/anip",
  trust: "signed",
  keyPath: "/etc/anip-keys",
  checkpointPolicy: { intervalSeconds: 60 },
  authenticate: ...,
});
```

**Go**

```go
svc, _ := service.New(service.Config{
	ServiceID:        "my-service",
	Capabilities:     capabilities,
	Storage:          "postgres://user:pass@host:5432/anip",
	Trust:            "signed",
	KeyPath:          "/etc/anip-keys",
	CheckpointPolicy: service.CheckpointPolicy{IntervalSeconds: 60},
	Authenticate:     authenticate,
})
```

**Java**

```java
ANIPService service = new ANIPService(new ServiceConfig()
    .setServiceId("my-service")
    .setCapabilities(capabilities)
    .setStorage("postgres://user:pass@host:5432/anip")
    .setTrust("signed")
    .setKeyPath("/etc/anip-keys")
    .setCheckpointPolicy(new CheckpointPolicy().setIntervalSeconds(60))
    .setAuthenticate(authenticate));
```

**C#**

```csharp
var service = new AnipService(new ServiceConfig {
    ServiceId = "my-service",
    Capabilities = capabilities,
    Storage = "postgres://user:pass@host:5432/anip",
    Trust = "signed",
    KeyPath = "/etc/anip-keys",
    CheckpointPolicy = new CheckpointPolicy { IntervalSeconds = 60 },
    Authenticate = authenticate,
});
```
The runtime creates all required tables automatically on first connection. No manual database setup is needed — just point it at an empty PostgreSQL database.
## Signing key distribution
All replicas must use the same signing key material. Options:
### Kubernetes Secret (recommended)
```yaml
volumes:
  - name: anip-keys
    secret:
      secretName: anip-signing-key
containers:
  - name: anip
    volumeMounts:
      - name: anip-keys
        mountPath: /etc/anip-keys
        readOnly: true
    env:
      - name: ANIP_KEY_PATH
        value: /etc/anip-keys
      - name: ANIP_STORAGE
        value: postgres://user:pass@postgres:5432/anip
```
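If the key files already exist on disk, the Secret referenced above can be created directly from them (the local `./anip-keys` directory here is an assumption; use wherever your key files live):

```shell
kubectl create secret generic anip-signing-key \
  --from-file=./anip-keys
```

Each file in the directory becomes one entry in the Secret, so the mounted `/etc/anip-keys` directory mirrors the local one.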
### KMS-backed
With AWS KMS, GCP Cloud KMS, or HashiCorp Vault, the key material never leaves the KMS boundary. A custom KeyManager implementation can delegate signing to the external service.
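As a sketch of what such a delegation can look like (the `KeyManager` method names and the `kms_client` interface here are assumptions for illustration, not ANIP's actual API):

```python
from abc import ABC, abstractmethod

class KeyManager(ABC):
    """Assumed shape of the runtime's key-manager hook; the real
    interface may differ."""

    @abstractmethod
    def sign(self, payload: bytes) -> bytes: ...

class KmsKeyManager(KeyManager):
    """Delegates signing to an external KMS so private key material
    never leaves the KMS boundary. `kms_client` is any object with a
    sign(key_id, message) method; a real AWS KMS or GCP Cloud KMS
    client can be wrapped to fit this shape."""

    def __init__(self, kms_client, key_id: str):
        self._kms = kms_client
        self._key_id = key_id

    def sign(self, payload: bytes) -> bytes:
        # One network round-trip to the KMS; only the signature comes back.
        return self._kms.sign(self._key_id, payload)
```

Because every replica talks to the same KMS key, signatures verify identically no matter which replica produced them.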
## What the runtime handles automatically
When you switch to PostgreSQL, the runtime handles all cluster coordination for you. There is nothing extra to configure:
- Checkpoints are generated by one replica at a time (automatic leader election). If that replica goes down, another takes over on the next tick. No manual intervention.
- Audit retention runs safely on all replicas simultaneously — cleaning up expired entries is idempotent.
- Duplicate prevention — if the same invocation request hits two replicas simultaneously, only one executes it. The runtime uses short-lived locks in PostgreSQL to prevent double execution.
If you have long-running capability handlers (over 60 seconds), increase the lock timeout:
```python
ANIPService(
    ...,
    storage="postgres://...",
    exclusive_ttl=120,  # seconds, default is 60
)
```
Otherwise, no cluster configuration is needed beyond the PostgreSQL connection string.
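The automatic leader election described above follows the same lease pattern as the invocation locks: a single well-known row that the current leader renews on every tick, and that any replica may take over once the lease expires. A minimal sketch, with SQLite standing in for PostgreSQL and invented table names (not ANIP's actual schema):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoint_leader ("
    "id INTEGER PRIMARY KEY, holder TEXT, expires_at REAL)"
)

LEASE_TTL = 90.0  # illustrative; should exceed the checkpoint interval

def try_become_leader(conn, replica_id):
    """Claim or renew the single leader lease: the current holder can
    always renew, and any replica can take over once the lease expires."""
    now = time.time()
    cur = conn.execute(
        """
        INSERT INTO checkpoint_leader (id, holder, expires_at)
        VALUES (1, ?, ?)
        ON CONFLICT (id) DO UPDATE
            SET holder = excluded.holder, expires_at = excluded.expires_at
            WHERE checkpoint_leader.holder = excluded.holder
               OR checkpoint_leader.expires_at < ?
        """,
        (replica_id, now + LEASE_TTL, now),
    )
    conn.commit()
    return cur.rowcount == 1

assert try_become_leader(conn, "replica-1")      # first claim wins
assert try_become_leader(conn, "replica-1")      # holder renews each tick
assert not try_become_leader(conn, "replica-2")  # live lease blocks takeover
```

A replica generates checkpoints only on ticks where `try_become_leader` returns True; if the leader dies and stops renewing, its lease expires and another replica's next tick succeeds, which is the failover behavior described above.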
## What stays the same
Scaling from one replica to many changes nothing about the protocol surface:
- Same 9 HTTP endpoints
- Same manifest, same signature
- Same delegation tokens (verified by any replica)
- Same audit log (shared in PostgreSQL)
- Same checkpoints (generated by elected leader)
- Same JWKS (same key material)
Clients and agents don't know or care how many replicas exist behind the load balancer.
## Next steps
- Configuration — Storage, auth, and trust setup
- Observability — Logging, metrics, and tracing hooks
- Deployment guide — Full deployment reference