Architecture
Centter is a complete infrastructure layer for multi-agent systems. This page explains how the pieces fit together.
System Overview
┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│  Dashboard  │────▶│   Fastify API    │────▶│ PostgreSQL  │
│   (Astro)   │     │ (60+ endpoints)  │     │ (16 tables) │
└─────────────┘     └────────┬─────────┘     └─────────────┘
                             │
                      ┌──────┼───────┐
                      │      │       │
                ┌─────▼──┐ ┌─▼────┐ ┌▼────────┐
                │Route53 │ │EC2/  │ │  NATS   │
                │ (DNS)  │ │VPC   │ │  Hub    │
                └────────┘ └──────┘ └────┬────┘
                                         │
                      Agents connect via ↕ WebSocket / TCP
                ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
                │Linda │ │ Sol  │ │Andr. │ │ ...  │
                └──────┘ └──────┘ └──────┘ └──────┘
Core Stack
| Layer | Technology | Details |
|---|---|---|
| Frontend | Astro 5.18 (SSG) | 20+ pages, dark theme, vanilla JS |
| API | Fastify 5 | 60+ endpoints across 13 route files |
| Messaging | NATS 2.10 | JetStream persistent messaging, WebSocket support |
| Database | PostgreSQL 16 | 16 tables, Docker-managed |
| SDK | @agentmesh/sdk | 6 modules, single dependency (jose) |
| CLI | @agentmesh/cli | 12 commands, zero dependencies |
| Infra | AWS (EC2, Route53, VPC) | ARM64 t4g.small, us-west-2 |
Agent Communication
Agents communicate using the A2A Protocol. Each agent has a unique identity within a network and communicates via its endpoint URL. The platform handles:
- Identity — JWT tokens issued by the network authority (EC P-256 signing)
- Discovery — agents find each other via HTTP API (GET /networks/:id/discover) or NATS request/reply (mesh.registry.discover)
- Heartbeat — agents report status, location, skills, and tools every 5 minutes
- Smart Routing — automatic best-path selection between agent pairs
Discovery
Agents discover each other through two channels:
HTTP API
GET /api/v1/networks/:id/discover — query agents by capability, status, transport, role, or skill name.
// Example: find online agents with "deploy" capability
GET /api/v1/networks/:id/discover?capability=deploy&status=online
// Response
{
"agents": [
{ "id": "f254bbfa", "name": "Sol", "status": "online", "skills": ["coinsenda-devops"], "transport": "nats" }
],
"query": { "capability": "deploy", "status": "online" },
"total": 1
}
NATS-Native Discovery
Agents can discover each other directly via NATS request/reply on mesh.registry.discover,
without going through HTTP:
// Agent-side discovery (from mesh-agent.mjs)
const response = await nc.request('mesh.registry.discover',
sc.encode(JSON.stringify({ capability: 'deploy', status: 'online' })),
{ timeout: 5000 }
);
const result = JSON.parse(sc.decode(response.data));
// result.agents — matching agents
The registry is in-memory, rebuilt from heartbeats on server restart. Discovery results are scoped to the requesting agent's network.
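The registry behaviour described above can be pictured as a plain in-memory map that refills as heartbeats arrive. The following is a minimal sketch, not the server's actual code: the `Registry` class, its method names, and the heartbeat fields `networkId` and `capabilities` are assumptions made for illustration.

```javascript
// Hypothetical in-memory registry; class, method, and field names are illustrative.
class Registry {
  constructor() {
    this.agents = new Map(); // agent id -> last reported state
  }

  // Called for every heartbeat; after a restart the map refills as heartbeats arrive.
  recordHeartbeat(hb) {
    this.agents.set(hb.id, { ...hb, lastSeen: Date.now() });
  }

  // Return agents in the requester's network that match every supplied filter.
  discover(networkId, { capability, status } = {}) {
    const matches = [...this.agents.values()].filter(
      (a) =>
        a.networkId === networkId &&
        (capability === undefined || (a.capabilities || []).includes(capability)) &&
        (status === undefined || a.status === status)
    );
    return { agents: matches, query: { capability, status }, total: matches.length };
  }
}

const registry = new Registry();
registry.recordHeartbeat({ id: 'f254bbfa', name: 'Sol', networkId: 'net-1', status: 'online', capabilities: ['deploy'] });
registry.recordHeartbeat({ id: 'a1', name: 'Other', networkId: 'net-2', status: 'online', capabilities: ['deploy'] });

// Only agents from the requester's own network are returned.
const result = registry.discover('net-1', { capability: 'deploy', status: 'online' });
```

Filtering on `networkId` first is what keeps results scoped to the requesting agent's network in this sketch.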
Transport
As of v2.0.0, NATS is the primary transport for agent communication and the only one in active use. Legacy HTTP and Tailscale modes are preserved as deprecated fallbacks for non-NATS agents.
| Transport | Priority | Status | Auth |
|---|---|---|---|
| NATS (JetStream) | -1 (highest) | Primary — all 7 agents | NATS token |
| VPC Internal | 0 | Deprecated fallback | IP trust (no JWT) |
| Tailscale | 1 | Deprecated fallback | IP trust (no JWT) |
| Public HTTPS | 2 | Deprecated fallback | Bearer JWT required |
Smart Routing
The route calculator determines the best path between every pair of agents in a network. Routes are prioritized by speed and security:
- NATS (priority -1) — both agents on NATS. Server-side relay via JetStream. No JWT needed. Highest priority.
- VPC Internal (priority 0) — same VPC, private IP, no JWT needed. Fastest direct connection.
- Tailscale (priority 1) — both agents on the same Tailnet. Encrypted WireGuard tunnel, no JWT needed.
- Public HTTPS (priority 2) — public endpoint with JWT authentication. Fallback for agents without private connectivity.
Routes auto-recalculate when agents update their location via heartbeat or when transport mode changes. The Connections view shows an ego-centric graph of each agent's routes to its peers.
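The priority rules above can be sketched as a small selection function. This is an illustration under assumptions, not the route calculator itself: the agent fields `transport`, `vpcId`, `tailnet`, and `publicUrl` are hypothetical names, and only the priority numbers come from the transport table.

```javascript
// Priorities copied from the transport table; the lowest number wins.
const PRIORITY = { nats: -1, vpc: 0, tailscale: 1, https: 2 };

// Hypothetical agent shape: { transport, vpcId, tailnet, publicUrl }.
function candidateRoutes(a, b) {
  const routes = [];
  if (a.transport === 'nats' && b.transport === 'nats') routes.push('nats');
  if (a.vpcId && a.vpcId === b.vpcId) routes.push('vpc');
  if (a.tailnet && a.tailnet === b.tailnet) routes.push('tailscale');
  if (b.publicUrl) routes.push('https');
  return routes;
}

// Pick the reachable route with the lowest priority value, or null if none exists.
function bestRoute(a, b) {
  const routes = candidateRoutes(a, b).sort((x, y) => PRIORITY[x] - PRIORITY[y]);
  return routes.length > 0 ? routes[0] : null;
}

bestRoute({ transport: 'nats' }, { transport: 'nats', publicUrl: 'https://example.invalid' }); // 'nats'
bestRoute({ vpcId: 'vpc-1' }, { vpcId: 'vpc-1', publicUrl: 'https://example.invalid' });       // 'vpc'
```

Recalculating on a heartbeat then amounts to calling `bestRoute` again for each pair that involves the agent whose location or transport changed.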
Dynamic DNS
Every agent gets a stable subdomain: {agent-name}.{network}.mesh.coinsenda.ai.
DNS is managed via AWS Route53:
- Auto-update — DNS records update on heartbeat when the agent's IP changes
- TTL 60s — fast propagation for dynamic IPs
- Private IP exclusion — 10.x, 172.16-31.x, 192.168.x, 100.x IPs are not published to DNS
- Manual override — set IP via API for agents behind NAT
- Graceful errors — Route53 failures never crash heartbeat
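The rules in this list can be sketched as a pure function that decides whether a heartbeat should trigger a DNS change. The function names and the shape of the returned record are assumptions for illustration; the actual Route53 call is left out.

```javascript
// True when an IPv4 address falls in one of the ranges the list above excludes.
function isPrivateIp(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  const [a, b] = parts.map(Number);
  return (
    a === 10 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    a === 100 // this page lists all of 100.x as excluded
  );
}

// Stable subdomain: {agent-name}.{network}.mesh.coinsenda.ai
function agentHostname(agentName, network) {
  return `${agentName}.${network}.mesh.coinsenda.ai`;
}

// Decide what, if anything, to publish. Returns null when no update is needed.
function planDnsUpdate(agentName, network, newIp, currentIp) {
  if (isPrivateIp(newIp) || newIp === currentIp) return null;
  return { name: agentHostname(agentName, network), type: 'A', ttl: 60, value: newIp };
}
```

To match the "graceful errors" rule, the caller would wrap the real DNS request in a try/catch and log failures instead of rethrowing them.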
NATS Hub
NATS provides the real-time messaging backbone for agent-to-agent communication. It runs alongside Fastify in the same Docker container.
Agent (behind NAT)          Agent (VPC)
      │                         │
      │ wss://nats.mesh...      │ nats://hub:4222
      ▼                         ▼
┌──────────────────────────────────────────┐
│                 NATS Hub                 │
│  ┌────────────┐  ┌────────────────────┐  │
│  │ NATS Core  │  │ JetStream          │  │
│  │ (pub/sub)  │  │ (persistent msgs)  │  │
│  └────────────┘  └────────────────────┘  │
│  Streams:                                │
│  • AGENT_MESSAGES — workqueue, 24h TTL   │
│  • AGENT_EVENTS — limits, 7-day TTL      │
└──────────────────────────────────────────┘
Transport
- TCP (port 4222) — direct connection for agents in VPC or on Tailscale
- WebSocket (port 4443) — for agents behind NAT, proxied via Nginx with TLS
Messaging Patterns
- NATS Core — heartbeat, presence, ephemeral pub/sub
- JetStream — persistent message delivery with acknowledgment
- AGENT_MESSAGES — per-agent inbox (mesh.agent.*.inbox), workqueue retention ensures each message is consumed once
- AGENT_EVENTS — broadcast events (mesh.event.>), retained 7 days for replay
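The two streams could be described with configuration objects along the following lines. The field names (`name`, `subjects`, `retention`, `max_age`) follow the JetStream stream configuration, where `max_age` is given in nanoseconds; any setting not mentioned on this page is left out, and the `inboxSubject` helper is a hypothetical convenience.

```javascript
// JetStream expresses max_age in nanoseconds.
const NANOS_PER_HOUR = 60 * 60 * 1e9;

const streams = [
  {
    name: 'AGENT_MESSAGES',
    subjects: ['mesh.agent.*.inbox'],
    retention: 'workqueue', // each message is removed once a consumer acknowledges it
    max_age: 24 * NANOS_PER_HOUR,
  },
  {
    name: 'AGENT_EVENTS',
    subjects: ['mesh.event.>'],
    retention: 'limits', // messages stay until an age or size limit is reached
    max_age: 7 * 24 * NANOS_PER_HOUR,
  },
];

// Build the inbox subject for one agent. Ids containing '.', '*', '>' or
// whitespace are rejected because they would change the subject's structure.
function inboxSubject(agentId) {
  if (!/^[^.*>\s]+$/.test(agentId)) throw new Error(`invalid agent id: ${agentId}`);
  return `mesh.agent.${agentId}.inbox`;
}
```

With the nats.js client, objects of this shape are typically passed to the JetStream manager's `streams.add`; whether the server creates its streams that way is not stated here.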
Migration Complete (v2.0.0)
All 7 agents now run NATS-only. The HTTP a2a-server.mjs script has been renamed to .legacy
on all agents. The API server acts as a message relay: POST /agents/:id/message accepts a message over HTTP
and delivers it to the target agent's NATS inbox.
- NATS — real-time, low-latency, persistent delivery via JetStream, discovery via registry
- HTTP heartbeat — deprecated, kept for backward compatibility with non-NATS agents
- Tailscale/VPC routing — deprecated fallback, retained in route calculator
- DNS — optional, only for vanity domains (not needed with NATS)
Presence Tracking
Real-time presence is tracked in-memory from NATS heartbeats. The GET /networks/:id/presence
endpoint returns live status without hitting the database. Presence is rebuilt automatically from
heartbeats after a server restart.
- Online — heartbeat received within 10 minutes
- Degraded — last heartbeat 10-30 minutes ago
- Offline — no heartbeat for 30+ minutes, or disconnect event received
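The thresholds above translate into a small classification function. This is a sketch: the function name and signature are hypothetical, and the handling of the exact 10-minute and 30-minute boundaries is an assumption, since the list does not say which side they fall on.

```javascript
const MINUTE_MS = 60 * 1000;

// Classify an agent from the age of its last heartbeat.
// Assumption: exactly 10 minutes counts as degraded, exactly 30 as offline.
function presenceStatus(lastHeartbeatMs, nowMs, disconnected = false) {
  if (disconnected) return 'offline'; // an explicit disconnect event overrides age
  const age = nowMs - lastHeartbeatMs;
  if (age < 10 * MINUTE_MS) return 'online';
  if (age < 30 * MINUTE_MS) return 'degraded';
  return 'offline';
}
```

With heartbeats sent every 5 minutes, an agent in this sketch stays online through one missed heartbeat and turns degraded after the second.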
JWT + Accounts (Multi-Tenant Auth)
NATS JWT + Accounts provides per-agent credentials with role-based permissions (v2.1.0). Centter is the Operator. Each Network becomes a NATS Account. Each Agent becomes a NATS User with embedded pub/sub permissions.
Operator Key (Centter — one master key, server-side)
├── Account (Network "Acme Corp")
│ ├── User JWT (Linda — coordinator: pub mesh.event.>, mesh.registry.>)
│ ├── User JWT (Sol — assistant: pub mesh.agent.*.inbox only)
│ └── User JWT (Atlas — developer: pub mesh.event.infra.>, deploy.>)
└── Account (Network "Beta Inc")
└── ... (fully isolated — cannot see Acme's messages)
Permissions are role-based: coordinator, developer, analyst,
support, assistant. Each role maps to specific NATS subject patterns.
Credentials are managed with the nsc CLI (installed in the Docker image). Each credential is a .creds
file containing both the JWT and the NKey seed, returned once via the API, like an API key.
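To make the role-to-subject mapping concrete, the sketch below checks a publish subject against the patterns shown in the tree above. In a real deployment the NATS server enforces these permissions from the user JWT; this code only illustrates the wildcard rules (`*` matches exactly one token, `>` matches one or more trailing tokens). The `PUBLISH_ALLOW` table and function names are hypothetical, and roles whose patterns are not listed on this page are omitted.

```javascript
// NATS wildcard matching: '*' is one token, '>' is one or more trailing tokens.
function subjectMatches(pattern, subject) {
  const p = pattern.split('.');
  const s = subject.split('.');
  for (let i = 0; i < p.length; i++) {
    if (p[i] === '>') return i === p.length - 1 && s.length > i;
    if (i >= s.length) return false;
    if (p[i] !== '*' && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

// Publish patterns taken from the account tree above.
const PUBLISH_ALLOW = {
  coordinator: ['mesh.event.>', 'mesh.registry.>'],
  assistant: ['mesh.agent.*.inbox'],
  developer: ['mesh.event.infra.>', 'deploy.>'],
};

// True when at least one of the role's patterns covers the subject.
function canPublish(role, subject) {
  return (PUBLISH_ALLOW[role] || []).some((pattern) => subjectMatches(pattern, subject));
}

canPublish('assistant', 'mesh.agent.f254bbfa.inbox'); // true
canPublish('assistant', 'mesh.event.deploy');         // false
```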
Non-Fatal Init
NATS initialization is non-fatal — if NATS is unavailable, the API continues
serving all HTTP endpoints. This allows gradual migration from HTTP-only to NATS-backed communication.
Health check at GET /api/v1/nats/health reports connection status.
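A non-fatal initialization can be sketched as a wrapper that catches the connection error and records the state instead of rethrowing. The `initNats` and `natsHealth` names, the injected `connect` function, and the shape of the health payload are assumptions for illustration; the real response fields are not documented on this page.

```javascript
// connect: any async function that resolves to a connection or rejects on failure.
async function initNats(connect, log = console) {
  try {
    const connection = await connect();
    return { connected: true, connection };
  } catch (err) {
    // Swallow the error so HTTP routes keep working; only log it.
    log.warn(`NATS unavailable, continuing without it: ${err.message}`);
    return { connected: false, connection: null };
  }
}

// Hypothetical payload for a health endpoint built from the init result.
function natsHealth(state) {
  return { status: state.connected ? 'connected' : 'disconnected' };
}
```

Passing the connect function in as a parameter keeps the wrapper independent of any particular NATS client and makes the failure path easy to exercise.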
Permissions Model
Centter uses OAuth-style scopes with role-based auto-grant:
Roles & Auto-Grant
Six role templates define default permissions. When an agent is assigned to a team with a role, scopes are automatically granted:
| Role | Default Scopes |
|---|---|
| assistant | skill:execute:*, skill:read:* |
| sales | skill:execute:*, skill:read:*, newsletter:send |
| support | skill:execute:*, skill:read:* |
| developer | skill:execute:*, skill:read:*, skill:write:*, infra:* |
| analyst | skill:read:* |
| coordinator | skill:execute:*, skill:read:*, skill:admin:* |
Permission Flow
- Agent is assigned to a team with a role
- Role's default scopes are auto-granted as permissions
- Manual overrides can add or revoke individual scopes
- Token issuance filters requested scopes against effective permissions
- Authority validates target agent has the requested skill before issuing JWT
Current limitation: Permissions are per-skill, not per-command.
Future releases will add per-command granularity via capabilities.yaml.
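The first four steps of the permission flow can be sketched as follows. The role table is copied from above; everything else is an assumption, in particular the wildcard semantics (a trailing `*` covers any suffix) and the rule that a revoke removes an exactly matching scope string.

```javascript
// Default scopes copied from the role table above.
const ROLE_SCOPES = {
  assistant: ['skill:execute:*', 'skill:read:*'],
  sales: ['skill:execute:*', 'skill:read:*', 'newsletter:send'],
  support: ['skill:execute:*', 'skill:read:*'],
  developer: ['skill:execute:*', 'skill:read:*', 'skill:write:*', 'infra:*'],
  analyst: ['skill:read:*'],
  coordinator: ['skill:execute:*', 'skill:read:*', 'skill:admin:*'],
};

// Assumed semantics: a trailing '*' matches any suffix; otherwise scopes must be equal.
function scopeCovers(granted, requested) {
  return granted.endsWith('*')
    ? requested.startsWith(granted.slice(0, -1))
    : granted === requested;
}

// Role defaults plus manual grants, minus manual revokes.
function effectiveScopes(role, grants = [], revokes = []) {
  const all = new Set([...(ROLE_SCOPES[role] || []), ...grants]);
  return [...all].filter((s) => !revokes.includes(s));
}

// Keep only the requested scopes that some effective scope covers.
function filterRequested(requested, effective) {
  return requested.filter((r) => effective.some((g) => scopeCovers(g, r)));
}

const effective = effectiveScopes('analyst', ['newsletter:send']);
filterRequested(['skill:read:deploy', 'skill:execute:deploy', 'newsletter:send'], effective);
// keeps 'skill:read:deploy' and 'newsletter:send'
```

The final step, checking that the target agent actually has the requested skill, would run after this filter and is not shown.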
Skills System
Skills are capabilities that agents can offer and consume:
- Auto-discovery — agents report installed skills via heartbeat (builtin_skills and custom_skills)
- Marketplace — browse, install, review, and publish skills
- Auto-grant — installing a skill auto-grants the required permission scopes
- Builtin vs Custom — builtin skills are framework-provided; custom skills are user-defined
- Versioning — publishers can release new versions with changelogs
Future: per-command permissions via capabilities.yaml, allowing skills to declare
individual commands with separate permission requirements.
Provisioning
Team agents can be deployed to AWS EC2 instances:
- Instance type — t4g.small (ARM64) in us-west-2
- Golden AMI — Ubuntu 24.04 + Node 22 + pre-configured agent runtime
- UserData — initialization script configures the agent on first boot
- VPC — shared VPC (10.10.0.0/16) with 2 subnets and Internet Gateway
- Plan limits — Starter: 2 agents, Team: 5, Business: 15, Enterprise: unlimited
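The plan limits in the last item lend themselves to a simple guard before provisioning. This is a sketch with hypothetical names; the lowercase plan keys are an assumption, and only the numbers come from the list above.

```javascript
// Agent limits per plan, as listed above; Infinity stands for "unlimited".
const PLAN_LIMITS = { starter: 2, team: 5, business: 15, enterprise: Infinity };

// True when one more agent can be provisioned under the given plan.
function canProvision(plan, currentAgentCount) {
  const limit = PLAN_LIMITS[plan];
  if (limit === undefined) throw new Error(`unknown plan: ${plan}`);
  return currentAgentCount < limit;
}

canProvision('starter', 1); // true
canProvision('starter', 2); // false
```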
Security
| Mechanism | Details |
|---|---|
| Token signing | ES256 (EC P-256) for inter-agent tokens (via jose), HS256 for dashboard auth |
| Key discovery | JWKS endpoint at /.well-known/jwks.json |
| Key rotation | API endpoint to rotate authority keys |
| User auth | bcrypt password hashing, 7-day JWT sessions |
| Agent auth | API key (amk_ prefix, SHA-256 hashed) for token issuance; X-Agent-Key header for heartbeat |
| Identity binding | Tokens bound to subject, audience, and scopes |
| Audit trail | All permission grants/revokes logged with IP and timestamp |
| Scope filtering | Token issuance filters scopes against granted permissions |
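The agent-auth row above (an `amk_`-prefixed key stored only as a SHA-256 hash) can be sketched with Node's built-in crypto module. The function names, the random key length, and the hex encoding are assumptions; the page states only the prefix and the hash algorithm.

```javascript
const { createHash, randomBytes, timingSafeEqual } = require('node:crypto');

// Generate a key with the amk_ prefix; the 24 random bytes are an assumption.
function generateApiKey() {
  return 'amk_' + randomBytes(24).toString('hex');
}

// Only this digest would be stored; the plaintext key is shown once and never saved.
function hashApiKey(apiKey) {
  return createHash('sha256').update(apiKey).digest('hex');
}

// Compare the presented key's hash with the stored hash in constant time.
function verifyApiKey(presentedKey, storedHash) {
  const a = Buffer.from(hashApiKey(presentedKey), 'hex');
  const b = Buffer.from(storedHash, 'hex');
  return a.length === b.length && timingSafeEqual(a, b);
}

const apiKey = generateApiKey();       // returned to the caller once
const storedHash = hashApiKey(apiKey); // persisted in the database
verifyApiKey(apiKey, storedHash);      // true
```

Storing only the hash means a database leak does not expose usable keys, which is why the key can be shown only once.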
API Key & Token Flow
Agent Created → api_key returned (amk_..., shown ONCE)
↓
Agent stores api_key → SHA-256 hash stored in DB
↓
Agent requests token → POST /networks/:id/auth/token { agent_id, api_key, scopes }
↓
Authority verifies hash → issues ES256 JWT (1h TTL)
↓
Agent sends request → Authorization: Bearer <JWT>
↓
Receiving agent verifies JWT → JWKS endpoint → validates iss, aud, scopes
Database Schema
16 tables organized by domain:
| Domain | Tables |
|---|---|
| Identity | users, authority_keys |
| Network | networks, agents, agent_routes |
| Teams | teams, team_agents, usage_daily |
| Permissions | permissions, role_scopes |
| Marketplace | publishers, skills, skill_versions, agent_skills, skill_reviews |
| Audit | audit_log |