
Developer Docs

Voice Agent Connector

Use nc-connect to bridge a NumberClaw voice number to your own agent runtime. NumberClaw handles PSTN, speech-to-text, text-to-speech, and voice activity detection. Your side only deals with text.

Overview

NumberClaw sits between the phone network and your agent. When a caller speaks, NumberClaw converts the audio into text, sends that text to your connector over WebSocket, waits for your reply, and speaks the reply back to the caller.

architecture

┌──────────┐     PSTN      ┌──────────────────┐    WebSocket     ┌──────────────┐
│  Caller  │ ◄───────────► │   NumberClaw     │ ◄──────────────► │ Your Machine │
│  (phone) │               │  STT ↔ TTS ↔ VAD │   (text only)    │  nc-connect  │
└──────────┘               └──────────────────┘                  │  ↓           │
                                                                 │  handler.py  │
                                                                 │  ↓           │
                                                                 │  Your LLM    │
                                                                 └──────────────┘

NumberClaw converts speech to text and text back to speech.
Your agent receives plain text and returns plain text.
You do not need to handle audio streaming, PSTN, or telephony codecs.
NumberClaw never needs your LLM keys, your system prompt, or your private memory store.

Quick Start

Prerequisites:

Python 3.10 or newer.
A NumberClaw account with a Voice AI number.
An API key copied from the dashboard.

From zero to a working connector in about five minutes:

bash

# Install
pip install numberclaw

# Option A: Claude on a phone number
nc-connect --key YOUR_NC_KEY --handler anthropic --llm-key YOUR_ANTHROPIC_KEY

# Option B: Point at your existing API
nc-connect --key YOUR_NC_KEY --handler webhook --url http://localhost:8080/chat

# Option C: Custom Python handler
nc-connect --key YOUR_NC_KEY --handler custom --script handler.py

The CLI flags shown above come directly from the current connector implementation. The required flags are --key and --handler. The --llm-key, --url, and --script flags depend on the handler you choose.

Handler Interface

Your custom script must export a callable named handle_message.

python

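from nc_connect.connector import HandlerResult


async def handle_message(
    text: str,
    number: str,
    caller: str,
    caller_name: str | None,
    session_id: str,
    channel: str,
    **kwargs,
) -> str | HandlerResult | dict:
    """Return the text NumberClaw should speak back to the caller."""
    ...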

Parameter	Type	Description
text	str	The transcribed caller speech.
number	str	The called NumberClaw number in E.164 format, for example `+12125551234`.
caller	str	The caller phone number in E.164 format.
caller_name	str | None	Display name from caller mappings, if one exists.
session_id	str	Stable identifier for the active call session.
channel	str	Communication channel. Currently always `voice`.

Supported return types:

str for the simplest text response.
HandlerResult(text="...", transcript_posted=True | False) for explicit transcript control.
{"text": "...", "transcript_posted": true | false} if you prefer a dict.

NumberClaw speaks the returned text back to the caller. If transcript_posted is false, NumberClaw can also run its own transcript-sync path for configured messaging integrations.
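
The three return shapes can be reduced to a single (text, transcript_posted) pair before a reply is built. The sketch below shows one way to do that. `Result` is a local stand-in for HandlerResult, whose real definition is not shown on this page, and `normalize_result` is a hypothetical helper name, not part of nc-connect. Treating a missing transcript_posted as false is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Local stand-in for HandlerResult; the real class may differ."""
    text: str
    transcript_posted: bool = False


def normalize_result(value) -> tuple[str, bool]:
    """Reduce a handler return value to (text, transcript_posted)."""
    if isinstance(value, str):
        # A plain string carries no transcript preference; assume False.
        return value, False
    if isinstance(value, dict):
        return str(value["text"]), bool(value.get("transcript_posted", False))
    # Anything else is treated as a HandlerResult-like object.
    return str(value.text), bool(getattr(value, "transcript_posted", False))
```

With this in place, a handler that returns `"hi"` and one that returns `{"text": "hi"}` both normalize to `("hi", False)`.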

Protocol Reference

The connector protocol is intentionally small. It only moves text messages and call-control events.

Type	Direction	Purpose
auth	Client → Server	Authenticate the connector using your NumberClaw API key.
auth_ok	Server → Client	Confirms the connector is live and returns the assigned agent name and numbers.
user_message	Server → Client	Transcribed caller speech plus routing metadata.
agent_response	Client → Server	Your handler response as plain text plus transcript sync preference.
cancel	Server → Client	Sent when the call ends or the in-flight message should be abandoned.
ping / pong	Bidirectional	Heartbeat frames to keep the connection healthy.

Authentication frame:

json

{"type": "auth", "api_key": "nc_live_xxx"}

Field	Type	Description
type	string	Always `auth` for the initial client authentication frame.
api_key	string	Your NumberClaw API key, for example `nc_live_xxx`.

Authentication success:

json

{"type": "auth_ok", "agent_name": "My Agent", "numbers": ["+12125551234"]}

Field	Type	Description
type	string	Always `auth_ok` when the server accepts the connector.
agent_name	string	Display name configured for the connector session.
numbers	string[]	E.164 numbers currently assigned to the connector.

Inbound user message:

json

{
  "type": "user_message",
  "id": "m1",
  "text": "Hello, I need help with my order",
  "number": "+12125551234",
  "caller": "+14155550123",
  "caller_name": "Alice",
  "session_id": "sess-abc123",
  "channel": "voice"
}

Field	Type	Description
type	string	Always `user_message`.
id	string	Unique message id that must be echoed back in `agent_response`.
text	string	Current caller utterance after speech-to-text.
number	string	The NumberClaw number the caller dialed.
caller	string	Caller ID in E.164 format.
caller_name	string | null	Resolved display name from caller mappings, when available.
session_id	string	Stable call session identifier.
channel	string	Currently `voice`.
source_metadata	object | null	Optional source metadata forwarded to your handler as `source_metadata`.
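
As an illustration of how the fields above map onto the handler parameters, the following sketch splits a raw user_message frame into a message id and keyword arguments. `to_handler_kwargs` and `REQUIRED_FIELDS` are hypothetical names, and the choice of which fields to treat as mandatory is an assumption, not documented connector behavior.

```python
import json

# Fields assumed to be present on every user_message (an assumption).
REQUIRED_FIELDS = ("text", "number", "caller", "session_id", "channel")


def to_handler_kwargs(raw: str) -> tuple[str, dict]:
    """Split a user_message frame into (message id, handler keyword arguments)."""
    frame = json.loads(raw)
    if frame.get("type") != "user_message":
        raise ValueError(f"unexpected frame type: {frame.get('type')!r}")
    missing = [name for name in ("id",) + REQUIRED_FIELDS if name not in frame]
    if missing:
        raise ValueError(f"user_message is missing fields: {missing}")
    kwargs = {name: frame[name] for name in REQUIRED_FIELDS}
    # caller_name may be absent or null when no caller mapping exists.
    kwargs["caller_name"] = frame.get("caller_name")
    # Optional metadata is only forwarded when the server actually sent it.
    if frame.get("source_metadata") is not None:
        kwargs["source_metadata"] = frame["source_metadata"]
    return frame["id"], kwargs
```

The returned id is kept separate from the keyword arguments because the handler never sees it; it is only needed to address the reply.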

Outbound agent response:

json

{
  "type": "agent_response",
  "id": "m1",
  "text": "Hi Alice! I'd be happy to help with your order.",
  "transcript_posted": false
}

Field	Type	Description
type	string	Always `agent_response`.
id	string	The `user_message.id` you are replying to.
text	string	Plain text for NumberClaw to speak back to the caller.
transcript_posted	boolean	Set `true` if your handler already posted the transcript to its own channel.

Cancel event:

json

{"type": "cancel", "id": "m1", "reason": "caller_hangup"}

Field	Type	Description
type	string	Always `cancel`.
id	string	The in-flight message id to stop processing.
reason	string	Why the response should be abandoned, such as `caller_hangup`.
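
One way a client could honor a cancel frame is to run each handler call as an asyncio task keyed by message id. This is a sketch under that assumption; `in_flight`, `start_message`, and `on_cancel` are hypothetical names and do not describe how nc-connect is implemented internally.

```python
import asyncio

# message id -> running handler task
in_flight: dict[str, asyncio.Task] = {}


def start_message(message_id: str, coro) -> asyncio.Task:
    """Run a handler coroutine as a task and remember it by message id."""
    task = asyncio.create_task(coro)
    in_flight[message_id] = task
    # Forget the task once it finishes, whether it succeeded or was cancelled.
    task.add_done_callback(lambda _: in_flight.pop(message_id, None))
    return task


def on_cancel(frame: dict) -> bool:
    """Cancel the task named by a cancel frame; return True if one was found."""
    task = in_flight.get(frame["id"])
    if task is None:
        return False  # already finished or never started
    task.cancel()
    return True


async def demo() -> bool:
    async def slow_handler() -> str:
        await asyncio.sleep(60)  # stands in for a slow LLM call
        return "too late"

    task = start_message("m1", slow_handler())
    await asyncio.sleep(0)  # give the task a chance to start
    on_cancel({"type": "cancel", "id": "m1", "reason": "caller_hangup"})
    try:
        await task
    except asyncio.CancelledError:
        return True  # the handler was abandoned as requested
    return False
```

A handler that holds external resources can catch `asyncio.CancelledError`, clean up, and re-raise it so the cancellation still takes effect.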

Heartbeat:

json

{"type": "ping"}
{"type": "pong"}

Field	Type	Description
type	string	Either `ping` or `pong` to keep the WebSocket healthy.
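
Taken together, the frames above describe a small loop. The sketch below drives that loop over an abstract transport (an async iterator of incoming JSON strings and an async `send` callable) so it can be read and tested without a network connection. `run_session` is a hypothetical name, the real WebSocket endpoint is not shown here, and answering a server `ping` with a `pong` is an assumption about the heartbeat direction. Messages are handled one at a time, so cancel frames are ignored.

```python
import asyncio
import json


async def run_session(incoming, send, api_key, handler):
    """Drive one connector session over an abstract text transport."""
    await send(json.dumps({"type": "auth", "api_key": api_key}))
    async for raw in incoming:
        frame = json.loads(raw)
        kind = frame.get("type")
        if kind == "ping":
            await send(json.dumps({"type": "pong"}))
        elif kind == "user_message":
            # Everything except the envelope fields goes to the handler.
            meta = {k: v for k, v in frame.items() if k not in ("type", "id", "text")}
            text = await handler(frame["text"], **meta)
            await send(json.dumps({
                "type": "agent_response",
                "id": frame["id"],  # echo the id being answered
                "text": text,
                "transcript_posted": False,
            }))
        # auth_ok, pong, and cancel need no reply in this sequential sketch.


async def demo():
    sent = []

    async def server_frames():
        yield json.dumps({"type": "auth_ok", "agent_name": "My Agent",
                          "numbers": ["+12125551234"]})
        yield json.dumps({"type": "ping"})
        yield json.dumps({"type": "user_message", "id": "m1", "text": "Hello",
                          "number": "+12125551234", "caller": "+14155550123",
                          "caller_name": "Alice", "session_id": "sess-abc123",
                          "channel": "voice"})

    async def record(raw):
        sent.append(json.loads(raw))

    async def echo(text, **kwargs):
        return f"You said: {text}"

    await run_session(server_frames(), record, "nc_live_xxx", echo)
    return sent
```

Keeping the transport abstract means the same loop could be fed by a real WebSocket client or by a test fixture.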

Built-in Handlers

The current CLI supports four handler types: anthropic, openrouter, webhook, and custom.

Anthropic

bash

nc-connect --key NC_KEY --handler anthropic --llm-key sk-ant-xxx
# Optional: --model claude-sonnet-4 --system-prompt "You are a receptionist"

Environment variables: ANTHROPIC_API_KEY, NC_MODEL (default claude-sonnet-4), NC_SYSTEM_PROMPT, NC_MAX_TOKENS.

OpenRouter

bash

nc-connect --key NC_KEY --handler openrouter --llm-key sk-or-xxx
# Optional: --model anthropic/claude-sonnet-4

Environment variables: OPENROUTER_API_KEY, NC_MODEL (default anthropic/claude-sonnet-4), NC_SYSTEM_PROMPT, NC_MAX_TOKENS.

Webhook

bash

nc-connect --key NC_KEY --handler webhook --url http://localhost:8080/chat

Environment variable: NC_WEBHOOK_URL. The handler sends a JSON payload with text, number, caller, caller_name, session_id, and channel. If extra metadata exists, it is sent under metadata. The endpoint should return {"response":"..."} or {"text":"..."}.
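
The request and response shapes just described can be sketched as two small functions. Both names are hypothetical, the exact layout of the metadata object is an assumption, and checking `response` before `text` is an arbitrary choice made for this example.

```python
def build_webhook_payload(text, number, caller, caller_name, session_id, channel, **extra):
    """Assemble the JSON body described above for one caller utterance."""
    payload = {
        "text": text,
        "number": number,
        "caller": caller,
        "caller_name": caller_name,
        "session_id": session_id,
        "channel": channel,
    }
    if extra:
        # Extra metadata travels under a single "metadata" key.
        payload["metadata"] = extra
    return payload


def extract_reply(body: dict) -> str:
    """Pull the reply text out of a webhook response body."""
    for key in ("response", "text"):
        value = body.get(key)
        if isinstance(value, str) and value:
            return value
    raise ValueError("webhook response has neither 'response' nor 'text'")
```

An endpoint that returns neither key raises here, which gives the calling code a single place to substitute a fallback message.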

Custom Handler Guide

Use the custom mode when you already have an agent runtime or you want direct control over prompts, memory, tools, transcript policy, and external services.

Minimal example:

python

# handler.py
async def handle_message(text, number, caller, caller_name, session_id, channel, **kwargs):
    return f"You said: {text}"

bash

nc-connect --key NC_KEY --handler custom --script handler.py

A more advanced reference exists in examples/openclaw_handler.py. That example shows how to combine conversation history, prompt bootstrap, recall, and asynchronous transcript posting while still returning a plain HandlerResult to NumberClaw.

Webhook equivalents if you want to keep your agent behind HTTP:

python

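from flask import Flask, jsonify, request

app = Flask(__name__)


@app.post("/chat")
def chat():
    # silent=True returns None instead of raising on a missing or invalid JSON body.
    data = request.get_json(silent=True) or {}
    # caller_name can be present but null, so fall back with `or`.
    name = data.get("caller_name") or "caller"
    return jsonify({"response": f"Hello {name}"})


if __name__ == "__main__":
    # Port 8080 matches the --url used in the examples above.
    app.run(port=8080)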

javascript

const express = require('express');
const app = express();

app.use(express.json());

app.post('/chat', (req, res) => {
  const { text, caller_name } = req.body;
  res.json({ response: `Hello ${caller_name || 'caller'}, you said: ${text}` });
});

app.listen(8080);

Caller ID Whitelisting

Voice AI numbers are intentionally restricted. Only registered caller IDs can reach the agent. This is not an optional recommendation. It is how the connector model keeps usage bounded and predictable.

Add a contact in the dashboard for a Voice AI number.
The contact's caller ID becomes eligible to reach the agent.
Everyone else is rejected before the call is routed into your connector.

Manage this from the dashboard path Numbers → selected number → Integrations → Caller Mappings.

Transcript Sync

The connector supports two transcript modes:

Agent-managed: your handler posts the transcript itself and returns transcript_posted=True.
NC-managed: your handler returns a plain string or transcript_posted=False, and NumberClaw handles its own sync path.

python

from nc_connect.connector import HandlerResult

async def call_my_llm(text: str) -> str:
    return f"Thanks for calling. You said: {text}"

async def post_to_my_telegram(caller: str, caller_text: str, response_text: str) -> None:
    print(caller, caller_text, response_text)

async def handle_message(text, number, caller, caller_name, session_id, channel, **kwargs):
    del number, caller_name, session_id, channel, kwargs
    response = await call_my_llm(text)
    await post_to_my_telegram(caller, text, response)
    return HandlerResult(text=response, transcript_posted=True)

Use agent-managed mode if you already have your own Telegram, Slack, or Discord delivery path. Use NC-managed mode if you want the dashboard-configured messaging integrations to receive the transcript.

Error Handling

Handler timeout: built-in handlers use request timeouts and return a short fallback such as "I need a moment. Could you try again?" or a generic processing message.
Handler exception: the connector logs the exception and sends a sanitized response instead of leaking internal details to the caller.
Disconnect: nc-connect automatically reconnects with exponential backoff starting at 1 second and capping at 30 seconds.
Caller hangs up: NumberClaw sends a cancel frame, and the connector cancels the in-flight task for that message.
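
The reconnect schedule described above can be illustrated with a small generator. The 1 second start and 30 second cap come from the text; the doubling factor and the absence of jitter are assumptions, and `backoff_delays` is a hypothetical name.

```python
import itertools


def backoff_delays(initial: float = 1.0, cap: float = 30.0, factor: float = 2.0):
    """Yield reconnect delays in seconds: initial, initial*factor, ... up to cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First seven delays under the assumed doubling factor.
first_delays = list(itertools.islice(backoff_delays(), 7))
# → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

A reconnect loop would restart the generator after each successful connection so that the next outage begins again at the shortest delay.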

The connector is designed to stay alive after individual handler failures. A bad call should not take down the whole connection.
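
A custom handler can apply the same idea to its own slow or fragile calls. The wrapper below is a sketch, not the connector's implementation: `guarded_call` is a hypothetical name, the timeout values are arbitrary, and the fallback string is the one quoted above.

```python
import asyncio
import logging

logger = logging.getLogger("connector-sketch")

FALLBACK_TEXT = "I need a moment. Could you try again?"


async def guarded_call(handler, timeout: float, **message) -> str:
    """Run a handler, returning a safe fallback on timeout or error."""
    try:
        return await asyncio.wait_for(handler(**message), timeout=timeout)
    except asyncio.TimeoutError:
        logger.warning("handler timed out after %.1fs", timeout)
    except Exception:
        # Log full details locally; the caller only hears the fallback.
        logger.exception("handler raised")
    return FALLBACK_TEXT


async def demo():
    async def slow(**_):
        await asyncio.sleep(1)
        return "late"

    async def broken(**_):
        raise RuntimeError("secret internal detail")

    async def fine(text, **_):
        return f"You said: {text}"

    return (
        await guarded_call(slow, 0.01, text="hi"),
        await guarded_call(broken, 1.0, text="hi"),
        await guarded_call(fine, 1.0, text="hi"),
    )
```

Because `asyncio.CancelledError` is not a subclass of `Exception`, a cancellation triggered by a caller hangup still propagates through this wrapper instead of being turned into a fallback reply.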

FAQ

Do I need a public URL?

No. `nc-connect` opens an outbound WebSocket to NumberClaw, so it works behind NAT and normal firewalls.

Can I use my own LLM?

Yes. That is the point of the connector. You can use Anthropic, OpenRouter, a webhook, or a custom handler.

What if my agent is offline?

Callers hear your configured unavailable behavior on the NumberClaw side. NumberClaw does not fall back to running your LLM for you.

Can anyone call my number?

Only registered caller IDs can reach the agent. Unregistered callers are rejected before they get through.

What does NumberClaw see?

NumberClaw sees transcribed text and your agent's response. It does not see your LLM keys, your system prompt, or your conversation history outside what you choose to send back.

Next Step

Get a number and connect your agent

The connector stays static and text-only by design. Your agent stack remains yours.
