Developer Docs

Voice Agent Connector

Use nc-connect to bridge a NumberClaw voice number to your own agent runtime. NumberClaw handles PSTN, speech-to-text, text-to-speech, and voice activity detection. Your side only deals with text.

01

Overview

NumberClaw sits between the phone network and your agent. When a caller speaks, NumberClaw converts the audio into text, sends that text to your connector over WebSocket, waits for your reply, and speaks the reply back to the caller.

architecture
┌──────────┐     PSTN      ┌──────────────────┐    WebSocket     ┌──────────────┐
│  Caller  │ ◄───────────► │   NumberClaw     │ ◄──────────────► │ Your Machine │
│  (phone) │               │  STT ↔ TTS ↔ VAD │   (text only)    │  nc-connect  │
└──────────┘               └──────────────────┘                  │  ↓           │
                                                                 │  handler.py  │
                                                                 │  ↓           │
                                                                 │  Your LLM    │
                                                                 └──────────────┘
  • NumberClaw converts speech to text and text back to speech.
  • Your agent receives plain text and returns plain text.
  • You do not need to handle audio streaming, PSTN, or telephony codecs.
  • NumberClaw never needs your LLM keys, your system prompt, or your private memory store.

02

Quick Start

Prerequisites:

  • Python 3.10 or newer.
  • A NumberClaw account with a Voice AI number.
  • An API key copied from the dashboard.

From zero to a working connector in about five minutes:

bash
# Install
pip install numberclaw

# Option A: Claude on a phone number
nc-connect --key YOUR_NC_KEY --handler anthropic --llm-key YOUR_ANTHROPIC_KEY

# Option B: Point at your existing API
nc-connect --key YOUR_NC_KEY --handler webhook --url http://localhost:8080/chat

# Option C: Custom Python handler
nc-connect --key YOUR_NC_KEY --handler custom --script handler.py

The CLI flags shown above come directly from the current connector implementation. The required flags are --key and --handler. The remaining flags depend on the handler you choose: --llm-key for the LLM-backed handlers (anthropic and openrouter), --url for webhook, and --script for custom.

03

Handler Interface

Your custom script must export a callable named handle_message.

python
async def handle_message(
    text: str,
    number: str,
    caller: str,
    caller_name: str | None,
    session_id: str,
    channel: str,
    **kwargs,
) -> str | HandlerResult | dict:
    ...  # your agent logic goes here

Parameter    Type        Description
text         str         The transcribed caller speech.
number       str         The called NumberClaw number in E.164 format, for example `+12125551234`.
caller       str         The caller phone number in E.164 format.
caller_name  str | None  Display name from caller mappings, if one exists.
session_id   str         Stable identifier for the active call session.
channel      str         Communication channel. Currently always `voice`.

Supported return types:

  • str for the simplest text response.
  • HandlerResult(text="...", transcript_posted=True | False) for explicit transcript control.
  • {"text": "...", "transcript_posted": true | false} if you prefer a dict.

NumberClaw speaks the returned text back to the caller. If transcript_posted is false, NumberClaw can also run its own transcript-sync path for configured messaging integrations.
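The three return shapes above can be reduced to a single (text, transcript_posted) pair. The sketch below illustrates one way to do that; it is not the connector's own code. The HandlerResult dataclass is a local stand-in whose fields are assumed from the constructor call shown above, and normalize_result is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class HandlerResult:
    """Local stand-in; field names assumed from the documented constructor call."""
    text: str
    transcript_posted: bool = False


def normalize_result(result: Union[str, HandlerResult, dict]) -> tuple[str, bool]:
    """Reduce any supported return type to (text, transcript_posted)."""
    if isinstance(result, str):
        # A bare string carries no flag, so it is treated as "not posted".
        return result, False
    if isinstance(result, HandlerResult):
        return result.text, result.transcript_posted
    if isinstance(result, dict):
        text = str(result.get("text", ""))
        return text, bool(result.get("transcript_posted", False))
    raise TypeError(f"unsupported handler return type: {type(result).__name__}")
```

Treating a bare string as transcript_posted=False matches the NC-managed mode described under Transcript Sync.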

04

Protocol Reference

The connector protocol is intentionally small. It only moves text messages and call-control events.

Type            Direction        Purpose
auth            Client → Server  Authenticate the connector using your NumberClaw API key.
auth_ok         Server → Client  Confirms the connector is live and returns the assigned agent name and numbers.
user_message    Server → Client  Transcribed caller speech plus routing metadata.
agent_response  Client → Server  Your handler response as plain text plus transcript sync preference.
cancel          Server → Client  Sent when the call ends or the in-flight message should be abandoned.
ping / pong     Bidirectional    Heartbeat frames to keep the connection healthy.

Authentication frame:

json
{"type": "auth", "api_key": "nc_live_xxx"}

Field    Type    Description
type     string  Always `auth` for the initial client authentication frame.
api_key  string  Your NumberClaw API key, for example `nc_live_xxx`.

Authentication success:

json
{"type": "auth_ok", "agent_name": "My Agent", "numbers": ["+12125551234"]}

Field       Type      Description
type        string    Always `auth_ok` when the server accepts the connector.
agent_name  string    Display name configured for the connector session.
numbers     string[]  E.164 numbers currently assigned to the connector.
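The handshake above amounts to sending one frame and checking one reply. Here is a minimal sketch of both halves that works on JSON strings only, so it does not depend on any particular WebSocket library. The function names are hypothetical. Since this page does not document a failure frame, the sketch treats anything other than auth_ok as an error.

```python
import json


def build_auth_frame(api_key: str) -> str:
    """Serialize the first frame the client sends after connecting."""
    return json.dumps({"type": "auth", "api_key": api_key})


def parse_auth_ok(raw: str) -> tuple[str, list[str]]:
    """Return (agent_name, numbers) from an auth_ok frame, or raise ValueError."""
    frame = json.loads(raw)
    if frame.get("type") != "auth_ok":
        # Assumption: any other first reply means authentication did not succeed.
        raise ValueError(f"expected auth_ok, got {frame.get('type')!r}")
    return frame["agent_name"], list(frame["numbers"])
```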

Inbound user message:

json
{
  "type": "user_message",
  "id": "m1",
  "text": "Hello, I need help with my order",
  "number": "+12125551234",
  "caller": "+14155550123",
  "caller_name": "Alice",
  "session_id": "sess-abc123",
  "channel": "voice"
}

Field            Type           Description
type             string         Always `user_message`.
id               string         Unique message id that must be echoed back in `agent_response`.
text             string         Current caller utterance after speech-to-text.
number           string         The NumberClaw number the caller dialed.
caller           string         Caller ID in E.164 format.
caller_name      string | null  Resolved display name from caller mappings, when available.
session_id       string         Stable call session identifier.
channel          string         Currently `voice`.
source_metadata  object | null  Optional source metadata forwarded to your handler as `source_metadata`.

Outbound agent response:

json
{
  "type": "agent_response",
  "id": "m1",
  "text": "Hi Alice! I'd be happy to help with your order.",
  "transcript_posted": false
}

Field              Type     Description
type               string   Always `agent_response`.
id                 string   The `user_message.id` you are replying to.
text               string   Plain text for NumberClaw to speak back to the caller.
transcript_posted  boolean  Set `true` if your handler already posted the transcript to its own channel.
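To show how a user_message maps onto an agent_response, here is a minimal sketch that unpacks one inbound frame, calls a handler, and echoes the message id back. The echo_handler and reply_to names are hypothetical. The sketch also assumes the handler returns a plain string, so transcript_posted is always false.

```python
import asyncio
import json


async def echo_handler(text, number, caller, caller_name, session_id, channel, **kwargs):
    """Hypothetical handler used only for this sketch."""
    return f"You said: {text}"


async def reply_to(raw_frame: str, handler=echo_handler) -> str:
    """Run the handler for one user_message frame and build the agent_response."""
    msg = json.loads(raw_frame)
    text = await handler(
        text=msg["text"],
        number=msg["number"],
        caller=msg["caller"],
        caller_name=msg.get("caller_name"),  # may be null
        session_id=msg["session_id"],
        channel=msg["channel"],
        source_metadata=msg.get("source_metadata"),  # lands in **kwargs
    )
    return json.dumps({
        "type": "agent_response",
        "id": msg["id"],  # must match the user_message id
        "text": text,
        "transcript_posted": False,  # plain-string reply, so not posted
    })
```

Calling asyncio.run(reply_to(frame)) with the example user_message above yields an agent_response carrying the same id, "m1".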

Cancel event:

json
{"type": "cancel", "id": "m1", "reason": "caller_hangup"}

Field   Type    Description
type    string  Always `cancel`.
id      string  The in-flight message id to stop processing.
reason  string  Why the response should be abandoned, such as `caller_hangup`.
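A client that speaks the protocol directly needs some way to find the work a cancel frame refers to. One possible approach, sketched below, is to keep a dictionary of asyncio tasks keyed by message id. The in_flight, track, and handle_cancel names are hypothetical and do not reflect how nc-connect is implemented internally.

```python
import asyncio

# In-flight handler tasks keyed by user_message id (hypothetical bookkeeping).
in_flight: dict[str, asyncio.Task] = {}


def track(message_id: str, task: asyncio.Task) -> None:
    """Remember a running handler task and forget it once it finishes."""
    in_flight[message_id] = task
    task.add_done_callback(lambda _t: in_flight.pop(message_id, None))


def handle_cancel(frame: dict) -> bool:
    """Cancel the task named by a cancel frame; return True if one was cancelled."""
    task = in_flight.get(frame["id"])
    if task is None or task.done():
        return False  # nothing to abandon, e.g. the reply was already sent
    task.cancel()
    return True


async def demo() -> tuple[bool, bool]:
    """Start a slow handler, cancel it, and report what happened."""
    async def slow_handler() -> str:
        await asyncio.sleep(60)
        return "too late"

    task = asyncio.create_task(slow_handler())
    track("m1", task)
    await asyncio.sleep(0)  # let the task start running
    accepted = handle_cancel({"type": "cancel", "id": "m1", "reason": "caller_hangup"})
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected: the handler was abandoned
    return accepted, task.cancelled()
```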

Heartbeat:

json
{"type": "ping"}
{"type": "pong"}

Field  Type    Description
type   string  Either `ping` or `pong` to keep the WebSocket healthy.

05

Built-in Handlers

The current CLI supports four handler types: anthropic, openrouter, webhook, and custom.

Anthropic

bash
nc-connect --key NC_KEY --handler anthropic --llm-key sk-ant-xxx
# Optional: --model claude-sonnet-4 --system-prompt "You are a receptionist"

Environment variables: ANTHROPIC_API_KEY, NC_MODEL (default claude-sonnet-4), NC_SYSTEM_PROMPT, NC_MAX_TOKENS.

OpenRouter

bash
nc-connect --key NC_KEY --handler openrouter --llm-key sk-or-xxx
# Optional: --model anthropic/claude-sonnet-4

Environment variables: OPENROUTER_API_KEY, NC_MODEL (default anthropic/claude-sonnet-4), NC_SYSTEM_PROMPT, NC_MAX_TOKENS.

Webhook

bash
nc-connect --key NC_KEY --handler webhook --url http://localhost:8080/chat

Environment variable: NC_WEBHOOK_URL. The handler sends a JSON payload with text, number, caller, caller_name, session_id, and channel. If extra metadata exists, it is sent under metadata. The endpoint should return {"response":"..."} or {"text":"..."}.
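The webhook contract described above can be sketched as two small functions: one that builds the outgoing JSON payload and one that reads the reply. Both names are hypothetical. The sketch makes two assumptions that this page does not state: that the "extra metadata" is the frame's source_metadata, and that response is preferred over text when a reply contains both keys.

```python
import json

PAYLOAD_KEYS = ("text", "number", "caller", "caller_name", "session_id", "channel")


def build_webhook_payload(msg: dict) -> dict:
    """Build the JSON body posted to the webhook for one caller utterance."""
    payload = {key: msg.get(key) for key in PAYLOAD_KEYS}
    extra = msg.get("source_metadata")  # assumption: this is the "extra metadata"
    if extra:
        payload["metadata"] = extra
    return payload


def extract_reply(body: str) -> str:
    """Read the reply text from a webhook response body."""
    data = json.loads(body)
    for key in ("response", "text"):  # assumption: "response" wins if both exist
        value = data.get(key)
        if isinstance(value, str) and value:
            return value
    raise ValueError("webhook reply must contain 'response' or 'text'")
```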

06

Custom Handler Guide

Use the custom mode when you already have an agent runtime or you want direct control over prompts, memory, tools, transcript policy, and external services.

Minimal example:

python
# handler.py
async def handle_message(text, number, caller, caller_name, session_id, channel, **kwargs):
    return f"You said: {text}"

bash
nc-connect --key NC_KEY --handler custom --script handler.py

A more advanced reference exists in examples/openclaw_handler.py. That example shows how to combine conversation history, prompt bootstrap, recall, and asynchronous transcript posting while still returning a plain HandlerResult to NumberClaw.

Webhook equivalents if you want to keep your agent behind HTTP:

python
from flask import Flask, jsonify, request

app = Flask(__name__)


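@app.post("/chat")
def chat():
    data = request.get_json(silent=True) or {}
    # caller_name may arrive as null when no caller mapping exists, so use `or`
    # rather than a dict default, which only applies when the key is missing.
    name = data.get("caller_name") or "caller"
    return jsonify({"response": f"Hello {name}"})


if __name__ == "__main__":
    app.run(port=8080)  # same port as the --url example above

javascript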
const express = require('express');
const app = express();

app.use(express.json());

app.post('/chat', (req, res) => {
  const { text, caller_name } = req.body;
  res.json({ response: `Hello ${caller_name || 'caller'}, you said: ${text}` });
});

app.listen(8080);

07

Caller ID Whitelisting

Voice AI numbers are intentionally restricted. Only registered caller IDs can reach the agent. This is not an optional recommendation. It is how the connector model keeps usage bounded and predictable.

  • Add a contact in the dashboard for a Voice AI number.
  • The contact's caller ID becomes eligible to reach the agent.
  • Everyone else is rejected before the call is routed into your connector.

Manage this from the dashboard path Numbers → selected number → Integrations → Caller Mappings.

08

Transcript Sync

The connector supports two transcript modes:

  • Agent-managed: your handler posts the transcript itself and returns transcript_posted=True.
  • NC-managed: your handler returns a plain string or transcript_posted=False, and NumberClaw handles its own sync path.

python
from nc_connect.connector import HandlerResult

async def call_my_llm(text: str) -> str:
    return f"Thanks for calling. You said: {text}"

async def post_to_my_telegram(caller: str, caller_text: str, response_text: str) -> None:
    print(caller, caller_text, response_text)

async def handle_message(text, number, caller, caller_name, session_id, channel, **kwargs):
    del number, caller_name, session_id, channel, kwargs
    response = await call_my_llm(text)
    await post_to_my_telegram(caller, text, response)
    return HandlerResult(text=response, transcript_posted=True)

Use agent-managed mode if you already have your own Telegram, Slack, or Discord delivery path. Use NC-managed mode if you want the dashboard-configured messaging integrations to receive the transcript.

09

Error Handling

  • Handler timeout: built-in handlers use request timeouts and return a short fallback such as "I need a moment. Could you try again?" or a generic processing message.
  • Handler exception: the connector logs the exception and sends a sanitized response instead of leaking internal details to the caller.
  • Disconnect: nc-connect automatically reconnects with exponential backoff starting at 1 second and capping at 30 seconds.
  • Caller hangs up: NumberClaw sends a cancel frame, and the connector cancels the in-flight task for that message.

The connector is designed to stay alive after individual handler failures. A bad call should not take down the whole connection.
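Two of the behaviors listed above, the reconnect schedule and the timeout fallback, can be sketched in a few lines. This is an illustration under stated assumptions, not the connector's implementation. The doubling factor is assumed (this page only says the backoff is exponential, from 1 second to a 30-second cap), and SANITIZED_TEXT is a hypothetical placeholder message.

```python
import asyncio
from itertools import islice
from typing import Iterator

FALLBACK_TEXT = "I need a moment. Could you try again?"
SANITIZED_TEXT = "Sorry, something went wrong."  # hypothetical wording


def backoff_delays(start: float = 1.0, cap: float = 30.0, factor: float = 2.0) -> Iterator[float]:
    """Yield reconnect delays that grow by `factor` and never exceed `cap`."""
    delay = start
    while True:
        yield delay
        delay = min(delay * factor, cap)


async def call_with_fallback(coro, timeout: float) -> str:
    """Await a handler coroutine; never let a timeout or exception escape."""
    try:
        return await asyncio.wait_for(coro, timeout)
    except asyncio.TimeoutError:
        return FALLBACK_TEXT
    except Exception:
        # A real connector would log the exception here before answering.
        return SANITIZED_TEXT


# First seven delays under the assumed doubling factor.
print(list(islice(backoff_delays(), 7)))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Catching the exception inside call_with_fallback keeps one failed message from propagating into the loop that owns the WebSocket, so a bad call does not take down the connection.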

10

FAQ

Do I need a public URL?

No. `nc-connect` opens an outbound WebSocket to NumberClaw, so it works behind NAT and normal firewalls.

Can I use my own LLM?

Yes. That is the point of the connector. You can use Anthropic, OpenRouter, a webhook, or a custom handler.

What if my agent is offline?

Callers hear your configured unavailable behavior on the NumberClaw side. NumberClaw does not fall back to running your LLM for you.

Can anyone call my number?

Only registered caller IDs can reach the agent. Unregistered callers are rejected before they get through.

What does NumberClaw see?

NumberClaw sees transcribed text and your agent's response. It does not see your LLM keys, your system prompt, or your conversation history outside what you choose to send back.

Next Step

Get a number and connect your agent

The connector stays static and text-only by design. Your agent stack remains yours.