Headless embed: use your own chat button
For developers who want complete control over the user interface, Babelbeez offers a Headless SDK. This lets you build your own buttons, voice visualizers, or entirely custom conversational experiences while Babelbeez handles:
- Audio streaming
- Voice Activity Detection (VAD)
- Connection management with OpenAI Realtime
This method replaces the standard widget and allows you to build a completely custom frontend interface for the voice agent.
Installation
Install the Babelbeez SDK via npm:
```bash
npm install @babelbeez/sdk
```

Initialization
Import the client and initialize it with your publicChatbotId. You can find this ID in your Babelbeez dashboard under Settings → Embed.
```ts
import { BabelbeezClient } from '@babelbeez/sdk';

const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID_HERE',
});
```

Usage flow
The SDK is event-driven. You subscribe to state changes to update your UI (e.g., changing a button from “Start” to “Listening” or “Speaking”).
1. Listen to state changes
The buttonState event is the primary way to sync your UI with the agent’s status.
```ts
// Track state locally to prevent double-clicks during loading
let currentButtonState: string = 'idle';

client.on('buttonState', (state) => {
  console.log('Current State:', state);
  currentButtonState = state;

  // state can be:
  // 'idle'          - Disconnected, ready to start
  // 'loading'       - Connecting to server (disable the button!)
  // 'active'        - Connected, listening for user speech
  // 'speaking'      - Agent is currently talking
  // 'rag-retrieval' - Agent is searching the knowledge base
  // 'error'         - Connection failed
  updateMyCustomButton(state);
});
```

You are responsible for implementing updateMyCustomButton(state) to update your own DOM or framework components.
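For reference, updateMyCustomButton could be as simple as the sketch below. The voice-button element ID and the label texts are this example's own assumptions, not part of the SDK:

```ts
// Minimal sketch of updateMyCustomButton (element ID and labels are illustrative only).
function updateMyCustomButton(state: string) {
  const button = document.getElementById('voice-button') as HTMLButtonElement | null;
  if (!button) return;

  const labels: Record<string, string> = {
    idle: 'Start call',
    loading: 'Connecting...',
    active: 'Listening...',
    speaking: 'Speaking...',
    'rag-retrieval': 'Searching...',
    error: 'Try again',
  };

  button.textContent = labels[state] ?? 'Start call';
  button.disabled = state === 'loading'; // Block clicks while connecting
}
```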
2. Start and end sessions
Control the connection manually using initialize(), startChat(), and endChat().
Note: Always call initialize() before startChat() when starting a new session. This ensures you generate a fresh authentication token for the voice connection.
```ts
// Function to handle your custom button click
async function toggleChat() {
  // UX safeguard: prevent interaction while connecting/disconnecting
  if (currentButtonState === 'loading') return;

  if (client.isActive) {
    // Hang up
    await client.endChat();
  } else {
    // Start a session
    try {
      // 1. Initialize (fetches fresh config & token, prepares audio)
      await client.initialize();
      // 2. Connect to the Realtime API
      await client.startChat();
    } catch (err) {
      console.error('Failed to start chat:', err);
    }
  }
}
```

Wire this toggleChat() function to any button, icon, or control in your UI.
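For example, you might attach it to a plain DOM button (a sketch; the voice-button ID is an assumption):

```ts
// Hook toggleChat() up to your own markup, e.g. <button id="voice-button">Start call</button>
document.getElementById('voice-button')?.addEventListener('click', () => {
  void toggleChat(); // Fire and forget; errors are already handled inside toggleChat()
});
```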
Example: Custom ringtone
You can easily add custom audio feedback, like a ringtone that plays while connecting and stops when the session starts (or fails).
```ts
const ringtone = new Audio('/path/to/ringtone.mp3');
ringtone.loop = true;

// 1. Play the ringtone when the user clicks start
async function startCallWithRingtone() {
  try {
    ringtone.currentTime = 0;
    await ringtone.play(); // Start playing immediately

    await client.initialize();
    await client.startChat();
  } catch (err) {
    console.error('Failed to connect:', err);
    // Ensure the ringtone stops if the initial connection fails
    ringtone.pause();
    ringtone.currentTime = 0;
  }
}

// 2. Stop the ringtone on successful connection
client.on('session:start', () => {
  ringtone.pause();
  ringtone.currentTime = 0;
  console.log('Call connected!');
});

// 3. Stop the ringtone on error
client.on('error', (err) => {
  ringtone.pause();
  ringtone.currentTime = 0;
  console.error('Connection error:', err);
});
```

Example: Displaying transcripts
You can listen to the transcript event to build a real-time chat interface. The SDK emits events for both the user and the assistant as they speak.
```ts
const transcriptContainer = document.getElementById('messages');
let currentLine: HTMLDivElement | null = null;

client.on('transcript', ({ text, isFinal, role }) => {
  // role is 'user' or 'assistant'

  // 1. Create a new line if the role changed or the previous line was finished
  if (!currentLine || currentLine.dataset.role !== role || currentLine.dataset.final === 'true') {
    currentLine = document.createElement('div');
    currentLine.dataset.role = role;
    currentLine.className = role === 'user' ? 'message-user' : 'message-agent';
    transcriptContainer!.appendChild(currentLine);
  }

  // 2. Update the text content
  currentLine.textContent = text;

  // 3. Mark whether the sentence is complete
  currentLine.dataset.final = String(isFinal);

  // 4. Auto-scroll to the bottom
  transcriptContainer!.scrollTop = transcriptContainer!.scrollHeight;
});
```

Style .message-user and .message-agent in your CSS to match your app’s design.
Hybrid text & voice
The SDK supports sending text messages directly to the voice agent. This allows for hybrid interfaces where a user can speak or type, and the agent will respond via audio.
```ts
// Send a text message as if the user spoke it
// Note: The session must be active to send text
if (client.isActive) {
  client.sendUserText('Hello, do you have pricing for teams?');
}
```

Handling handoffs
If your agent is configured to hand off to a human (e.g., asking for an email or WhatsApp), the SDK emits specific events so you can render the appropriate forms.
When handoff:show fires, the Voice Agent pauses and waits for your UI to finish the flow. Under the hood, the request_human_handoff tool creates a Promise and waits for it to be resolved.
Your frontend must call one of:
- client.handleHandoffSubmit({ email, consent }) – user submitted the form
- client.handleHandoffCancel({ viaWhatsapp? }) – user cancelled or chose WhatsApp
If you only listen for handoff:show / handoff:hide and never call one of these methods, the tool’s Promise never resolves and the agent will appear to “hang” waiting for the handoff to finish.
Basic event wiring
```ts
client.on('handoff:show', (data) => {
  // data contains:
  // - summaryText: string (short description of the conversation so far)
  // - waLink: string | null (pre-filled WhatsApp deeplink, if configured)
  showMyHandoffForm(data); // Render your custom modal or form
});

client.on('handoff:hide', (data) => {
  // data.outcome will be:
  // - 'email_submitted'
  // - 'whatsapp_submitted'
  // - 'cancelled'
  hideMyHandoffForm();
});
```

You are free to design the form, modal, or flow to match your product while the agent logic and triggers remain configured in Babelbeez.
Complete example: custom handoff form
The following example shows a full email + WhatsApp flow that correctly closes the loop with the agent.
```js
// 1. Listen for the request and show your modal
client.on('handoff:show', (data) => {
  // data contains { summaryText, waLink }
  const modal = document.getElementById('my-handoff-modal');
  const summaryEl = document.getElementById('summary-preview');

  if (summaryEl && data.summaryText) {
    summaryEl.textContent = data.summaryText;
  }

  // Optional: WhatsApp button when waLink is provided
  const waBtn = document.getElementById('my-handoff-wa-button');
  if (waBtn) {
    if (data.waLink) {
      waBtn.style.display = 'inline-flex';
      waBtn.onclick = () => {
        window.open(data.waLink, '_blank', 'noopener');
        // Tell the SDK the user continued via WhatsApp (this also ends the session)
        client.handleHandoffCancel({ viaWhatsapp: true });
      };
    } else {
      waBtn.style.display = 'none';
      waBtn.onclick = null;
    }
  }

  if (modal) {
    modal.style.display = 'block';
  }
});

// 2. Handle the "Submit" action in your UI
document.getElementById('my-submit-button')?.addEventListener('click', async () => {
  const emailInput = document.getElementById('user-email-input');
  const consentCheckbox = document.getElementById('consent-checkbox');

  const email = emailInput && 'value' in emailInput ? emailInput.value.trim() : '';
  const consent = !!(consentCheckbox && 'checked' in consentCheckbox && consentCheckbox.checked);

  if (!email || !consent) {
    // Show your own validation error UI
    return;
  }

  // IMPORTANT: This resolves the request_human_handoff tool
  // and lets the agent continue the conversation.
  await client.handleHandoffSubmit({ email, consent });
});

// 3. Handle "Cancel" or closing the modal
document.getElementById('my-cancel-button')?.addEventListener('click', () => {
  // Tells the agent the user declined for now; the agent can ask
  // if there is anything else they can help with.
  client.handleHandoffCancel();
});

// 4. Listen for cleanup from the SDK and hide your UI
client.on('handoff:hide', (data) => {
  const modal = document.getElementById('my-handoff-modal');
  if (modal) {
    modal.style.display = 'none';
  }
  // data.outcome is 'email_submitted', 'whatsapp_submitted', or 'cancelled'
  console.log('Handoff finished with outcome:', data?.outcome);
});
```

This mirrors the behavior of the official headless SDK demo, ensuring the underlying request_human_handoff tool completes and the agent doesn’t get stuck waiting.
Full API reference
Properties
| Property | Type | Description |
|---|---|---|
| isActive | boolean | Returns true if the voice session is currently active and connected. |
| publicChatbotId | string | The unique identifier for the chatbot instance. |
Methods
| Method | Description |
|---|---|
| initialize() | Prepares the audio context and fetches fresh agent configuration/tokens. |
| startChat() | Establishes the WebSocket connection to OpenAI. |
| endChat(reason?) | Ends the current session. Optional reason string (default: 'user_action'). |
| sendUserText(text) | Sends a text message to the agent (processed as user input). |
| setMuted(muted) | Mutes (true) or unmutes (false) the user’s microphone. |
| handleHandoffSubmit({ email, consent }) | Called when the user submits your handoff form. Sends contact details to your backend and resolves the agent’s request_human_handoff tool as 'email_submitted'. |
| handleHandoffCancel({ viaWhatsapp? }) | Called when the user cancels the form or chooses WhatsApp. Resolves the handoff tool as 'cancelled' or 'whatsapp_submitted' and, in the WhatsApp case, ends the session. |
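For example, setMuted can back a simple mute toggle in your UI (a sketch; the mute-button element and local isMuted flag are assumptions, not part of the SDK):

```ts
// Minimal mute toggle built on client.setMuted(); everything else here is illustrative.
const muteButton = document.getElementById('mute-button') as HTMLButtonElement | null;
let isMuted = false;

if (muteButton) {
  muteButton.addEventListener('click', () => {
    isMuted = !isMuted;
    client.setMuted(isMuted); // true mutes the user's microphone, false unmutes it
    muteButton.textContent = isMuted ? 'Unmute' : 'Mute';
  });
}
```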
Events
| Event | Payload | Description |
|---|---|---|
| buttonState | string | The current functional state (idle, loading, active, speaking, rag-retrieval, error). |
| status | { text, isActive } | Human-readable status text and boolean active state. |
| error | { message, severity } | Emitted on connection errors or microphone denial. |
| transcript | { text, isFinal, role } | Real-time transcript updates for user or assistant. |
| handoff:show | { summaryText, waLink } | Triggered when the agent initiates a human handoff flow. summaryText is a short recap of the conversation; waLink is a pre-filled WhatsApp deeplink when configured. |
| handoff:hide | { outcome } | Triggered when the handoff flow is completed or cancelled. outcome is 'email_submitted', 'whatsapp_submitted', or 'cancelled'. |
| session:start | undefined | Emitted when the WebSocket connection is successfully established. |
| session:end | undefined | Emitted when the session ends (user hangs up, error, or agent ends it). |
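Similarly, the status and session:end events can keep a simple status label in sync (a sketch; the status-label element is an assumption):

```ts
const statusLabel = document.getElementById('status-label');

// Show the SDK's human-readable status text while a session is running
client.on('status', ({ text }) => {
  if (statusLabel) statusLabel.textContent = text;
});

// Reset the label once the session ends (hang-up, error, or agent-initiated)
client.on('session:end', () => {
  if (statusLabel) statusLabel.textContent = 'Call ended';
});
```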
With this Headless SDK, you can integrate Babelbeez deeply into your own components and design system while leaving the real-time audio and AI orchestration to us.
