Headless embed: use your own chat button
For developers who want complete control over the user interface, Babelbeez offers a Headless SDK. This lets you build your own buttons, voice visualizers, or entirely custom conversational experiences while Babelbeez handles:
- Audio streaming
- Voice Activity Detection (VAD)
- Connection management with OpenAI Realtime
This method replaces the standard widget and allows you to build a completely custom frontend interface for the voice agent.
Installation
Install the Babelbeez SDK via npm:
```bash
npm install @babelbeez/sdk
```

Initialization
Import the client and initialize it with your publicChatbotId. You can find this ID in your Babelbeez dashboard under Settings → Embed.
```ts
import { BabelbeezClient } from '@babelbeez/sdk';

const client = new BabelbeezClient({
  publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID_HERE',
});
```

Usage flow
The SDK is event-driven. You subscribe to state changes to update your UI (e.g., changing a button from “Start” to “Listening” or “Speaking”).
1. Listen to state changes
The buttonState event is the primary way to sync your UI with the agent’s status.
```ts
// Track state locally to prevent double-clicks during loading
let currentButtonState: string = 'idle';

client.on('buttonState', (state) => {
  console.log('Current State:', state);
  currentButtonState = state;

  // state can be:
  // 'idle'          - Disconnected, ready to start
  // 'loading'       - Connecting to server (disable the button!)
  // 'active'        - Connected, listening for user speech
  // 'speaking'      - Agent is currently talking
  // 'rag-retrieval' - Agent is searching the knowledge base
  // 'error'         - Connection failed
  updateMyCustomButton(state);
});
```

You are responsible for implementing updateMyCustomButton(state) to update your own DOM or framework components.
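For reference, updateMyCustomButton could be as simple as the sketch below. The voice-button element ID and the label texts are this example's own assumptions, not part of the SDK:

```ts
// Minimal sketch of updateMyCustomButton (element ID and labels are illustrative only).
function updateMyCustomButton(state: string) {
  const button = document.getElementById('voice-button') as HTMLButtonElement | null;
  if (!button) return;

  const labels: Record<string, string> = {
    idle: 'Start call',
    loading: 'Connecting...',
    active: 'Listening...',
    speaking: 'Speaking...',
    'rag-retrieval': 'Searching...',
    error: 'Try again',
  };

  button.textContent = labels[state] ?? 'Start call';
  button.disabled = state === 'loading'; // Block clicks while connecting
}
```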
2. Start and end sessions
Control the connection manually using initialize(), startChat(), and endChat().
Note: Always call initialize() before startChat() when starting a new session. This ensures you generate a fresh authentication token for the voice connection.
```ts
// Function to handle your custom button click
async function toggleChat() {
  // UX safeguard: prevent interaction while connecting/disconnecting
  if (currentButtonState === 'loading') return;

  if (client.isActive) {
    // Hang up
    await client.endChat();
  } else {
    // Start a session
    try {
      // 1. Initialize (fetches fresh config & token, prepares audio)
      await client.initialize();
      // 2. Connect to the Realtime API
      await client.startChat();
    } catch (err) {
      console.error('Failed to start chat:', err);
    }
  }
}
```

Wire this toggleChat() function to any button, icon, or control in your UI.
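For example, you might attach it to a plain DOM button (a sketch; the voice-button ID is an assumption):

```ts
// Hook toggleChat() up to your own markup, e.g. <button id="voice-button">Start call</button>
document.getElementById('voice-button')?.addEventListener('click', () => {
  void toggleChat(); // Fire and forget; errors are already handled inside toggleChat()
});
```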
Example: Custom ringtone
You can easily add custom audio feedback, like a ringtone that plays while connecting and stops when the session starts (or fails).
```ts
const ringtone = new Audio('/path/to/ringtone.mp3');
ringtone.loop = true;

// 1. Play the ringtone when the user clicks start
async function startCallWithRingtone() {
  try {
    ringtone.currentTime = 0;
    await ringtone.play(); // Start playing immediately

    await client.initialize();
    await client.startChat();
  } catch (err) {
    console.error('Failed to connect:', err);
    // Ensure the ringtone stops if the initial connection fails
    ringtone.pause();
    ringtone.currentTime = 0;
  }
}

// 2. Stop the ringtone on successful connection
client.on('session:start', () => {
  ringtone.pause();
  ringtone.currentTime = 0;
  console.log('Call connected!');
});

// 3. Stop the ringtone on error
client.on('error', (err) => {
  ringtone.pause();
  ringtone.currentTime = 0;
  console.error('Connection error:', err);
});
```

Example: Displaying transcripts
You can listen to the transcript event to build a real-time chat interface. The SDK emits events for both the user and the assistant as they speak.
```ts
const transcriptContainer = document.getElementById('messages');
let currentLine: HTMLDivElement | null = null;

client.on('transcript', ({ text, isFinal, role }) => {
  // role is 'user' or 'assistant'

  // 1. Create a new line if the role changed or the previous line was finished
  if (!currentLine || currentLine.dataset.role !== role || currentLine.dataset.final === 'true') {
    currentLine = document.createElement('div');
    currentLine.dataset.role = role;
    currentLine.className = role === 'user' ? 'message-user' : 'message-agent';
    transcriptContainer!.appendChild(currentLine);
  }

  // 2. Update the text content
  currentLine.textContent = text;

  // 3. Mark whether the sentence is complete
  currentLine.dataset.final = String(isFinal);

  // 4. Auto-scroll to the bottom
  transcriptContainer!.scrollTop = transcriptContainer!.scrollHeight;
});
```

Style .message-user and .message-agent in your CSS to match your app’s design.
Hybrid text & voice
The SDK supports sending text messages directly to the voice agent. This allows for hybrid interfaces where a user can speak or type, and the agent will respond via audio.
```ts
// Send a text message as if the user spoke it
// Note: The session must be active to send text
if (client.isActive) {
  client.sendUserText('Hello, do you have pricing for teams?');
}
```

Handling handoffs
If your agent is configured to hand off to a human (e.g., asking for an email or WhatsApp), the SDK emits specific events so you can render the appropriate forms.
When handoff:show fires, the Voice Agent pauses and waits for your UI to finish the flow. Under the hood, the request_human_handoff tool creates a Promise and waits for it to be resolved.
Your frontend must call one of:
- client.handleHandoffSubmit({ email, consent }) – user submitted the form
- client.handleHandoffCancel({ viaWhatsapp? }) – user cancelled or chose WhatsApp
If you only listen for handoff:show / handoff:hide and never call one of these methods, the tool’s Promise never resolves and the agent will appear to “hang” waiting for the handoff to finish.
Basic event wiring
```ts
client.on('handoff:show', (data) => {
  // data contains:
  // - summaryText: string (short description of the conversation so far)
  // - waLink: string | null (pre-filled WhatsApp deeplink, if configured)
  showMyHandoffForm(data); // Render your custom modal or form
});

client.on('handoff:hide', (data) => {
  // data.outcome will be:
  // - 'email_submitted'
  // - 'whatsapp_submitted'
  // - 'cancelled'
  hideMyHandoffForm();
});
```

You are free to design the form, modal, or flow to match your product while the agent logic and triggers remain configured in Babelbeez.
Complete example: custom handoff form
The following example shows a full email + WhatsApp flow that correctly closes the loop with the agent.
```js
// 1. Listen for the request and show your modal
client.on('handoff:show', (data) => {
  // data contains { summaryText, waLink }
  const modal = document.getElementById('my-handoff-modal');
  const summaryEl = document.getElementById('summary-preview');

  if (summaryEl && data.summaryText) {
    summaryEl.textContent = data.summaryText;
  }

  // Optional: WhatsApp button when waLink is provided
  const waBtn = document.getElementById('my-handoff-wa-button');
  if (waBtn) {
    if (data.waLink) {
      waBtn.style.display = 'inline-flex';
      waBtn.onclick = () => {
        window.open(data.waLink, '_blank', 'noopener');
        // Tell the SDK the user continued via WhatsApp (this also ends the session)
        client.handleHandoffCancel({ viaWhatsapp: true });
      };
    } else {
      waBtn.style.display = 'none';
      waBtn.onclick = null;
    }
  }

  if (modal) {
    modal.style.display = 'block';
  }
});

// 2. Handle the "Submit" action in your UI
document.getElementById('my-submit-button')?.addEventListener('click', async () => {
  const emailInput = document.getElementById('user-email-input');
  const consentCheckbox = document.getElementById('consent-checkbox');

  const email = emailInput && 'value' in emailInput ? emailInput.value.trim() : '';
  const consent = !!(consentCheckbox && 'checked' in consentCheckbox && consentCheckbox.checked);

  if (!email || !consent) {
    // Show your own validation error UI
    return;
  }

  // IMPORTANT: This resolves the request_human_handoff tool
  // and lets the agent continue the conversation.
  await client.handleHandoffSubmit({ email, consent });
});

// 3. Handle "Cancel" or closing the modal
document.getElementById('my-cancel-button')?.addEventListener('click', () => {
  // Tells the agent the user declined for now; the agent can ask
  // if there is anything else they can help with.
  client.handleHandoffCancel();
});

// 4. Listen for cleanup from the SDK and hide your UI
client.on('handoff:hide', (data) => {
  const modal = document.getElementById('my-handoff-modal');
  if (modal) {
    modal.style.display = 'none';
  }
  // data.outcome is 'email_submitted', 'whatsapp_submitted', or 'cancelled'
  console.log('Handoff finished with outcome:', data?.outcome);
});
```

This mirrors the behavior of the official headless SDK demo, ensuring the underlying request_human_handoff tool completes and the agent doesn’t get stuck waiting.
Full API reference
Properties
| Property | Type | Description |
|---|---|---|
| isActive | boolean | Returns true if the voice session is currently active and connected. |
| publicChatbotId | string | The unique identifier for the chatbot instance. |
Methods
| Method | Description |
|---|---|
| initialize() | Prepares the audio context and fetches fresh agent configuration/tokens. |
| startChat() | Establishes the WebSocket connection to OpenAI. |
| endChat(reason?) | Ends the current session. Optional reason string (default: 'user_action'). |
| sendUserText(text) | Sends a text message to the agent (processed as user input). |
| setMuted(muted) | Mutes (true) or unmutes (false) the user’s microphone. |
| handleHandoffSubmit({ email, consent }) | Called when the user submits your handoff form. Sends contact details to your backend and resolves the agent’s request_human_handoff tool as 'email_submitted'. |
| handleHandoffCancel({ viaWhatsapp? }) | Called when the user cancels the form or chooses WhatsApp. Resolves the handoff tool as 'cancelled' or 'whatsapp_submitted' and, in the WhatsApp case, ends the session. |
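For example, setMuted can back a simple mute toggle in your UI (a sketch; the mute-button element and local isMuted flag are assumptions, not part of the SDK):

```ts
// Minimal mute toggle built on client.setMuted(); everything else here is illustrative.
const muteButton = document.getElementById('mute-button') as HTMLButtonElement | null;
let isMuted = false;

if (muteButton) {
  muteButton.addEventListener('click', () => {
    isMuted = !isMuted;
    client.setMuted(isMuted); // true mutes the user's microphone, false unmutes it
    muteButton.textContent = isMuted ? 'Unmute' : 'Mute';
  });
}
```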
Events
| Event | Payload | Description |
|---|---|---|
| buttonState | string | The current functional state (idle, loading, active, speaking, rag-retrieval, error). |
| status | { text, isActive } | Human-readable status text and boolean active state. |
| error | { message, severity } | Emitted on connection errors or microphone denial. |
| transcript | { text, isFinal, role } | Real-time transcript updates for user or assistant. |
| handoff:show | { summaryText, waLink } | Triggered when the agent initiates a human handoff flow. summaryText is a short recap of the conversation; waLink is a pre-filled WhatsApp deeplink when configured. |
| handoff:hide | { outcome } | Triggered when the handoff flow is completed or cancelled. outcome is 'email_submitted', 'whatsapp_submitted', or 'cancelled'. |
| session:start | undefined | Emitted when the WebSocket connection is successfully established. |
| session:end | undefined | Emitted when the session ends (user hangs up, error, or agent ends it). |
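Similarly, the status and session:end events can keep a simple status label in sync (a sketch; the status-label element is an assumption):

```ts
const statusLabel = document.getElementById('status-label');

// Show the SDK's human-readable status text while a session is running
client.on('status', ({ text }) => {
  if (statusLabel) statusLabel.textContent = text;
});

// Reset the label once the session ends (hang-up, error, or agent-initiated)
client.on('session:end', () => {
  if (statusLabel) statusLabel.textContent = 'Call ended';
});
```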
With this Headless SDK, you can integrate Babelbeez deeply into your own components and design system while leaving the real-time audio and AI orchestration to us.
