
Prototyping Voice Agents with LiveKit in 2025
Our Prototype Development capability thrives on fast feedback. In 2025 the LiveKit ecosystem — Agents SDK, third-party voice partners, and tooling around OpenAI’s Realtime API — makes it possible to stand up expressive, low-latency voice experiences in days.
Architecture at a Glance
Caller ↔︎ LiveKit Room ↔︎ Agent Runtime ↔︎ LLM + Voice Services ↔︎ Internal APIs
- LiveKit Agents let Node.js or Python services join rooms as first-class participants, streaming audio, video, and metadata in real time. (Docs)
- Speechify partnership (May 2025) delivers >1,000 voices across 60 languages with pricing that scales ($10 per million characters ≈ 2,000 minutes). (Speechify)
- PlayAI + LiveKit (March 2025) brings ultra-emotive dialog models routed through LiveKit Agents, perfect for concierge-style prototypes. (Play.ht)
- OpenAI Realtime API provides low-latency speech-to-speech with function calling, so the agent can trigger tools in addition to talking, while LiveKit handles session orchestration.
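The Speechify pricing above can be turned into a quick budget check. The sketch below is a minimal estimate, assuming the quoted $10 per million characters and the speaking rate implied by the same figure (1,000,000 characters / 2,000 minutes = 500 characters per minute). The function names are illustrative.

```typescript
// Pricing figures quoted in the text above; adjust if the vendor's rates change.
const USD_PER_MILLION_CHARS = 10;
const MINUTES_PER_MILLION_CHARS = 2_000;

// Implied speaking rate: 1,000,000 / 2,000 = 500 characters per minute.
const CHARS_PER_MINUTE = 1_000_000 / MINUTES_PER_MILLION_CHARS;

// Cost of synthesizing a given number of characters.
function estimateTtsCostUsd(chars: number): number {
  return (chars / 1_000_000) * USD_PER_MILLION_CHARS;
}

// Cost of a given number of synthesized minutes at the implied speaking rate.
function estimateTtsCostForMinutes(minutes: number): number {
  return estimateTtsCostUsd(minutes * CHARS_PER_MINUTE);
}
```

Under these assumptions, a pilot of 20 calls with 5 synthesized minutes each (100 minutes, 50,000 characters) would cost about $0.50 in synthesis.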
Prototype Roadmap
Day 0: Scope + Guardrails
- Define the outcome metric (conversion, CSAT, qualified lead rate).
- Validate data access: CRM notes, knowledge base, escalation workflows.
- Draft failure policies (human handoff triggers, redaction rules).
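Failure policies are easier to review when they exist as data from Day 0. The following is a minimal sketch of handoff triggers and redaction rules. The `FailurePolicy` shape, the trigger phrases, and the two regular expressions are assumptions for illustration; a real deployment would need a vetted PII pipeline.

```typescript
// Hypothetical policy shape; field names are illustrative, not from any SDK.
type FailurePolicy = {
  maxFailedTurns: number; // consecutive misunderstood turns before handoff
  handoffPhrases: string[]; // caller phrases that trigger a human immediately
};

const policy: FailurePolicy = {
  maxFailedTurns: 2,
  handoffPhrases: ["speak to a human", "representative", "agent please"],
};

// True when the caller asks for a person or the agent has failed too often.
function shouldHandOff(transcript: string, failedTurns: number, p: FailurePolicy): boolean {
  const text = transcript.toLowerCase();
  return failedTurns >= p.maxFailedTurns || p.handoffPhrases.some((phrase) => text.includes(phrase));
}

// Naive redaction rules: email addresses and 13-16 digit card-like numbers.
const REDACTION_RULES: Array<[RegExp, string]> = [
  [/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[EMAIL]"],
  [/\b\d(?:[ -]?\d){12,15}\b/g, "[CARD]"],
];

// Apply every rule in order before the transcript is logged or stored.
function redact(text: string): string {
  return REDACTION_RULES.reduce((out, [pattern, label]) => out.replace(pattern, label), text);
}
```

Keeping the policy in one typed object lets operators and compliance reviewers change thresholds without touching the call flow.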
Day 1-2: Skeleton Build
- Spin up a LiveKit project and configure the Agents runtime.
- Scaffold an Interaction Orchestrator that:
  - Joins a room.
  - Performs speech-to-text (OpenAI Whisper, ElevenLabs, or Deepgram).
  - Publishes synthesized speech back through Speechify or PlayAI voices.
- Implement tool stubs for CRM lookup, scheduling, and follow-up tasks.
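One way to scaffold the Interaction Orchestrator before any vendor is chosen is to code against small interfaces and plug in stubs. The sketch below assumes three hypothetical interfaces (`SpeechToText`, `Responder`, `TextToSpeech`); none of them come from the LiveKit or vendor SDKs, and room handling is left out.

```typescript
// Hypothetical interfaces; wire them to whichever STT, LLM, and TTS vendors you choose.
interface SpeechToText { transcribe(audio: Uint8Array): Promise<string>; }
interface Responder { reply(transcript: string): Promise<string>; }
interface TextToSpeech { synthesize(text: string): Promise<Uint8Array>; }

type TurnResult = { transcript: string; replyText: string; replyAudio: Uint8Array };

class InteractionOrchestrator {
  private readonly stt: SpeechToText;
  private readonly responder: Responder;
  private readonly tts: TextToSpeech;

  constructor(stt: SpeechToText, responder: Responder, tts: TextToSpeech) {
    this.stt = stt;
    this.responder = responder;
    this.tts = tts;
  }

  // One conversational turn: caller audio in, agent audio out.
  async handleTurn(audio: Uint8Array): Promise<TurnResult> {
    const transcript = await this.stt.transcribe(audio);
    const replyText = await this.responder.reply(transcript);
    const replyAudio = await this.tts.synthesize(replyText);
    return { transcript, replyText, replyAudio };
  }
}

// Stub implementations that return canned values so the flow can be exercised offline.
const stubStt: SpeechToText = { transcribe: async () => "what are your opening hours" };
const stubResponder: Responder = { reply: async (t) => `You asked: ${t}` };
const stubTts: TextToSpeech = { synthesize: async (text) => new TextEncoder().encode(text) };

const orchestrator = new InteractionOrchestrator(stubStt, stubResponder, stubTts);
```

Swapping a stub for a real vendor client then only touches the object passed to the constructor, which keeps the Day 1-2 skeleton stable while voices and models are still being compared.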
Day 3-4: Intent + Memory
- Add vector + graph retrieval (CAPRAG-inspired) for contextual answers.
- Persist short-term memory per caller: intent, sentiment, key objections.
- Script “golden path” demos to validate latency, voice quality, and handoff.
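Per-caller short-term memory can start as a small in-process store. This is a minimal sketch, assuming an in-memory `Map` with a time-to-live; the `CallerMemory` fields mirror the items listed above (intent, sentiment, key objections), and the class name and TTL behavior are illustrative. A pilot would likely replace the `Map` with a shared store.

```typescript
type CallerMemory = {
  intent?: string;
  sentiment?: "positive" | "neutral" | "negative";
  objections: string[];
  updatedAt: number; // epoch milliseconds of the last update
};

class ShortTermMemory {
  private readonly entries = new Map<string, CallerMemory>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Merge new observations into the caller's record; objections accumulate.
  update(
    callerId: string,
    patch: Partial<Omit<CallerMemory, "updatedAt">>,
    now: number = Date.now(),
  ): CallerMemory {
    const previous = this.get(callerId, now) ?? { objections: [], updatedAt: now };
    const next: CallerMemory = {
      ...previous,
      ...patch,
      objections: [...previous.objections, ...(patch.objections ?? [])],
      updatedAt: now,
    };
    this.entries.set(callerId, next);
    return next;
  }

  // Return the record unless it has expired, in which case it is dropped.
  get(callerId: string, now: number = Date.now()): CallerMemory | undefined {
    const entry = this.entries.get(callerId);
    if (!entry) return undefined;
    if (now - entry.updatedAt > this.ttlMs) {
      this.entries.delete(callerId);
      return undefined;
    }
    return entry;
  }
}
```

Passing `now` explicitly makes expiry deterministic in tests and in scripted golden-path demos.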
Day 5-7: Pilot Hardening
- Instrument metrics: latency, interruption handling, first-contact resolution.
- Layer in consent prompts, profanity filters, and audit logging.
- Conduct operator ride-alongs; capture friction for fast iteration.
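Latency is the simplest of these metrics to instrument early. The sketch below records per-turn latency and reports nearest-rank percentiles; the class and method names are hypothetical, and it assumes the caller can supply timestamps for the end of the user's utterance and the first agent audio.

```typescript
// Nearest-rank percentile over recorded turn latencies (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples recorded");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

class TurnLatencyTracker {
  private readonly samplesMs: number[] = [];

  // Record one turn: time between the end of the utterance and the first agent audio.
  record(utteranceEndMs: number, firstAudioMs: number): void {
    this.samplesMs.push(firstAudioMs - utteranceEndMs);
  }

  // Median and tail latency for dashboards or a pass/fail gate.
  summary(): { count: number; p50: number; p95: number } {
    return {
      count: this.samplesMs.length,
      p50: percentile(this.samplesMs, 50),
      p95: percentile(this.samplesMs, 95),
    };
  }
}
```

Reporting p95 alongside the median matters here because a few slow turns are what callers tend to notice.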
Starter Code Snippets
Agent bootstrap
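    // Illustrative sketch: treat the package names and the Agent, pipeAudio, and
    // createRealtimeClient helpers as placeholders, and check the current LiveKit
    // Agents and OpenAI Realtime documentation for the exact exports.
    import { Agent, RoomServiceClient } from "@livekit/agents-node";
    import { createRealtimeClient } from "@openai/realtime-api";

    export const startAgent = async () => {
      // Server-side client authenticated with the LiveKit project credentials.
      const roomClient = new RoomServiceClient({
        apiKey: process.env.LIVEKIT_API_KEY ?? "",
        apiSecret: process.env.LIVEKIT_API_SECRET ?? "",
        url: process.env.LIVEKIT_URL ?? "",
      });
      const agent = new Agent({ roomClient });

      // Open a Realtime session for each caller that joins the room.
      agent.onParticipantConnected(async (participant) => {
        const realtimeClient = await createRealtimeClient({
          apiKey: process.env.OPENAI_API_KEY ?? "",
          baseUrl: "https://api.openai.com/v1/realtime",
        });
        // Wire transcription + synthesis: caller audio in, model audio out.
        agent.pipeAudio(participant, realtimeClient);
      });

      // Join the room as a participant so callers can hear the agent.
      await agent.join({
        roomName: "prototype-support-line",
        identity: "voice-agent",
        // LiveKit participant metadata is a string, so serialize the object.
        metadata: JSON.stringify({ persona: "concierge" }),
      });
    };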
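The bootstrap above falls back to empty strings when an environment variable is missing, which tends to surface later as an opaque authentication error. A minimal sketch of failing fast instead, assuming the same four variable names; `requireEnv` and `loadAgentConfig` are hypothetical helpers, not part of any SDK.

```typescript
type Env = Record<string, string | undefined>;

// Read a required variable or throw with a message naming what is missing.
function requireEnv(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Collect every credential the agent needs in one place, before any network call.
function loadAgentConfig(env: Env) {
  return {
    livekitUrl: requireEnv(env, "LIVEKIT_URL"),
    livekitApiKey: requireEnv(env, "LIVEKIT_API_KEY"),
    livekitApiSecret: requireEnv(env, "LIVEKIT_API_SECRET"),
    openaiApiKey: requireEnv(env, "OPENAI_API_KEY"),
  };
}
```

In a Node.js service this would be called once at startup as `loadAgentConfig(process.env)`.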
Tool invocation pattern
    type FollowUpTask = {
      contactId: string;
      summary: string;
      dueAt: Date;
    };

    export const scheduleFollowUp = async (task: FollowUpTask) => {
      // Call internal API with auditable payload
    };
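The stub above can be fleshed out without committing to a transport. The following is a sketch under stated assumptions: the `AuditedRequest` shape, the injected `PostJson` function, and the `/internal/follow-ups` path are all hypothetical, chosen only to show validation plus an auditable payload.

```typescript
// Same shape as the FollowUpTask type above, repeated so the block is self-contained.
type FollowUpTask = { contactId: string; summary: string; dueAt: Date };

// Hypothetical audit envelope: what was requested, when, and with which arguments.
type AuditedRequest = {
  tool: "scheduleFollowUp";
  requestedAt: string; // ISO 8601 timestamp
  payload: { contactId: string; summary: string; dueAt: string };
};

// Transport is injected so the stub can be swapped for a real HTTP client later.
type PostJson = (path: string, body: AuditedRequest) => Promise<{ ok: boolean }>;

// Validate the task and build the payload that will be sent and logged.
function buildFollowUpRequest(task: FollowUpTask, now: Date): AuditedRequest {
  if (task.summary.trim() === "") throw new Error("summary must not be empty");
  if (task.dueAt.getTime() <= now.getTime()) throw new Error("dueAt must be in the future");
  return {
    tool: "scheduleFollowUp",
    requestedAt: now.toISOString(),
    payload: {
      contactId: task.contactId,
      summary: task.summary.trim(),
      dueAt: task.dueAt.toISOString(),
    },
  };
}

async function scheduleFollowUpVia(
  post: PostJson,
  task: FollowUpTask,
  now: Date = new Date(),
): Promise<AuditedRequest> {
  const request = buildFollowUpRequest(task, now);
  const response = await post("/internal/follow-ups", request); // hypothetical path
  if (!response.ok) throw new Error("follow-up API rejected the request");
  return request; // the caller can write this to the audit log
}
```

Separating payload construction from the network call means the audit record can be tested and reviewed without a live internal API.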
Demo Checklist
- Low latency: Target < 300 ms from the end of the user's utterance to the first agent audio.
- Emotionally aware voices: Select Speechify or PlayAI voices that match the brand tone.
- Turn detection: Ensure the agent waits for a natural pause before responding and stops speaking when the caller interrupts.
- Escalation ready: Human agent can join the LiveKit room instantly with full transcript context.
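For the turn-detection item, a naive baseline helps during demos before a proper voice-activity or turn-detection model is wired in. The sketch below is a simple energy-threshold endpointer, not LiveKit's turn detector; the `Frame` shape, the threshold, and the silence window are assumptions to be tuned per microphone and codec.

```typescript
// One audio frame summarized by its RMS energy (0..1) and duration.
type Frame = { rms: number; durationMs: number };

class SilenceEndpointer {
  private heardSpeech = false;
  private silenceMs = 0;
  private readonly speechThreshold: number;
  private readonly minSilenceMs: number;

  constructor(speechThreshold: number, minSilenceMs: number) {
    this.speechThreshold = speechThreshold;
    this.minSilenceMs = minSilenceMs;
  }

  // Returns true once the caller has spoken and then stayed quiet long enough.
  push(frame: Frame): boolean {
    if (frame.rms >= this.speechThreshold) {
      this.heardSpeech = true;
      this.silenceMs = 0; // any speech resets the silence window
      return false;
    }
    if (!this.heardSpeech) return false; // leading silence is not a turn
    this.silenceMs += frame.durationMs;
    if (this.silenceMs >= this.minSilenceMs) {
      this.heardSpeech = false; // reset for the next turn
      this.silenceMs = 0;
      return true;
    }
    return false;
  }
}
```

A longer silence window reduces premature responses but adds directly to perceived latency, so it trades off against the latency target in the same checklist.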
Proving Value in Week One
- Run 20+ controlled calls and benchmark against human-handled metrics.
- Capture qualitative feedback from operators and customers.
- Present latency, containment, and satisfaction dashboards to stakeholders.
- Secure the runway for “Production Alpha” by aligning on data, observability, and compliance gaps.
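The week-one numbers can be summarized with a few lines of arithmetic. This is a minimal sketch, assuming containment means a call the agent resolved without escalation and that CSAT is an optional 1-5 score; the `CallOutcome` shape is hypothetical.

```typescript
type CallOutcome = { resolvedByAgent: boolean; escalated: boolean; csat?: number };

// Containment rate and average CSAT across the pilot calls.
function summarizePilot(calls: CallOutcome[]): {
  total: number;
  containmentRate: number;
  avgCsat: number | undefined;
} {
  const total = calls.length;
  if (total === 0) throw new Error("no calls recorded");
  const contained = calls.filter((c) => c.resolvedByAgent && !c.escalated).length;
  const ratings = calls
    .map((c) => c.csat)
    .filter((score): score is number => score !== undefined);
  const avgCsat =
    ratings.length === 0 ? undefined : ratings.reduce((sum, score) => sum + score, 0) / ratings.length;
  return { total, containmentRate: contained / total, avgCsat };
}
```

Running the same function over a sample of human-handled calls gives a like-for-like baseline for the stakeholder dashboards.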
What Comes After the Prototype
- Transition the codebase into the engineering runway with automated testing, feature flags, and infrastructure-as-code.
- Expand language and channel coverage — LiveKit handles video, screen share, and data tracks out of the box.
- Start training playbooks for marketing, sales, and support teams so they can design new flows on top of the voice agent.
© 2026 Petrus Strategy LLC.