Pipecat Voice Pipeline

Pipecat is a Python framework that orchestrates speech-to-text (STT), large language model (LLM), and text-to-speech (TTS) services. It supports Ollama, Whisper, and Piper as first-class services.

What Pipecat Gives You

A pipeline framework: audio capture → STT → LLM → TTS → playback
Service adapters for Ollama, Whisper, and Piper, so only minimal custom glue code is needed
Quickstart examples to test a full voice loop in minutes
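
The stage order listed above can be pictured as a chain of functions, each consuming the previous stage's output. The sketch below is not Pipecat's API (real Pipecat pipelines are asynchronous and pass frames between processors); `run_pipeline` and the three `fake_*` stages are hypothetical stand-ins that only illustrate the data flow.

```python
from typing import Any, Callable, Sequence

# A stage is any callable that takes the previous stage's output
# and returns the input for the next stage.
Stage = Callable[[Any], Any]


def run_pipeline(stages: Sequence[Stage], item: Any) -> Any:
    """Pass `item` through each stage in order and return the final result."""
    for stage in stages:
        item = stage(item)
    return item


# Hypothetical stand-ins for the real services (no models involved).
def fake_stt(audio: bytes) -> str:
    # Pretend the "audio" bytes already contain the spoken text.
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    # Echo-style reply in place of a call to Ollama.
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    # Pretend the encoded text is synthesized audio.
    return text.encode("utf-8")


reply_audio = run_pipeline([fake_stt, fake_llm, fake_tts], b"hello")
```

Swapping a stand-in for a real service changes only that one stage, which is the main appeal of the pipeline shape.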

Mapping to the Homelab

Ollama — Keep running bare-metal. Point Pipecat to http://localhost:11434.
Whisper (Docker) — Expose STT endpoint to Pipecat via host networking or published port.
Piper (Docker) — Wire container so Pipecat can post text and receive audio stream.
Pipecat server — Install on same host, configure service URIs, run quickstart to validate.
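
Before running the quickstart, it can help to confirm that each mapped service is listening where Pipecat will look for it. The following is a minimal sketch using only the standard library. Only the Ollama URL comes from the mapping above; the Whisper and Piper ports are placeholder assumptions, and `host_port`, `is_reachable`, and `report` are hypothetical helper names.

```python
import socket
from urllib.parse import urlparse

# Only the Ollama URL is taken from the notes; the Whisper and Piper
# URLs are assumed placeholders. Replace them with your published ports.
SERVICE_URLS = {
    "ollama": "http://localhost:11434",
    "whisper": "http://localhost:9000",  # assumed port
    "piper": "http://localhost:5000",    # assumed port
}


def host_port(url: str) -> tuple[str, int]:
    """Split a service URL into (host, port), defaulting to 80/443 by scheme."""
    parts = urlparse(url)
    default = 443 if parts.scheme == "https" else 80
    return parts.hostname or "localhost", parts.port or default


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_port(url), timeout=timeout):
            return True
    except OSError:
        return False


def report(urls: dict[str, str]) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_reachable(url) for name, url in urls.items()}
```

Calling `report(SERVICE_URLS)` after starting the containers shows which ports are open. A successful TCP connect only proves that something is listening, not that the service behind it is healthy.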

Minimal Wiring Checklist

pip install "pipecat-ai[ollama,whisper,piper]"
Confirm Ollama is reachable: ollama ps
Start Whisper and Piper containers; verify health endpoints
Configure services: { llm: ollama://..., stt: whisper://..., tts: piper://... } and run quickstart
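
The Ollama check in the list above can also be done from Python, which is convenient when the check should run in the same environment as Pipecat. This sketch queries Ollama's HTTP endpoint `GET /api/tags`, which lists locally installed models; the helper names `model_names` and `list_ollama_models` are hypothetical, and the base URL assumes the default port from this page.

```python
import json
from urllib.request import urlopen


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body of an /api/tags response."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 5.0) -> list[str]:
    """Ask a running Ollama server which models are installed locally.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

If `list_ollama_models()` returns without raising, the server is up; an empty list means no model has been pulled yet.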

Practical Notes

Networking — Run Pipecat and Ollama on the same host; use host networking for containers.
GPU — Verify that NVIDIA drivers are installed and that both Ollama and Faster-Whisper are configured to use the GPU.
Audio formats — Ensure Whisper container accepts Pipecat’s sample rate/format.
Security — Bind services to localhost; firewall external access.
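
For the audio-format note above, a quick way to catch mismatches is to inspect a captured WAV clip before sending it to the STT container. Whisper models operate on 16 kHz mono audio; the 16-bit sample width below is an assumption about what the container accepts, and `wav_format` and `format_mismatches` are hypothetical helper names. Some Whisper servers resample on their own, in which case this check is only informational.

```python
import io
import wave

# 16 kHz mono matches what Whisper models consume; 16-bit samples
# (sample_width=2 bytes) are an assumption, so adjust for your container.
EXPECTED = {"channels": 1, "sample_width": 2, "frame_rate": 16000}


def wav_format(data: bytes) -> dict[str, int]:
    """Read channel count, sample width (bytes), and sample rate from WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "frame_rate": wav.getframerate(),
        }


def format_mismatches(data: bytes,
                      expected: dict[str, int] = EXPECTED) -> dict[str, tuple[int, int]]:
    """Return {field: (actual, expected)} for every field that differs."""
    actual = wav_format(data)
    return {
        key: (actual[key], want)
        for key, want in expected.items()
        if actual[key] != want
    }
```

An empty result means the clip already matches the expected format; any entry names the field to convert before the audio reaches the STT service.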

Resources