Cartesia TTS

The Cartesia TTS provider enables your agent to use Cartesia's high-quality, low-latency text-to-speech models for generating natural-sounding voice output.

Installation

Install the Cartesia-enabled VideoSDK Agents package:

pip install "videosdk-plugins-cartesia"

Importing

from videosdk.plugins.cartesia import CartesiaTTS

Authentication

The Cartesia plugin requires a Cartesia API key.

Set CARTESIA_API_KEY in your .env file.
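
A quick pre-flight check can catch a missing key before the agent starts. The sketch below uses only the standard library. The helper name `require_cartesia_key` is hypothetical and not part of the SDK, and it assumes the .env file has already been loaded into the process environment.

```python
import os


def require_cartesia_key(env=None):
    """Return the Cartesia API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the VideoSDK or Cartesia SDKs.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = env.get("CARTESIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CARTESIA_API_KEY is not set; add it to your .env file or export it."
        )
    return key
```

Calling `require_cartesia_key()` once at startup fails fast with a readable message rather than surfacing the problem later, when the first speech request is made.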

Example Usage

from videosdk.plugins.cartesia import CartesiaTTS
from videosdk.agents import CascadingPipeline

# Initialize the Cartesia TTS model
tts = CartesiaTTS(
    # Omit api_key entirely if CARTESIA_API_KEY is set in your .env file
    api_key="your-cartesia-api-key",
    model="sonic-2",
    voice_id="794f9389-aac1-45b6-b726-9d9369183238",
    language="en"
)

# Add tts to cascading pipeline
pipeline = CascadingPipeline(tts=tts)

Note

When using a .env file for credentials, don't pass them as arguments to model instances. The SDK reads environment variables automatically, so omit api_key and other credential parameters from your code.

Configuration Options

api_key: (str) Your Cartesia API key. Can also be set via the CARTESIA_API_KEY environment variable.
model: (str) The Cartesia TTS model to use (e.g., "sonic-2", "sonic-turbo"). Defaults to "sonic-2".
voice_id: (str) The ID of the voice to use for generating speech.
language: (str) The language of the voice (e.g., "en", "fr"). Defaults to "en".
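
The options above can be collected in one place so that api_key is only passed when the environment does not already supply it. This is a minimal sketch: `cartesia_tts_kwargs` is a hypothetical helper, not an SDK function. The defaults for model and language are taken from the list above, and voice_id is left out when not provided because no default for it is stated here.

```python
import os


def cartesia_tts_kwargs(model="sonic-2", voice_id=None, language="en",
                        api_key=None, env=None):
    """Build keyword arguments for CartesiaTTS (hypothetical helper)."""
    env = os.environ if env is None else env
    kwargs = {"model": model, "language": language}
    # Leave voice_id out when it is not given, so the plugin decides.
    if voice_id is not None:
        kwargs["voice_id"] = voice_id
    # Pass api_key explicitly only when the environment does not provide one.
    if api_key is not None and not env.get("CARTESIA_API_KEY"):
        kwargs["api_key"] = api_key
    return kwargs
```

The result can then be unpacked into the constructor, for example `CartesiaTTS(**cartesia_tts_kwargs(voice_id="794f9389-aac1-45b6-b726-9d9369183238"))`.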

Additional resources

The following resources provide more information about using Cartesia with VideoSDK Agents.

Cartesia docs: Cartesia TTS docs.

SDK Reference

GitHub Repository

Python Package
