
Agent Session

The AgentSession is the central orchestrator that integrates the Agent, Pipeline, and optional ConversationFlow into a cohesive workflow. It manages the complete lifecycle of an agent's interaction within a VideoSDK meeting, handling initialization, execution, and cleanup.


Core Features

  • Component Orchestration: Unifies agent, pipeline, and conversation flow components.
  • Lifecycle Management: Handles session start, execution, and cleanup.

Constructor Parameters

AgentSession(
    agent: Agent,
    pipeline: Pipeline,
    conversation_flow: Optional[ConversationFlow] = None,
    wake_up: Optional[int] = None
)
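
As a sketch, a fully configured session might look like the following; MyAgent, my_pipeline, and MyConversationFlow are hypothetical placeholders for your own Agent, Pipeline, and ConversationFlow implementations:

# Sketch only: MyAgent, my_pipeline, and MyConversationFlow stand in for
# your own Agent, Pipeline, and ConversationFlow instances.
session = AgentSession(
    agent=MyAgent(),
    pipeline=my_pipeline,
    conversation_flow=MyConversationFlow(),  # optional
    wake_up=30  # optional: fire the wake-up callback after 30s of inactivity
)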

Wake-Up Call

The wake-up call automatically triggers an action when the user has been inactive for a specified period of time, helping maintain engagement.

main.py
# Configure the wake-up timer
session = AgentSession(
    agent=MyAgent(),
    pipeline=pipeline,
    wake_up=10  # Trigger after 10 seconds of inactivity
)

# Set the callback function
async def on_wake_up():
    await session.say("Are you still there? How can I help?")

session.on_wake_up = on_wake_up
note

Important: If a wake_up time is provided, you must set a callback function before starting the session. If no wake_up time is specified, no timer or callback will be activated.
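
A minimal sketch of that ordering, reusing the session and on_wake_up callback from the example above:

# Assign the wake-up callback first...
session.on_wake_up = on_wake_up
# ...and only then start the session, so the inactivity timer has a callback to fire.
await session.start()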

Basic Usage

To get an agent running, you initialize an AgentSession with your custom Agent and a configured Pipeline. The session handles the underlying connection and data flow.

Example Implementation:

main.py
from videosdk.agents import AgentSession, Agent, WorkerJob, JobContext, RoomOptions
from videosdk.plugins.openai import OpenAIRealtime
from videosdk.agents import RealTimePipeline

class MyAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful meeting assistant.")

    async def on_enter(self):
        await self.session.say("Hello! How can I help you today?")

async def start_session(ctx: JobContext):
    model = OpenAIRealtime(model="gpt-4o-realtime-preview")
    pipeline = RealTimePipeline(model=model)

    session = AgentSession(
        agent=MyAgent(),
        pipeline=pipeline
    )

    await ctx.connect()
    await session.start()
    # Session runs until manually stopped or the meeting ends

def make_context():
    return JobContext(
        room_options=RoomOptions(
            room_id="your-room-id",
            auth_token="your-auth-token",
            name="Assistant Bot"
        )
    )

if __name__ == "__main__":
    job = WorkerJob(entrypoint=start_session, jobctx=make_context)
    job.start()

Development and Testing Features

The AgentSession supports several modes for development, testing, and user engagement:

Playground Mode

Playground mode provides a web-based interface for testing your agent without building a separate client application.

Usage

To activate playground mode, set playground=True in the RoomOptions passed to your JobContext.

main.py
from videosdk.agents import RoomOptions, JobContext, WorkerJob

async def entrypoint(ctx: JobContext):
    # Your agent implementation here
    # This is where you create your pipeline, agent, and session
    pass

def make_context() -> JobContext:
    room_options = RoomOptions(
        room_id="<meeting_id>",
        name="Test Agent",
        playground=True  # Enable playground mode
    )
    return JobContext(room_options=room_options)

if __name__ == "__main__":
    job = WorkerJob(entrypoint=entrypoint, jobctx=make_context)
    job.start()

When enabled, the playground URL is automatically displayed in your terminal for easy access.

note

Note: Playground mode is designed for development and testing purposes. For production deployments, ensure playground mode is disabled to maintain security and performance.

Console Mode

Console mode allows you to test your agent directly in the terminal using your microphone and speakers, without joining a VideoSDK meeting.

Usage

To use console mode, simply add the console argument when running your agent script:

python main.py console

The console will display:

  • Agent speech output
  • User speech input
  • Various latency metrics (STT, TTS, LLM, EOU)
  • Pipeline processing information

This flexibility allows you to use the same agent code for both development and production environments.
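
For example, assuming your entrypoint lives in main.py as in the snippets above, the same script can be launched either way:

# Meeting mode: joins the meeting described by your RoomOptions
python main.py

# Console mode: runs the agent in the terminal with your microphone and speakers
python main.py console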

Session Lifecycle Management

The AgentSession provides methods to control the agent's presence and behavior in the meeting.

Example of Managing the Lifecycle:

main.py
import asyncio
from videosdk.agents import AgentSession, Agent, WorkerJob, JobContext, RoomOptions
from videosdk.plugins.openai import OpenAIRealtime
from videosdk.agents import RealTimePipeline

class MyAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful meeting assistant.")

    # LIFECYCLE: Agent entry point - called when the session starts
    async def on_enter(self):
        await self.session.say("Hello! How can I help you today?")

    # LIFECYCLE: Agent exit point - called when the session ends
    async def on_exit(self):
        print("Agent is leaving the session")

async def run_agent_session(ctx: JobContext):
    # LIFECYCLE STAGE 1: Session Creation
    model = OpenAIRealtime(model="gpt-4o-realtime-preview")
    pipeline = RealTimePipeline(model=model)

    session = AgentSession(agent=MyAgent(), pipeline=pipeline)

    try:
        # LIFECYCLE STAGE 2: Connection Establishment
        await ctx.connect()

        # LIFECYCLE STAGE 3: Session Start
        await session.start()

        # LIFECYCLE STAGE 4: Session Running
        await asyncio.Event().wait()

    finally:
        # LIFECYCLE STAGE 5: Session Cleanup
        await session.close()

        # LIFECYCLE STAGE 6: Context Shutdown
        await ctx.shutdown()

# LIFECYCLE STAGE 0: Context Creation
def make_context() -> JobContext:
    room_options = RoomOptions(room_id="your-room-id", auth_token="your-token")
    return JobContext(room_options=room_options)

if __name__ == "__main__":
    # LIFECYCLE ORCHESTRATION: Worker Job Management
    # Creates and starts the worker job that manages the entire lifecycle
    job = WorkerJob(entrypoint=run_agent_session, jobctx=make_context)
    job.start()

Examples - Try Out Yourself

We have examples to get you started. Go ahead, try them out, talk to the agent, and customize them according to your needs.
