Create Voice Agent

The Create Voice Agent API allows you to create and configure AI voice agents with comprehensive settings including voice configuration, speech-to-text, LLM selection, and advanced call handling features.

API Endpoint

POST /create-agent
Content-Type: application/json
Authentication: Required (Token parameter)

Request Body

{
  "agent_name": "Customer Support Agent",
  "description": "Handles customer inquiries and support requests",
  "prompt": "You are a helpful customer support agent. Answer questions politely and professionally.",
  "timezone": "America/New_York",
  "greeting": "Hello! Thank you for calling. How can I assist you today?",
  "session_data_webhook": "https://www.tryunleashx.com/webhooks/session-data",
  "voice": {
    "provider": "elevenlabs",
    "voice_id": "RXe6OFmxoC0nlSWpuCDy",
    "model": "eleven_turbo_v2_5",
    "settings": {
      "stability": 0.5,
      "voice_style": 1,
      "speed": 1.0,
      "speaker_boost": true,
      "similarity_boost": 0.75,
      "tone": "professional",
      "style": "classic",
      "instruction_sensitivity": "medium"
    }
  },
  "speech_to_text": {
    "provider": "deepgram",
    "model": "nova-2",
    "language": "english"
  },
  "llm": {
    "llm": "gpt-4o",
    "model": "gpt-4o"
  },
  "configurations": {
    "confidence_threshold": 0.8,
    "do_not_call_detection": true,
    "agent_terminate_call": {
      "enabled": true,
      "instruction": "End the call politely when the conversation is complete",
      "message": "Thank you for calling. Have a great day!"
    },
    "inactivity_handling": {
      "enabled": true,
      "idle_time": 30,
      "message": "Are you still there? Let me know if you need any help."
    },
    "interruption": {
      "enabled": true,
      "value": 3
    },
    "voicemail": {
      "enabled": true,
      "message": "Hello, this is a message from Customer Support. Please call us back at your convenience."
    }
  }
}

Required Fields

Field	Type	Description
`agent_name`	string	Name of the voice agent (required)
`prompt`	string	System prompt/instructions that define the agent’s behavior and personality (required)

The voice object is optional — include it to configure TTS provider, voice ID, and voice settings.
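As a minimal sketch in Python, the required-field rule can be checked on the client before a request is sent. The helper name `build_minimal_payload` is hypothetical and not part of the API; the field names come from this page, and the `voice_id` check follows the Voice Configuration table, which marks it as required whenever a voice object is supplied.

```python
import json

REQUIRED_FIELDS = ("agent_name", "prompt")  # per the Required Fields table


def build_minimal_payload(agent_name, prompt, voice=None):
    """Return a request body with the required fields and an optional voice object.

    Hypothetical helper for illustration only; not part of the API.
    """
    payload = {"agent_name": agent_name, "prompt": prompt}
    missing = [field for field in REQUIRED_FIELDS if not payload[field]]
    if missing:
        raise ValueError("Missing required field(s): " + ", ".join(missing))
    if voice is not None:
        # voice_id is listed as required inside the optional voice object
        if not voice.get("voice_id"):
            raise ValueError("voice.voice_id is required when voice is provided")
        payload["voice"] = voice
    return payload


body = build_minimal_payload("Simple Agent", "You are a helpful assistant.")
print(json.dumps(body, indent=2))
```

Checking locally does not replace the server's own validation; it only surfaces the most common 400 error earlier.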

Optional Fields

Basic Information

Field	Type	Description	Default
`description`	string	Description of the agent’s purpose	Empty string
`timezone`	string	Timezone for the agent (e.g., “America/New_York”, “Europe/London”)	UTC
`greeting`	string	The agent’s first message when the call starts	None
`session_data_webhook`	string	Webhook URL to receive end-of-session data	None

Voice Configuration

The voice object is optional and, if provided, contains the following properties:

Property	Type	Required	Description
`provider`	string	No	Voice provider: `elevenlabs`, `openai`, `deepgram`, `sarvam`
`voice_id`	string	Yes	Unique identifier for the voice
`model`	string	No	TTS model to use (see Voice Models below)
`settings`	object	No	Voice settings configuration (see Voice Settings below)

Voice Providers

Provider	Value	Description
ElevenLabs	`elevenlabs`	High-quality AI voice synthesis with natural-sounding voices and emotional range
OpenAI	`openai`	Advanced text-to-speech with multiple voice options
Deepgram	`deepgram`	Real-time speech recognition and voice synthesis
Sarvam	`sarvam`	Multilingual voice synthesis optimized for Indian languages

Voice Models

ElevenLabs Models

Model	Value	Description
Turbo v2.5	`eleven_turbo_v2_5`	Latest high-speed model with low latency (Recommended)
Multilingual v2	`eleven_multilingual_v2`	High-quality multilingual voice synthesis
Monolingual v1	`eleven_monolingual_v1`	English-only optimized model

OpenAI Models

Model	Value	Description
TTS 1	`tts-1`	Standard quality, faster generation
TTS 1 HD	`tts-1-hd`	High definition, better quality

Voice Settings

The settings object contains fine-tuning parameters for voice output:

Property	Type	Range	Description	Default
`stability`	number	0.0 - 1.0	Controls voice consistency. Higher = more stable, Lower = more expressive	0.5
`voice_style`	number	0 - 100	Style intensity for the voice	0
`speed`	number	0.5 - 2.0	Speech speed multiplier	1.0
`speaker_boost`	boolean	true/false	Enhances speaker characteristics	true
`similarity_boost`	number	0.0 - 1.0	How closely to match original voice	0.75
`tone`	string	-	Voice tone: `professional`, `friendly`, `neutral`, `enthusiastic`	None
`style`	string	-	Speaking style: `classic`, `conversational`, `narrative`	`classic`
`instruction_sensitivity`	string	-	How strictly to follow instructions: `low`, `medium`, `high`	`medium`
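The ranges and allowed values in the table above can be checked on the client before sending a request. The following Python sketch is illustrative only: `check_voice_settings` is a hypothetical helper, and treating the range endpoints as inclusive is an assumption, since the table does not say how the server handles boundary values.

```python
# Numeric ranges and allowed string values copied from the Voice Settings table.
NUMERIC_RANGES = {
    "stability": (0.0, 1.0),
    "voice_style": (0, 100),
    "speed": (0.5, 2.0),
    "similarity_boost": (0.0, 1.0),
}
ALLOWED_VALUES = {
    "tone": {"professional", "friendly", "neutral", "enthusiastic"},
    "style": {"classic", "conversational", "narrative"},
    "instruction_sensitivity": {"low", "medium", "high"},
}


def check_voice_settings(settings):
    """Return a list of problems found in a voice settings dict (empty if none)."""
    problems = []
    for key, value in settings.items():
        if key in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[key]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            # Assumption: both endpoints of each range are accepted.
            if not is_number or not low <= value <= high:
                problems.append(f"{key} must be a number between {low} and {high}")
        elif key in ALLOWED_VALUES:
            if not isinstance(value, str) or value not in ALLOWED_VALUES[key]:
                problems.append(f"{key} must be one of {sorted(ALLOWED_VALUES[key])}")
        elif key == "speaker_boost" and not isinstance(value, bool):
            problems.append("speaker_boost must be true or false")
    return problems


print(check_voice_settings({"stability": 0.5, "speed": 3.0, "tone": "professional"}))
```

Keys not listed in the table are ignored by this sketch rather than rejected.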

Speech-to-Text Configuration

The speech_to_text object configures the transcription service. For the language field, use full language names rather than codes (for example, english, hindi, multi, or spanish). Supported values include:

english, hindi, multi, albanian, arabic, armenian, azerbaijani, belarusian, bengali, bosnian, bulgarian, catalan, chinese, croatian, czech, danish, dutch, english_australia, english_india, english_new_zealand, english_uk, english_us, english_spanish, estonian, finnish, french, galician, georgian, german, german_switzerland, greek, gujarati, haitian_creole, hausa, hebrew, afrikaans, hungarian, icelandic, indonesian, italian, japanese, javanese, kannada, kazakh, khmer, korean, latvian, lithuanian, macedonian, malay, malayalam, maori, marathi, nepali, norwegian, persian, polish, portuguese, portuguese_brazil, punjabi, romanian, russian, serbian, shona, slovak, slovenian, somali, spanish, spanish_latin_america, sundanese, swahili, swedish, tagalog, tamil, tajik, telugu, thai, tswana, turkish, ukrainian, urdu, vietnamese, welsh.

The speech_to_text object contains the following properties:

Property	Type	Required	Description
`provider`	string	Yes	STT provider (see providers below)
`model`	string	Yes	Model to use (see models below)
`language`	string	Yes	Language name (see languages above)

STT Providers and Models

Deepgram (Provider: `deepgram`)

Model	Value	Description	Use Case
Nova 2	`nova-2`	General purpose model	Default choice for most use cases
Nova 2 General	`nova-2-general`	General purpose transcription	Versatile transcription
Nova 2 Meeting	`nova-2-meeting`	Optimized for meetings	Conference calls, meetings
Nova 2 Phone Call	`nova-2-phonecall`	Optimized for phone calls	Phone conversations (Recommended)
Nova 2 Finance	`nova-2-finance`	Optimized for finance	Banking, financial services
Nova 2 Conversational AI	`nova-2-conversationalai`	Optimized for conversational AI	AI assistants, chatbots
Nova 2 Video	`nova-2-video`	Optimized for video	Video content transcription
Nova 2 Medical	`nova-2-medical`	Optimized for medical	Healthcare conversations
Nova 2 Drivethru	`nova-2-drivethru`	Optimized for drive-thru	Drive-thru scenarios
Nova 2 Automotive	`nova-2-automotive`	Optimized for automotive	Car environments
Nova 2 Legal	`nova-2-legal`	Optimized for legal	Legal conversations
Nova 2 Government	`nova-2-government`	Optimized for government	Government services
Nova 2 Enterprise	`nova-2-enterprise`	Optimized for enterprise	Enterprise applications
Nova 3	`nova-3`	Latest general purpose model	Most accurate, latest technology

Gladia (Provider: `gladia`)

Model	Value	Description
Gladia	`gladia`	High-accuracy multilingual transcription

Sarvam (Provider: `sarvam`)

Model	Value	Description
Sarvam	`sarvam`	Optimized for Indian languages
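A small Python sketch of how a client might assemble the speech_to_text object from the provider and model tables above. The helper `build_speech_to_text` is hypothetical; the provider and model values are copied from this page, and the language is passed through without validation.

```python
# Provider -> model values, copied from the STT tables above.
STT_MODELS = {
    "deepgram": {
        "nova-2", "nova-2-general", "nova-2-meeting", "nova-2-phonecall",
        "nova-2-finance", "nova-2-conversationalai", "nova-2-video",
        "nova-2-medical", "nova-2-drivethru", "nova-2-automotive",
        "nova-2-legal", "nova-2-government", "nova-2-enterprise", "nova-3",
    },
    "gladia": {"gladia"},
    "sarvam": {"sarvam"},
}


def build_speech_to_text(provider, model, language):
    """Return a speech_to_text object after checking the provider/model pair."""
    if provider not in STT_MODELS:
        raise ValueError(f"Unknown STT provider: {provider}")
    if model not in STT_MODELS[provider]:
        raise ValueError(f"Model {model} is not listed for provider {provider}")
    # The language is a full name such as "english"; it is not validated here.
    return {"provider": provider, "model": model, "language": language}


print(build_speech_to_text("deepgram", "nova-2-phonecall", "english"))
```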

LLM Configuration

The llm object configures the language model:

Property	Type	Required	Description
`llm`	string	Yes	LLM provider and model (see options below)
`model`	string	Yes	Model name (typically same as `llm`)

Available LLM Models

OpenAI Models

Model	Value	Description	Use Case
GPT-4o	`gpt-4o`	Most capable model, multimodal	Complex reasoning, best quality (Recommended)
GPT-4o Mini	`gpt-4o-mini`	Smaller, faster, cost-effective	Fast responses, simpler tasks
GPT-4 Turbo	`gpt-4-turbo`	High performance GPT-4	Advanced reasoning
GPT-4.1	`gpt-4.1`	Latest GPT-4 variant	Enhanced capabilities
GPT-4.1 Mini	`gpt-4.1-mini`	Compact GPT-4.1	Efficient processing
GPT-4.1 Nano	`gpt-4.1-nano`	Ultra-fast GPT-4.1	Ultra-low latency
GPT-3.5 Turbo	`gpt-3.5-turbo`	Fast and cost-effective	Simple conversations

OpenAI Realtime Models

Model	Value	Description
GPT-4o Realtime	`gpt-4o-realtime-preview`	Real-time audio processing
GPT-4o Mini Realtime	`gpt-4o-mini-realtime-preview`	Faster real-time processing

Meta LLaMA Models

Model	Value	Description	Use Case
LLaMA 3.1 405B	`llama-3-1-405b`	Largest, most capable	Complex tasks, high accuracy
LLaMA 3.1 70B	`llama-3-1-70b`	Balanced performance	Good quality, reasonable speed
LLaMA 3.1 8B	`llama-3-1-8b`	Fast and efficient	Quick responses
LLaMA 3 70B	`llama-3-70b`	Previous generation	Reliable performance

Mistral Models

Model	Value	Description
Mistral Large 2407	`mistral-large-2407`	High-performance European model

Other Models

Model	Value	Description
L3.1 70B Euryale v2.2	`l3.1-70b-euryale-v2.2`	Fine-tuned LLaMA variant
DeepSeek v3	`deepseek-v3`	Advanced reasoning model

Configurations

The configurations object contains advanced call handling settings:

Confidence Threshold

Property	Type	Range	Description	Default
`confidence_threshold`	number	0.0 - 1.0	Minimum confidence for speech recognition	0.8

Do Not Call Detection

Property	Type	Description	Default
`do_not_call_detection`	boolean	Detect and respect “do not call” indicators	false

Agent Terminate Call

Configuration for when the agent can end calls autonomously:

Property	Type	Description	Default
`enabled`	boolean	Allow agent to terminate calls	false
`instruction`	string	Instructions for when to end calls	None
`message`	string	Message to say before ending call	None

Example:

{
  "enabled": true,
  "instruction": "End the call when the customer says goodbye or has no more questions",
  "message": "Thank you for calling. Have a great day!"
}

Inactivity Handling

Configuration for handling user inactivity:

Property	Type	Description	Default
`enabled`	boolean	Enable inactivity detection	false
`idle_time`	number	Seconds of silence before prompting (5-120)	30
`message`	string	Message to say after idle time	None

Example:

{
  "enabled": true,
  "idle_time": 30,
  "message": "Are you still there? Let me know if you need any help."
}

Interruption Settings

Configuration for handling user interruptions:

Property	Type	Description	Default
`enabled`	boolean	Allow users to interrupt the agent	true
`value`	number	Interruption sensitivity (1-5, higher = more sensitive)	3

Sensitivity Levels:

1 - Very low (agent rarely gets interrupted)
2 - Low
3 - Medium (Recommended)
4 - High
5 - Very high (agent easily interrupted)

Voicemail Handling

Configuration for voicemail detection and handling:

Property	Type	Description	Default
`enabled`	boolean	Enable voicemail detection	false
`message`	string	Message to leave if voicemail detected	None

Example:

{
  "enabled": true,
  "message": "Hello, this is Customer Support calling. Please call us back at 1-800-123-4567. Thank you!"
}
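The numeric limits described in this section (confidence threshold 0.0 to 1.0, idle time 5 to 120 seconds, interruption sensitivity 1 to 5) can be enforced when building the configurations object. This Python sketch uses a hypothetical helper, `build_configurations`, and assumes the limits are inclusive; it covers only three of the settings to stay short.

```python
def build_configurations(confidence_threshold=0.8, idle_time=30, interruption_value=3):
    """Return a partial configurations object after checking documented limits.

    Hypothetical helper; defaults mirror the defaults listed in this section.
    """
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    if not 5 <= idle_time <= 120:
        raise ValueError("idle_time must be between 5 and 120 seconds")
    if interruption_value not in range(1, 6):
        raise ValueError("interruption value must be an integer from 1 to 5")
    return {
        "confidence_threshold": confidence_threshold,
        "inactivity_handling": {
            "enabled": True,
            "idle_time": idle_time,
            "message": "Are you still there? Let me know if you need any help.",
        },
        "interruption": {"enabled": True, "value": interruption_value},
    }


configurations = build_configurations(idle_time=45, interruption_value=2)
print(configurations["inactivity_handling"]["idle_time"])
```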

Response

Success Response

Status Code: 200 OK

{
  "id": "agent_abc123xyz",
  "agent_name": "Customer Support Agent",
  "config": {
    "prompt": "You are a helpful customer support agent...",
    "voice": {
      "provider": "elevenlabs",
      "voice_id": "RXe6OFmxoC0nlSWpuCDy",
      "model": "eleven_turbo_v2_5"
    },
    "speech_to_text": {
      "provider": "deepgram",
      "model": "nova-2",
      "language": "english"
    },
    "llm": {
      "llm": "gpt-4o",
      "model": "gpt-4o"
    }
  },
  "created_at": 1706745600
}
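A short Python sketch of reading the success response. It assumes `created_at` is a Unix timestamp in seconds, which the sample value suggests but this page does not state; the response text below is abbreviated from the example above.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the sample success response.
response_text = """
{
  "id": "agent_abc123xyz",
  "agent_name": "Customer Support Agent",
  "created_at": 1706745600
}
"""

agent = json.loads(response_text)
# Assumption: created_at is seconds since the Unix epoch, interpreted as UTC.
created = datetime.fromtimestamp(agent["created_at"], tz=timezone.utc)
print(agent["id"], created.isoformat())
```

Store the returned `id`; other agent endpoints, such as update and delete, presumably need it to identify the agent.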

Error Responses

400 - Bad Request

{
  "detail": "Invalid request body. Missing required field: agent_name"
}

Common causes:

Missing required fields (agent_name or prompt)
Invalid data types
Invalid provider or model values

401 - Unauthorized

{
  "detail": "Invalid authentication credentials"
}

Common causes:

Missing authorization header or token parameter
Invalid or expired API key
Insufficient permissions

422 - Validation Error

{
  "detail": [
    {
      "loc": ["body", "voice", "provider"],
      "msg": "Invalid voice provider. Must be one of: elevenlabs, openai, deepgram, sarvam",
      "type": "value_error"
    }
  ]
}

Common causes:

Invalid enum values (provider, model names)
Out of range values (stability, speed, confidence_threshold)
Invalid format (timezone, language names)

500 - Internal Server Error

{
  "detail": "Internal server error"
}
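In the samples above, `detail` is a plain string for 400, 401, and 500 responses and a list of objects for 422. A client can normalize both shapes into one message. The following Python sketch uses a hypothetical helper, `describe_error`, and assumes every error body follows one of those two shapes.

```python
def describe_error(status_code, body):
    """Turn an error response body into a single readable message."""
    detail = body.get("detail")
    if isinstance(detail, list):
        # 422 shape: each item carries a location path and a message.
        parts = []
        for item in detail:
            location = ".".join(str(part) for part in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', '')}")
        message = "; ".join(parts)
    else:
        # 400 / 401 / 500 shape: detail is a plain string.
        message = str(detail)
    return f"HTTP {status_code}: {message}"


print(describe_error(401, {"detail": "Invalid authentication credentials"}))
```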

Example Requests

Minimal Request

curl -X POST https://api.yourdomain.com/create-agent \
  -H "Content-Type: application/json" \
  -H "token: your_api_key_here" \
  -d '{
    "agent_name": "Simple Agent",
    "prompt": "You are a helpful assistant.",
    "voice": {
      "provider": "elevenlabs",
      "voice_id": "RXe6OFmxoC0nlSWpuCDy"
    }
  }'

Complete Request with All Features

curl -X POST https://api.yourdomain.com/create-agent \
  -H "Content-Type: application/json" \
  -H "token: your_api_key_here" \
  -d '{
    "agent_name": "Advanced Support Agent",
    "description": "Full-featured customer support agent",
    "prompt": "You are an experienced customer support agent. Be helpful, professional, and empathetic.",
    "timezone": "America/New_York",
    "greeting": "Hello! Thank you for calling. How can I help you today?",
    "session_data_webhook": "https://www.tryunleashx.com/webhooks/session-data",
    "voice": {
      "provider": "elevenlabs",
      "voice_id": "RXe6OFmxoC0nlSWpuCDy",
      "model": "eleven_turbo_v2_5",
      "settings": {
        "stability": 0.5,
        "voice_style": 1,
        "speed": 1.0,
        "speaker_boost": true,
        "similarity_boost": 0.75,
        "tone": "professional",
        "style": "conversational",
        "instruction_sensitivity": "medium"
      }
    },
    "speech_to_text": {
      "provider": "deepgram",
      "model": "nova-2-phonecall",
      "language": "english"
    },
    "llm": {
      "llm": "gpt-4o",
      "model": "gpt-4o"
    },
    "configurations": {
      "confidence_threshold": 0.8,
      "do_not_call_detection": true,
      "agent_terminate_call": {
        "enabled": true,
        "instruction": "End call politely when conversation is complete",
        "message": "Thank you for calling. Have a great day!"
      },
      "inactivity_handling": {
        "enabled": true,
        "idle_time": 30,
        "message": "Are you still there? Let me know if you need help."
      },
      "interruption": {
        "enabled": true,
        "value": 3
      },
      "voicemail": {
        "enabled": true,
        "message": "Hello, this is Customer Support. Please call us back. Thank you!"
      }
    }
  }'
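The same request can be prepared from Python using only the standard library. This is a sketch, not an official client: `make_create_agent_request` is a hypothetical helper, the host is the placeholder from the curl examples, and the `token` header mirrors those examples. The actual send is left commented out so the snippet runs without network access.

```python
import json
import urllib.request

API_URL = "https://api.yourdomain.com/create-agent"  # placeholder host from the curl examples


def make_create_agent_request(api_key, payload):
    """Build (but do not send) a POST request for the create-agent endpoint."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={"Content-Type": "application/json", "token": api_key},
        method="POST",
    )


request = make_create_agent_request(
    "your_api_key_here",
    {"agent_name": "Simple Agent", "prompt": "You are a helpful assistant."},
)
print(request.get_method(), request.full_url)

# To send the request, uncomment the lines below:
# with urllib.request.urlopen(request) as response:
#     agent = json.load(response)
```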

Important Notes

Required Fields: Only agent_name and prompt are required. The voice object is optional — include voice (with provider and voice_id) when you want to configure TTS for the agent. All other fields are optional.
Voice IDs: Get available voice IDs from the List Voices API.
Webhooks: If you provide a session_data_webhook, ensure your endpoint can handle POST requests with session data.
Timezones: Use standard timezone strings (e.g., “America/New_York”, “Europe/London”, “Asia/Tokyo”).
Language Names: Use full language names (e.g., english, hindi, spanish) or region-specific variants (e.g., english_us, english_uk) as shown in the Speech-to-Text section above.
Model Compatibility: Ensure the voice model is compatible with your chosen provider. For example, eleven_turbo_v2_5 only works with ElevenLabs.
Rate Limits: API calls are subject to rate limiting based on your plan. See pricing documentation for details.
Testing: After creating an agent, test it thoroughly before using it in production. Use the Make Call API to test your agent.
Phone Numbers: Attach a phone number to the agent before placing calls with it.

Related Endpoints

List Voices - Get available voice IDs
Update Voice Agent - Modify agent settings
List Voice Agents - View all agents
Delete Voice Agent - Remove an agent
Make Call - Test your agent with a call
