Releases: livekit/agents
livekit-agents@1.5.15
What's Changed
- (openai realtime): add status_details to incomplete response logs by @tinalenguyen in #5873
- feat(cartesia): add ink-2 stt by @charlotte-zhuang in #5827
- fix(ipc): bound AgentSession.aclose() during job shutdown by @longcw in #5875
- fix(amd): defer no-speech timer until SIP call is answered by @chenghao-mou in #5848
- chore(tests): update tests to use Inference whenever possible AGT-2304 by @chenghao-mou in #5632
- fix(llm): sort function tools to make order an invariant by @u9g in #5884
- fix(deps): relax bithuman pin to <3 for SDK 2.x compatibility by @sgu-bithuman in #5882
- fix(aws): flatten tool blocks when toolConfig is omitted by @u9g in #5850
- fix(llm): serialize all provider tools for the Responses API + log server-side tool execution by @u9g in #5865
- fix(llm): make to_responses_fnc_ctx.provider_tool_type optional by @toubatbrian in #5892
- internal(voice): wire DebugMessage over remote-session wire by @toubatbrian in #5855
- Add respeecher tts plugin by @mitrushchienkova in #3233
- feat(plugins-google): add cached_content option for explicit context caching by @kamil-bidus in #5675
- (google llm): ruff and add cache warnings by @tinalenguyen in #5893
- feat(smallestai): update TTS plugin for Lightning v3.1 Pro and WebSocket streaming by @harshitajain165 in #5799
- fix(openai realtime): honor OPENAI_BASE_URL env var fallback by @chenghao-mou in #5895
- livekit-agents@1.5.15 by @github-actions[bot] in #5896
New Contributors
- @sgu-bithuman made their first contribution in #5882
- @mitrushchienkova made their first contribution in #3233
- @kamil-bidus made their first contribution in #5675
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.14...livekit-agents@1.5.15
livekit-agents@1.5.14
What's Changed
- (openai realtime): add gpt-realtime-2 model str by @tinalenguyen in #5791
- (healthcare example): remove filesearch and pdf by @tinalenguyen in #5795
- chore(worker): update worker warnings AGT-2909 by @chenghao-mou in #5771
- fix(voice): race between flush and clear_buffer on interrupt leaks unplayed transcript by @longcw in #5798
- fix(vad): add support for vad reset directly without stream close by @chenghao-mou in #5687
- feat(inworld tts): add delivery_mode parameter for inworld-tts-2 by @chrisackermann in #5801
- fix(voice): wait for end-of-turn task when waiting for user inactivity by @longcw in #5792
- Adding GnaniAI STT plugin by @Gnani-AI-Mintlify in #5769
- (gnani): fix py.typed and remove changesets by @tinalenguyen in #5816
- fix(voice): propagate ChatMessage.interrupted through proto serializer by @toubatbrian in #5824
- fix(anthropic): recreate stream on retry by @he-yufeng in #5820
- fix(llm): convert per-turn instructions on the very first turn too by @theomonnom in #5828
- feat(background_audio): add fade_in / fade_out to AudioConfig by @theomonnom in #5832
- fix(plugins/sarvam): thread language_probability into SpeechData.confidence by @hashirventhodi in #5830
- fix(lemonslice): bind avatar audio output before the upstream session HTTP call by @theomonnom in #5837
- feat(voice/avatar): kick avatar participant on aclose + wait_for_join helper by @theomonnom in #5836
- feat(examples): add LemonSlice avatar with switchable personas by @theomonnom in #5834
- chore(playground): list frontdesk first by @theomonnom in #5838
- fix: 4xx errors should not be retryable by @davidzhao in #5831
- livekit-agents@1.5.13 by @github-actions[bot] in #5835
- feat(assemblyai): add continuous_partials and interruption_delay stre… by @dlange-aai in #5819
- ci(deploy-examples): include avatar in matrix + wire LEMONSLICE_API_KEY by @theomonnom in #5839
- docs: clarify MCP support bullet by @scosemicolon in #5822
- fix(core): replace frame drop with silence frame AGT-2914 by @chenghao-mou in #5815
- feat(plugins-soniox): surface per-run language segments end-to-end by @MSameerAbbas in #5730
- use stt timestamps as last speaking time by @chenghao-mou in #5672
- docs: fix Gemini model version in avatar example README by @detail-app[bot] in #5842
- handle leaked chat-template tokens in function call args by @davidzhao in #5840
- fix(voice): restrict stt pipeline reuse to default stt_node by @longcw in #5803
- feat(google): integration for AI Platform LLMs by @davidzhao in #5843
- fix: return function argument errors (ToolError) to LLM by @davidzhao in #5846
- refactor(llm): unified ToolError contract for tool arg validation by @longcw in #5807
- feat(openai): stream input_audio_transcription delta events by @longcw in #5859
- fix(voice): block on_user_turn_exceeded during agent handoff by @longcw in #5858
- fix(voice): reset user turn tracker on clear_user_turn by @longcw in #5857
- fix(elevenlabs): always close TTS stream context on cancellation by @longcw in #5845
- Update download-files deprecation message by @bcherry in #5781
- fix(llm): preserve Field() constraints on function tool arguments by @theomonnom in #5861
- fix(soniox): surface STT server errors by @he-yufeng in #5864
- Fix ElevenLabs websocket context id handling by @flynn-hamming in #5813
- livekit-agents@1.5.14 by @github-actions[bot] in #5870
New Contributors
- @Gnani-AI-Mintlify made their first contribution in #5769
- @he-yufeng made their first contribution in #5820
- @hashirventhodi made their first contribution in #5830
- @scosemicolon made their first contribution in #5822
- @flynn-hamming made their first contribution in #5813
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.12...livekit-agents@1.5.14
livekit-agents@1.5.12
What's Changed
- feat(examples): publish examples manifest to agents-jukebox by @theomonnom in #5751
- deploy-examples: open a PR (with auto-merge) for the manifest refresh by @theomonnom in #5760
- frontdesk: fix tz crash on deploy + survive failed startup by @theomonnom in #5761
- deploy-examples: fix manifest publishing (\u escapes + --force push) by @theomonnom in #5762
- chore(deps): update actions/setup-python action to v6 by @renovate[bot] in #5759
- deprecate mcp_servers param on Agent and AgentSession by @longcw in #5667
- feat(voice): add UserTurnLimitOptions to interrupt long user speech by @longcw in #5492
- feat(avatar): add AvatarMetrics for join latency and playback latency by @longcw in #5581
- docs(inference): fix stale RPC method names and LLM swap description by @detail-app[bot] in #5764
- (speechmatics stt): update default mode to external by @tinalenguyen in #5765
- docs: fix misquoted greeting punctuation in frontdesk README by @detail-app[bot] in #5746
- fix(cerebras): require openai>=2.16.0 for FinalRequestOptions.content by @u9g in #5773
- livekit-agents@1.5.11 by @github-actions[bot] in #5774
- fix(soniox): plugin reliability fixes by @mihafabcic-soniox in #5770
- (gemini): add gemini-3.5-flash model str by @tinalenguyen in #5775
- Add Perplexity Agent API (Responses) LLM by @jliounis in #5772
- examples(inference): editable system prompt + "Open in Builder" + Sonic-3 / 4.1-mini defaults by @JackNDwyer in #5776
- feat(realtime): support multi-message generation per response by @longcw in #5763
- feat(openai stt): support gpt-realtime-whisper by @longcw in #5779
- feat(rime): add time_scale_factor parameter for arcana, mistv3, and coda by @MaCaki in #5778
- (perplexity responses): update default model by @tinalenguyen in #5780
- fix(interruption): incorrect guard skips true interruptions by @chenghao-mou in #5787
- docs(perplexity): fix stale default model in responses.LLM example by @detail-app[bot] in #5785
- add new bulbul:v3 speaker voices by @amlesh-dev in #5782
- livekit-agents@1.5.12 by @github-actions[bot] in #5789
New Contributors
- @mihafabcic-soniox made their first contribution in #5770
- @JackNDwyer made their first contribution in #5776
- @amlesh-dev made their first contribution in #5782
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.10...livekit-agents@1.5.12
livekit-agents@1.5.10
What's Changed
- fix: surface Deepgram TTS websocket errors by @nightcityblade in #5728
- fix(voice): cancel realtime generation when speech is interrupted by @longcw in #5703
- improve should_discard check by @chenghao-mou in #5676
- (examples revamp): add example walkthroughs by @tinalenguyen in #5613
- chore(deps): update dependency langchain-core to v1.3.3 [security] by @renovate[bot] in #5693
- (inference stt): add speechmatics by @tinalenguyen in #5740
- fix(ipc): run shutdown callbacks when entrypoint raises by @longcw in #5741
- fix(dep): pin bithuman by @chenghao-mou in #5743
- fix(barge-in): suppress session-level barge-in errors by @chenghao-mou in #5727
- feat: support python -m livekit.agents.download for asset fetching by @davidzhao in #5738
- feat(rime): add coda model support by @MaCaki in #5748
- (speechmatics + inference): add VAD by @tinalenguyen in #5750
- inference.LLM: add update_options for live model swaps by @theomonnom in #5757
- fix(deepgram): add per-message recv timeout to TTS WebSocket by @longcw in #5756
- livekit-agents@1.5.10 by @github-actions[bot] in #5758
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.9...livekit-agents@1.5.10
livekit-agents@1.5.9
Introducing Answering Machine Detection
An outbound call can reach a person, voicemail, an IVR menu, or a number that can't accept messages. Answering machine detection (AMD) listens to the start of the call, classifies it with an LLM, and returns a result so your agent can respond appropriately.
Read more about using AMD in our blog.
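As a rough illustration of "respond appropriately", the sketch below maps each of the four outcomes named above to a different follow-up action. It is plain Python and does not use the livekit-agents AMD API: `CallOutcome`, the handler functions, and `respond_to` are hypothetical names invented for this example, and the real result type and its fields may differ.

```python
from enum import Enum
from typing import Callable, Dict


class CallOutcome(Enum):
    """Hypothetical labels mirroring the four cases described above."""

    HUMAN = "human"
    VOICEMAIL = "voicemail"
    IVR = "ivr"
    UNAVAILABLE = "unavailable"


def greet_person() -> str:
    return "start the live conversation"


def leave_message() -> str:
    return "wait for the beep, then leave a short message"


def navigate_menu() -> str:
    return "navigate the menu before speaking to anyone"


def hang_up() -> str:
    return "end the call without leaving a message"


# One handler per outcome; a missing entry would raise KeyError in respond_to.
HANDLERS: Dict[CallOutcome, Callable[[], str]] = {
    CallOutcome.HUMAN: greet_person,
    CallOutcome.VOICEMAIL: leave_message,
    CallOutcome.IVR: navigate_menu,
    CallOutcome.UNAVAILABLE: hang_up,
}


def respond_to(outcome: CallOutcome) -> str:
    """Pick the follow-up action for a classified call."""
    return HANDLERS[outcome]()


if __name__ == "__main__":
    for outcome in CallOutcome:
        print(f"{outcome.value}: {respond_to(outcome)}")
```

A lookup table keeps the branching in one place, so adding a further outcome (for example an "uncertain" case) only needs one more enum member and one more handler.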
What's Changed
- fix(fishaudio): prevent TTS generation hang by @davidzhao in #5649
- add comments to agent side and inference side fallback adapters by @tmshapland in #5654
- fix(amd): fix negative zero in amd delay calculation by @chenghao-mou in #5650
- fix(async-toolset): use underscore in synthetic call_id by @longcw in #5656
- fix(console): set mock participant state to active by @longcw in #5645
- Update LemonSlice integration to use wait_playback_start by @jp-lemon in #5655
- Fix the memory leak of Google STT when no audio input. by @jmuk in #5591
- ci(examples): add separate file for workflow by @tinalenguyen in #5651
- fix: wording in missing required argument error by @carschandler in #5659
- chore(amd): add uncertain branch in example by @chenghao-mou in #5658
- docs(agents): document realtime capabilities by @shizhigu in #5598
- fix(elevenlabs): restore chunk_length_schedule in WS init payload by @IanSteno in #5006
- fix(voice): prevent scheduling deadlock when pipeline task crashes by @theomonnom in #5678
- (elevenlabs tts): add apply_language_text_normalization param by @tinalenguyen in #5679
- (gemini llm): add service tier param by @tinalenguyen in #5680
- Emit AgentConfigUpdate in OTLP session logs by @theomonnom in #5601
- fix(core): clean up variables when committing a user turn manually by @chenghao-mou in #5671
- (deepgram stt): add redact param by @tinalenguyen in #5692
- fix(agents): persist _speech_start_time across intra-turn VAD bursts by @AlessandroElyos in #5585
- fix: pass API key via header in Neuphonic and Murf WebSocket TTS by @u9g in #5691
- fix(core): make default user span time explicit by @chenghao-mou in #5699
- require CanManageAgentSession grant for remote session by @theomonnom in #5487
- test(examples/survey): add TaskGroup testing reference (DOCS-1225) by @kath0la in #5557
- fix(anthropic): raise default httpx read timeout for streaming; add configurable timeout param by @SuperMarioYL in #5529
- fix(examples): consistent CSV schema in survey_agent by @u9g in #5689
- Adding fine-grained VAD params to Sarvam saaras:v3 STT plugin by @dhruvladia-sarvam in #5563
- Update codec config, MIME mapping, and interruption handling by @dhruvladia-sarvam in #5561
- (gradium): update endpoint by @tinalenguyen in #5722
- Revert "require CanManageAgentSession grant for remote session (#5487)" by @chenghao-mou in #5714
- feat(gemini llm): add media resolution option to LLM and RealtimeModel by @csanz91 in #5712
- (inworld tts): add language param by @tinalenguyen in #5723
- fix(core): surface real http_context error from STT streams by @longcw in #5709
- docs: clarify function tools executed event pairing by @nightcityblade in #5701
- feat(workflows): expose dtmf and ringing_timeout on WarmTransferTask by @a-gasior in #5721
- fix(agents): await realtime auto tool reply in RunResult by @longcw in #5702
- chore(amd): update default models and default behavior by @chenghao-mou in #5713
- Add Perplexity LLM plugin by @jliounis in #5610
- fix: raise on unexpected ElevenLabs websocket close by @nightcityblade in #5729
- fix(agents): clear pending auto tool reply future on timeout by @u9g in #5725
- (azure + perplexity): bump versions by @tinalenguyen in #5731
- fix: do not republish tracks on reconnect by @davidzhao in #5698
- chore(amd): add default amd prediction log by @chenghao-mou in #5732
- feat(rime): add WebSocket streaming TTS support by @mcullan in #5663
- livekit-agents@1.5.9 by @github-actions[bot] in #5733
New Contributors
- @jmuk made their first contribution in #5591
- @shizhigu made their first contribution in #5598
- @SuperMarioYL made their first contribution in #5529
- @nightcityblade made their first contribution in #5701
- @a-gasior made their first contribution in #5721
- @jliounis made their first contribution in #5610
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.8...livekit-agents@1.5.9
livekit-agents@1.5.8
What's Changed
- feat(interruption): barge-in cooldown window for corrections by @chenghao-mou in #5269
- fix(amd): amd improvement (AGT-2777) by @chenghao-mou in #5584
- fix(warm_transfer): don't fall back to env var when sip_connection is set by @longcw in #5619
- fix(aws): wait for stream ready before sending audio start event by @lanazhang in #5626
- Fix Missing user message metrics (MetricsReport) due to early returns in _user_turn_completed_task and no initialization in on_end_of_turn by @hudson-worden in #5437
- fix(amd): missing stt start by @chenghao-mou in #5633
- fix: reduce overly eager call ending behavior by @davidzhao in #5630
- feat(fishaudio): use websocket API for faster inference by @davidzhao in #5629
- fix(observability): retry session recording upload by @paulwe in #5627
- fix(openai realtime): reject pending response future on error event by @longcw in #5576
- feat(inference): propagate STT extra to SpeechData.metadata by @russellmartin-livekit in #5639
- Update README.md by @theomonnom in #5640
- fix(amd): reset timer for late stt transcript by @chenghao-mou in #5637
- fix: end Runway realtime sessions on shutdown by @robinandeer in #5623
- ci(examples): add deploy workflow by @tinalenguyen in #5641
- feat(amd): add remote session event for amd AGT-2828 by @chenghao-mou in #5621
- Add Soniox TTS plugin by @matejmarinko-soniox in #5543
- (inworld tts): add new model by @tinalenguyen in #5646
- livekit-agents@1.5.8 by @github-actions[bot] in #5647
New Contributors
- @lanazhang made their first contribution in #5626
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.7...livekit-agents@1.5.8
livekit-agents@1.5.7
What's Changed
- fix(openai): forward session.update on RealtimeModel.update_options by @longcw in #5531
- fix(transcription): seed _start_wall_time fallback in aclose by @longcw in #5532
- Fix realtime reply generation after interruption by @jayeshp19 in #5526
- fix(cartesia): Move API key from Query Params to Headers by @charlotte-zhuang in #5516
- deepgram-stt: report connection-lifetime remainder so usage matches billing by @joaquinhuigomez in #5506
- feat(room-io): add json_format option for timed transcription output by @longcw in #5472
- feat(inference): add inference_class option to LLM for priority routing by @adrian-cowham in #5517
- chore: update default model for Anthropic LLM by @royalfig in #5539
- fix(voice): pause output when user starts speaking during thinking by @longcw in #5535
- feat(openai): add gpt-5.4-mini to model registry by @xtreme-sameer-vohra in #5540
- feat(assemblyai): warn when audio stops flowing to the WebSocket by @gsharp-aai in #5504
- feat(tts): add support for timestamps in Inference by @chenghao-mou in #5534
- docs: clarify RunResult.events testing surface by @Rul1an in #5525
- feat(stt): back-date START_OF_SPEECH onset via server-provided timestamp by @gsharp-aai in #5479
- feat(aws): add auto language detection and mid-stream language switch… by @cldsime in #5435
- (release workflow): add docs job by @tinalenguyen in #5551
- (liveavatar): add video_quality param by @tinalenguyen in #5552
- Add avatartalk plugin to optional dependencies by @bcherry in #5550
- fix(soniox): emit PREFLIGHT_TRANSCRIPT for preemptive LLM generation by @octo-patch in #5553
- feat(xai): support model selection in realtime, default to grok-voice-think-fast-1.0 by @Hormold in #5548
- Remove 'distil-whisper-large-v3-en' from STTModels by @vedevpatel in #5537
- fix: don't swallow _ExitCli during shutdown by @lawrence3699 in #5519
- feat: expose provider request ids on STT/TTS/LLM spans for debugging by @longcw in #5546
- chore(openai): remove STT.with_groq constructor by @davidzhao in #5555
- chore(deps): update github actions (major) by @renovate[bot] in #5558
- feat(mcp): allow updating headers on MCPServerHTTP by @longcw in #5559
- feat(metrics): add playback_latency metric by @longcw in #5524
- feat(endpointing): expose dynamic endpointing alpha parameter (AGT-2764) by @chenghao-mou in #5491
- fix(smallestai): use close_stream signal to properly terminate STT session by @harshitajain165 in #5562
- Hotfix; Updated default Avatar ID by @hari-truviz in #5568
- fix(gemini live): use parameters instead of parameters_json_schema for raw schema function tools by @longcw in #5560
- Stuck aclose() activity leading to stuck handoff by @svacatalisan in #4649
- fix(async_toolset): respect allow_interruptions when cancelling tool calls by @longcw in #5570
- update livekit rtc to 1.1.7 by @davidzhao in #5572
- feat(mistral): add connectors provider tool & fix realtime STT custom headers by @jeanprbt in #5575
- feat(openai): expose verbosity in Responses LLM by @AlessandroElyos in #5583
- fix(mistral): use conversations API statelessly by @TheCodingCvrlo in #5586
- support LIVEKIT_AGENT_NAME env var by @theomonnom in #5571
- fix(recorder): use libopus when possible by @chenghao-mou in #5579
- docs: add LIVEKIT_AGENT_NAME to environment variables by @detail-app[bot] in #5599
- fix(elevenlabs): use audio_format query param for STT realtime by @longcw in #5574
- fix: clear stale paused speech state across generation steps by @longcw in #5594
- fix: cancel Runway realtime sessions on shutdown by @robinandeer in #5612
- fix(inference): skip unknown message warning and rename event name by @chenghao-mou in #5614
- feat: add SLNG plugin for STT and TTS by @metehan-slng in #5249
- livekit-agents@1.5.7 by @github-actions[bot] in #5615
New Contributors
- @charlotte-zhuang made their first contribution in #5516
- @xtreme-sameer-vohra made their first contribution in #5540
- @Rul1an made their first contribution in #5525
- @cldsime made their first contribution in #5435
- @octo-patch made their first contribution in #5553
- @vedevpatel made their first contribution in #5537
- @lawrence3699 made their first contribution in #5519
- @svacatalisan made their first contribution in #4649
- @AlessandroElyos made their first contribution in #5583
- @TheCodingCvrlo made their first contribution in #5586
- @detail-app[bot] made their first contribution in #5599
- @metehan-slng made their first contribution in #5249
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.6...livekit-agents@1.5.7
livekit-agents@1.5.6
What's Changed
- Add Qwen 3 TTS support for Simplismart-livekit plugin by @simplipratik in #5474
- Add Inworld STT provider to livekit-plugins-inworld by @cshape in #5451
- (minimax): add new TTS models by @tinalenguyen in #5518
- feat(smallestai): add Pulse STT with real-time streaming and batch transcription by @harshitajain165 in #5312
- feat(avatar): add playback_started RPC for remote avatar workers by @longcw in #5511
- fix: clear _hist buffer in MovingAverage.reset() to prevent stale averages by @kuishou68 in #5522
- feat(mistral): migrate LLM to Conversations API with provider tools support by @jeanprbt in #5527
- livekit-agents@1.5.6 by @github-actions[bot] in #5528
New Contributors
- @simplipratik made their first contribution in #5474
- @harshitajain165 made their first contribution in #5312
- @kuishou68 made their first contribution in #5522
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.5...livekit-agents@1.5.6
livekit-agents@1.5.5
What's Changed
- feat(inference): STT diarization capabilities and speaker_id on TimedString, add xAI TTS support for inference by @russellmartin-livekit in #5438
- [inworld] timed_string to no longer have trailing spaces by @ianbbqzy in #5470
- fix(examples): update e2ee.py to use encryption kwarg and env var by @aryeila in #5469
- chore(deps): update dependency pillow to v12.2.0 [security] by @renovate[bot] in #5440
- fix(tests): update preemptive_generation mock to use dict by @longcw in #5468
- fix(telemetry): bound OTel provider shutdown to avoid watchdog kills by @theomonnom in #5471
- feat(assemblyai): log connection lifecycle, silence, and session correlators by @dlange-aai in #5476
- fix: strip markdown emphasis adjacent to punctuation by @carschandler in #5481
- (aws realtime): add expiry check for cached credentials by @tinalenguyen in #5485
- (hedra): note deprecation in readme by @tinalenguyen in #5475
- (deepgram sttv2): add flux-general-multi support by @tinalenguyen in #5486
- (xai stt): expose endpointing param to user by @tinalenguyen in #5493
- fix(room-io): ownership-aware FrameProcessor lifecycle management by @longcw in #5467
- (openai responses): drop prompt_cache_retention in received responses by @tinalenguyen in #5502
- feat(avatar): add AvatarSession base class, warn on sync mis-wire by @longcw in #5499
- livekit-agents@1.5.5 by @github-actions[bot] in #5503
New Contributors
- @carschandler made their first contribution in #5481
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.4...livekit-agents@1.5.5
livekit-agents@1.5.4
New features
Preemptive generation: added more granular options
Refines default behavior for preemptive generation to better handle long or intermittent user speech, reducing unnecessary downstream inference and associated cost increases.
Also introduces PreemptiveGenerationOptions for developers who need fine-grained control over this behavior.
class PreemptiveGenerationOptions(TypedDict, total=False):
    """Configuration for preemptive generation."""

    enabled: bool
    """Whether preemptive generation is enabled. Defaults to ``True``."""

    preemptive_tts: bool
    """Whether to also run TTS preemptively before the turn is confirmed.
    When ``False`` (default), only LLM runs preemptively; TTS starts once the
    turn is confirmed and the speech is scheduled."""

    max_speech_duration: float
    """Maximum user speech duration (s) for which preemptive generation
    is attempted. Beyond this threshold, preemptive generation is skipped
    since long utterances are more likely to change and users may expect
    slower responses. Defaults to ``10.0``."""

    max_retries: int
    """Maximum number of preemptive generation attempts per user turn.
    The counter resets when the turn completes. Defaults to ``3``."""
What's Changed
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.3...livekit-agents@1.5.4
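To make the PreemptiveGenerationOptions fields from the 1.5.4 notes concrete, here is a minimal, self-contained sketch of how the documented defaults and limits could interact. The TypedDict is re-declared locally so the snippet runs without the library. `DEFAULTS`, `resolve`, and `should_attempt` are hypothetical helpers written for this example and are not the library's implementation. How the options are passed to a session is not shown in these notes, so that step is left out.

```python
from typing import TypedDict


class PreemptiveGenerationOptions(TypedDict, total=False):
    """Local copy of the fields listed in the release notes."""

    enabled: bool
    preemptive_tts: bool
    max_speech_duration: float
    max_retries: int


# Default values as stated in the docstrings of the release notes.
DEFAULTS: PreemptiveGenerationOptions = {
    "enabled": True,
    "preemptive_tts": False,
    "max_speech_duration": 10.0,
    "max_retries": 3,
}


def resolve(options: PreemptiveGenerationOptions) -> PreemptiveGenerationOptions:
    """Fill in any field the caller left out (hypothetical helper)."""
    merged: PreemptiveGenerationOptions = {**DEFAULTS, **options}
    return merged


def should_attempt(
    options: PreemptiveGenerationOptions,
    speech_duration: float,
    attempts_so_far: int,
) -> bool:
    """Assumed gate: skip when disabled, when the user has spoken longer
    than the limit, or when the per-turn attempt budget is used up."""
    opts = resolve(options)
    return (
        opts["enabled"]
        and speech_duration <= opts["max_speech_duration"]
        and attempts_so_far < opts["max_retries"]
    )


if __name__ == "__main__":
    custom: PreemptiveGenerationOptions = {"max_speech_duration": 5.0}
    print(should_attempt(custom, speech_duration=6.2, attempts_so_far=0))
    print(should_attempt({}, speech_duration=4.0, attempts_so_far=1))
```

Because the TypedDict is declared with `total=False`, callers only set the fields they want to change, and everything else falls back to the defaults.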