5 changes: 5 additions & 0 deletions .changeset/lemonslice-audio-output-order.md
@@ -0,0 +1,5 @@
---
'@livekit/agents-plugin-lemonslice': patch
---

fix(lemonslice): bind avatar audio output before starting upstream session
12 changes: 7 additions & 5 deletions plugins/lemonslice/src/avatar.ts
@@ -186,8 +186,8 @@ export class AvatarSession extends voice.AvatarSession {
*
* This method:
* 1. Creates a LiveKit token for the avatar participant
- * 2. Calls the LemonSlice API to start the avatar session
- * 3. Configures the agent's audio output to stream to the avatar
+ * 2. Configures the agent's audio output to stream to the avatar
+ * 3. Calls the LemonSlice API to start the avatar session
*
* @param agentSession - The agent session to connect to the avatar
* @param room - The LiveKit room where the avatar will join
@@ -249,9 +249,8 @@ export class AvatarSession extends voice.AvatarSession {

const livekitToken = await at.toJwt();

- this.#logger.debug('starting avatar session');
- const sessionId = await this.startAgent(livekitUrl, livekitToken);
-
+ // Bind audio output before the upstream HTTP call so subsequent generations route to
+ // the avatar identity while DataStreamAudioOutput waits for the video track.
agentSession.output.audio = new voice.DataStreamAudioOutput({
room,
destinationIdentity: this.avatarIdentity,
@@ -260,6 +259,9 @@ export class AvatarSession extends voice.AvatarSession {
waitPlaybackStart: true,
});

+ this.#logger.debug('starting avatar session');
+ const sessionId = await this.startAgent(livekitUrl, livekitToken);
Comment on lines 259 to +263
🔴 Audio output left in broken state if startAgent HTTP call fails

The reordering assigns agentSession.output.audio a new DataStreamAudioOutput (line 254) before the startAgent HTTP call (line 263). If startAgent throws, either an APIStatusError for a non-retryable error (line 319) or an APIConnectionError after all retries are exhausted (line 336), the exception propagates out of start(), but agentSession.output.audio has already been replaced with a DataStreamAudioOutput targeting an avatar participant that will never join the room. That DataStreamAudioOutput then waits indefinitely in waitForParticipant (agents/src/voice/avatar/datastream_io.ts:128) for a participant that never connects, so all subsequent audio output hangs. The previous code, like all other avatar plugins (hedra, trugen, bey, anam, tavus, runway), set the audio output only after the upstream session was created successfully, which avoids this corrupted state on failure.

(Refers to lines 254-263)

Prompt for agents
The audio output is assigned before startAgent, so if startAgent throws, agentSession.output.audio is left pointing to a DataStreamAudioOutput for a non-existent avatar participant. The fix should save the original audio output before the assignment and restore it in a catch block if startAgent fails, then rethrow. Something like:

const previousAudio = agentSession.output.audio;
agentSession.output.audio = new voice.DataStreamAudioOutput({...});
try {
  const sessionId = await this.startAgent(livekitUrl, livekitToken);
  return sessionId;
} catch (e) {
  agentSession.output.audio = previousAudio;
  throw e;
}

Relevant files: plugins/lemonslice/src/avatar.ts (start method, lines 198-266), agents/src/voice/avatar/datastream_io.ts (DataStreamAudioOutput constructor and _start method).
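
The save-and-restore idea above can also be sketched as a small standalone helper. This is only an illustration under stated assumptions: `AudioOutput`, `SessionLike`, and `bindAudioThenStart` are hypothetical stand-ins invented for this sketch, not types or functions from `@livekit/agents`, and the real `start()` method may need additional cleanup that is not shown here.

```typescript
// Hypothetical stand-ins for illustration only; not the real @livekit/agents types.
interface AudioOutput {
  readonly label: string;
}

interface SessionLike {
  output: { audio: AudioOutput | null };
}

// Assign `next` as the audio output, run `start`, and roll back to the
// previous output if `start` rejects. The original error is rethrown.
async function bindAudioThenStart<T>(
  session: SessionLike,
  next: AudioOutput,
  start: () => Promise<T>,
): Promise<T> {
  const previous = session.output.audio;
  session.output.audio = next;
  try {
    return await start();
  } catch (err) {
    // Roll back only if the output is still the one assigned above,
    // so a replacement made by other code in the meantime is not clobbered.
    if (session.output.audio === next) {
      session.output.audio = previous;
    }
    throw err;
  }
}
```

In this sketch, the caller would pass the new output object and a closure wrapping the upstream call, so the early binding from the diff is kept while a failed start leaves the session with its earlier audio output.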


return sessionId;
}
