Using OpenAI’s Realtime API with server_vad mode

I’m trying to set up the Realtime API, but I’m a little confused about how the audio-to-audio events are supposed to work in server_vad mode.

Currently I do the following:

  1. set up the Realtime client:
     const client = new RealtimeClient({ apiKey: process.env.OPENAI_API_KEY });
  2. set the client session:
     client.updateSession({
       instructions: "be nice and helpful",
       input_audio_transcription: { model: 'whisper-1' },
       turn_detection: { type: "server_vad" },
     });
  3. listen for events:
     client.on('realtime.event', (event) => {
       console.log("Realtime Event: ", event);
     });
  4. connect:
     await client.connect();
  5. send some audio:
     client.appendInputAudio(data);
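For reference, the data passed to appendInputAudio in step 5 is assumed here to be 16-bit PCM (mono, 24 kHz) supplied as an Int16Array, which is my understanding of what the reference client expects by default. A sketch of converting Float32 samples (e.g. from the Web Audio API) into that shape — floatTo16BitPCM is a hypothetical helper name, not part of the library:

```javascript
// Hypothetical helper: convert Float32 audio samples in [-1, 1]
// into a 16-bit PCM Int16Array (assumption: the Realtime API's
// default input format of pcm16, mono, 24 kHz).
function floatTo16BitPCM(float32Array) {
  const int16 = new Int16Array(float32Array.length);
  for (let i = 0; i < float32Array.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, float32Array[i]));
    int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return int16;
}

// e.g. client.appendInputAudio(floatTo16BitPCM(samples));
```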

However, I never receive any events other than the input_audio_buffer.append event. Is there another step I need to take to ensure I receive a response?
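In case it’s relevant, my understanding is that server_vad turn detection also accepts tuning fields in the session config (threshold, prefix_padding_ms, silence_duration_ms, per the Realtime API session schema). A sketch of a fuller session update — the numeric values here are only illustrative:

```javascript
// Illustrative server_vad tuning; field names from the Realtime API
// session schema, values are example numbers, not recommendations.
client.updateSession({
  turn_detection: {
    type: 'server_vad',
    threshold: 0.5,           // speech-detection activation threshold
    prefix_padding_ms: 300,   // audio kept from before detected speech
    silence_duration_ms: 500, // silence required to end the turn
  },
});
```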