I want to generate a fast AI audio response from the user's voice input.
I heard that the Deepgram API is the best solution for this, so I integrated it into our system.
However, it takes a long time to generate the AI audio response.
const audioData = event.target.result;
// Transcribe the recorded audio with Deepgram
try {
  const sttResponse = await axios.post("https://api.deepgram.com/v1/listen", audioData, { headers: headers });
  if (sttResponse.data) {
    transcript = sttResponse.data.results.channels[0].alternatives[0].transcript;
  }
} catch (error) {
  console.error("Error while transcribing:", error); // Handle errors
}
// Get response from OpenAI GPT-4o
const completion = await openai.chat.completions.create({
  messages: messages,
  model: "gpt-4o",
  temperature: 0.2,
});
ai_response = completion.choices[0].message.content;
// Generate audio from text and play
const config = {
  headers: {
    Authorization: `Token ${process.env.DEEPGRAM_API_KEY}`,
    "Content-Type": "application/json",
  },
};
const data = {
  text: ai_response,
};
const response = await fetch("https://api.deepgram.com/v1/speak?model=aura-zeus-en", {
  method: "POST",
  headers: { ...config.headers },
  body: JSON.stringify(data),
});
if (!response.ok) {
  throw new Error(`HTTP error! Status: ${response.status}`);
}
const audioContext = new (window.AudioContext || window.webkitAudioContext)();
const source = audioContext.createBufferSource();
// Fetch the audio data as an ArrayBuffer (waits for the full response body)
const arrayBuffer = await response.arrayBuffer();
audioContext.decodeAudioData(
  arrayBuffer,
  (buffer) => {
    // Play the decoded audio through the default output
    source.buffer = buffer;
    source.connect(audioContext.destination);
    source.start(0);
  },
  (e) => {
    console.error("Error decoding audio data:", e);
  }
);
This is my current code using the Deepgram API, but I am not sure what the issue is.
I integrated the Deepgram API and the OpenAI API directly in the React.js frontend for speed, but there is still a long delay.
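To narrow down which of the three steps (transcription, GPT-4o completion, speech synthesis) contributes most of the delay, each one could be timed separately. Below is a minimal sketch, assuming a hypothetical helper `createStageTimer` that is not part of the Deepgram or OpenAI SDKs; the stage labels are also just illustrative.

```javascript
// Hypothetical helper (not part of the Deepgram or OpenAI SDKs):
// records how long each named pipeline stage takes, in milliseconds.
function createStageTimer(now = () => performance.now()) {
  const timings = {};
  return {
    // Runs an async stage, stores its duration under `label`, and returns its result.
    async time(label, fn) {
      const start = now();
      try {
        return await fn();
      } finally {
        // Recorded even if the stage throws, so failed requests are measured too.
        timings[label] = now() - start;
      }
    },
    // Returns a copy of the durations recorded so far.
    report() {
      return { ...timings };
    },
  };
}
```

For example, the completion call could be wrapped as `const completion = await timer.time("llm", () => openai.chat.completions.create({ messages, model: "gpt-4o", temperature: 0.2 }));`, with the transcription and speech requests wrapped the same way, and `console.table(timer.report())` called after playback starts.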
What are the best solutions for getting a quick AI audio response?