Speaker Recognition

I have audio and time-stamped transcripts of 300 different people speaking for about 15 minutes. I need code that takes an audio file as input and outputs which of the 300 people is speaking, along with an estimated confidence level.

The solution must run on Linux.
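
To illustrate the kind of output I am after, here is a rough sketch of the final scoring step only. It assumes each recording has already been turned into a fixed-length speaker embedding by some model (not shown here, and the choice of model is open), and that each of the 300 people has one averaged "enrolled" embedding. The names `identify`, `enrolled`, and `temperature`, and the toy 3-dimensional vectors, are made up for the example. The confidence is just a softmax over cosine similarities, so it is uncalibrated and would need tuning on held-out data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(embedding, enrolled, temperature=0.1):
    """Return (speaker_id, confidence) for one utterance embedding.

    `enrolled` maps a speaker id to that speaker's reference embedding.
    `temperature` is a hypothetical tuning knob: smaller values make the
    softmax sharper. The confidence is NOT a calibrated probability.
    """
    ids = list(enrolled)
    scores = [cosine(embedding, enrolled[s]) / temperature for s in ids]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(ids)), key=probs.__getitem__)
    return ids[best], probs[best]


# Toy data standing in for real embeddings (assumption: 3 speakers, 3 dims).
enrolled = {
    "speaker_001": [1.0, 0.0, 0.0],
    "speaker_002": [0.0, 1.0, 0.0],
    "speaker_003": [0.0, 0.0, 1.0],
}
speaker, confidence = identify([0.9, 0.1, 0.0], enrolled)
print(speaker, round(confidence, 3))
```

In the real solution the same call would run against all 300 enrolled speakers, with the embedding computed from the input audio file.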

Thanks