Modern speech recognition is clunky and often requires massive amounts of annotations and transcriptions to help understand what you're referencing. There might, however, be a more natural way: teaching the algorithms to recognize things much like yo…