Hi there,
I am working on a project, and one of the steps in the pipeline is to get phonemes from audio data. I want to get the phonemes directly from the audio rather than by performing grapheme-to-phoneme conversion.
Is there a SpeechBrain pretrained model that I can use in my pipeline to get phonemes from audio data? It would be super helpful if you could point me to any other tools as well.
Thank you!