Input audio formats?


Which input audio formats (wav, flac, opus, mp3, etc.) does SpeechBrain support? I couldn’t find any information about it in the documentation. If there is any, please point me to it.
Also, can we specify on-the-fly conversion of audio files in the “wav” (path) field of the data files, i.e. ( | <sox/ffmpeg conversion> - )?

Many thanks,

SpeechBrain relies on TorchAudio, so it accepts any format supported by TorchAudio and the backend you have installed. You can do the conversion in the data pipeline function:

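    import speechbrain as sb

    def audio_pipeline(mp3):
        # Do your conversion here, e.g. decode `mp3` to a WAV file.
        wav = mp3  # placeholder: replace with the path of the converted file
        sig = sb.dataio.dataio.read_audio(wav)
        return sig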
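
In case it helps, here is a rough sketch of what the "do your conversion here" step could look like if you shell out to ffmpeg. This is only one possible approach, not something SpeechBrain provides: the helper names `ffmpeg_to_wav_cmd` and `convert_to_wav` are made up for illustration, the 16 kHz mono target is an assumption, and it requires the `ffmpeg` binary to be on your PATH.

```python
import subprocess
import tempfile
from pathlib import Path


def ffmpeg_to_wav_cmd(src, dst, sample_rate=16000):
    """Build the ffmpeg command line (hypothetical helper, not part of SpeechBrain)."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-loglevel", "error",     # only print errors
        "-i", str(src),           # input file in any format ffmpeg can decode
        "-ac", "1",               # downmix to mono (assumed target)
        "-ar", str(sample_rate),  # resample (16 kHz assumed by default)
        str(dst),
    ]


def convert_to_wav(src, sample_rate=16000):
    """Decode `src` into a WAV file in a fresh temporary directory and return its path."""
    tmp_dir = Path(tempfile.mkdtemp())
    dst = tmp_dir / (Path(src).stem + ".wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(ffmpeg_to_wav_cmd(src, dst, sample_rate), check=True)
    return str(dst)
```

Inside `audio_pipeline` you would then pass the result of `convert_to_wav(mp3)` to `sb.dataio.dataio.read_audio`. Note that the temporary files are not cleaned up in this sketch, and converting on every access is slow, so converting the dataset once up front is probably the better option for anything large.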