Using a pretrained Croatian model for speech recognition

Hello,
I am new to this area, so I was wondering if you could give me a more detailed example of how to use pretrained models in Python, such as facebook/wav2vec2-base-10k-voxpopuli-ft-hr on Hugging Face.
When I try to run the example code I get the following error:
ValueError: BuilderConfig hr not found. Available: ['ab', 'ar', 'as', 'br', 'ca', 'cnh', 'cs', 'cv', 'cy', 'de', 'dv', 'el', 'en', 'eo', 'es', 'et', 'eu', 'fa', 'fi', 'fr', 'fy-NL', 'ga-IE', 'hi', 'hsb', 'hu', 'ia', 'id', 'it', 'ja', 'ka', 'kab', 'ky', 'lg', 'lt', 'lv', 'mn', 'mt', 'nl', 'or', 'pa-IN', 'pl', 'pt', 'rm-sursilv', 'rm-vallader', 'ro', 'ru', 'rw', 'sah', 'sl', 'sv-SE', 'ta', 'th', 'tr', 'tt', 'uk', 'vi', 'vot', 'zh-CN', 'zh-HK', 'zh-TW']
Also, how could I use this pretrained model with my own data (without downloading 56 GB)?

Hi, I am not sure I understand your question properly. Your link points to a model from Facebook, not SpeechBrain.

Oh, I thought I could use any Hugging Face pretrained model. Can you tell me, then, whether there is a pretrained model for the Croatian language that I could use for speech-to-text tasks with SpeechBrain?

I don’t think so :frowning: You could build one by starting from a pretrained wav2vec 2.0 (w2v2) model, XLSR or VoxPopuli for instance, and then fine-tuning it on a few hours of data.
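
On the original question about running the Hugging Face checkpoint on your own recordings: the ValueError appears to come from the dataset-loading step of the example rather than from the model, since the listed configs look like Common Voice language codes and `hr` is not among them. Below is a minimal sketch, outside SpeechBrain, that skips the dataset and transcribes a local audio file instead. It assumes `torch`, `torchaudio`, and `transformers` are installed, that the checkpoint ships a tokenizer so that `Wav2Vec2Processor` can load it, and that the model expects 16 kHz mono input. The `transcribe` helper and the file name in the usage note are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Checkpoint name taken from the question above.
MODEL_ID = "facebook/wav2vec2-base-10k-voxpopuli-ft-hr"
# Assumption: like other wav2vec 2.0 checkpoints, this one expects 16 kHz audio.
TARGET_SAMPLE_RATE = 16_000


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one local audio file with greedy CTC decoding (hypothetical helper)."""
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")

    # Heavy imports are deferred so the path check above fails fast.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Only the model weights and tokenizer files are downloaded here, no dataset.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # torchaudio returns a (channels, frames) tensor plus the file's sample rate.
    waveform, sample_rate = torchaudio.load(str(path))
    waveform = waveform.mean(dim=0)  # mix down to mono
    if sample_rate != TARGET_SAMPLE_RATE:
        waveform = torchaudio.functional.resample(
            waveform, sample_rate, TARGET_SAMPLE_RATE
        )

    inputs = processor(
        waveform.numpy(), sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: most likely token per frame, then CTC collapse in the tokenizer.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("my_recording.wav")` on a short clip should return the recognized text as a string. For long recordings it is probably safer to split the audio into shorter segments first, since memory use grows with the clip length.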