Feature request. Iteration number in training loop.

Hello!

Motivation:

When we train and fine-tune ASR models on datasets of different sizes, it is much handier to think in iterations (weight updates) than in epochs.

Let’s say we have a 5000-hour dataset and a 100-hour fine-tuning dataset. With a batch size of 1024 samples, we will have 40 and 400 training epochs, respectively. That is less convenient than counting training iterations, where both runs are simply 200,000 iterations.
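To make the arithmetic concrete, here is a small sketch of how one fixed iteration budget maps to very different epoch counts. The sample counts are made up for illustration (they are not measured from any real dataset), and the helper names are hypothetical.

```python
import math


def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of weight updates needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


def epochs_for_budget(max_iterations: int, num_samples: int, batch_size: int) -> float:
    """How many epochs a fixed iteration budget corresponds to."""
    return max_iterations / iterations_per_epoch(num_samples, batch_size)


# Assumed sample counts, chosen only to keep the numbers round.
large_dataset_samples = 5_120_000
small_dataset_samples = 102_400
batch_size = 1024
budget = 200_000  # same iteration budget for both runs

large_epochs = epochs_for_budget(budget, large_dataset_samples, batch_size)
small_epochs = epochs_for_budget(budget, small_dataset_samples, batch_size)
print(large_epochs, small_epochs)
```

The iteration budget stays the same for both runs, while the epoch count has to be recomputed every time the dataset size or batch size changes.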

What, in my opinion, needs to be done:
Add an iteration counter to the core Brain class.
Add a maximum number of iterations as an option in the yaml configs.
Exit the training loop when that number of iterations is reached.
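A rough sketch of what I mean, in plain Python. This is not SpeechBrain's actual Brain class: `StepLimitedTrainer`, `max_iterations`, and the placeholder `fit_batch` are hypothetical names used only to show the counter and the early exit. In a real recipe, `max_iterations` would presumably be read from the yaml config.

```python
class StepLimitedTrainer:
    """Hypothetical trainer that stops on an iteration budget, not on epochs."""

    def __init__(self, max_iterations: int):
        self.max_iterations = max_iterations  # assumed to come from the config
        self.iteration = 0  # number of weight updates done so far
        self.epoch = 0      # still tracked, but only for reporting

    def fit_batch(self, batch):
        # Placeholder for forward pass, backward pass, and optimizer step.
        return sum(batch) / len(batch)

    def fit(self, train_set):
        if not train_set:
            raise ValueError("train_set must contain at least one batch")
        while self.iteration < self.max_iterations:
            self.epoch += 1
            for batch in train_set:
                self.fit_batch(batch)
                self.iteration += 1
                # Exit mid-epoch as soon as the budget is used up.
                if self.iteration >= self.max_iterations:
                    return


# Toy data: 4 batches per epoch, budget of 10 iterations.
toy_train_set = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
trainer = StepLimitedTrainer(max_iterations=10)
trainer.fit(toy_train_set)
print(trainer.iteration, trainer.epoch)
```

With this toy data, training stops partway through the third pass over the data, which is exactly the behaviour an epoch-based limit cannot express.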

What do you think?

I support this request :+1: Was surprised when I realized this wasn’t done already.

Alright, we have this feature in some recipes (see Transformer ASR, which reports steps). But it is true that, by default, we report epochs. If you feel comfortable enough with SpeechBrain, we would be happy to see a PR on that topic :slight_smile: It could be inspired by the recipes that already report the number of steps.