When we train and finetune ASR models on datasets of very different sizes, it is much handier to think in terms of iterations (weight updates) rather than epochs.
Let's say we have a 5,000-hour training dataset and a 100-hour finetuning dataset. With a batch size of 1024 samples, that works out to something like 40 and 400 training epochs, respectively, which is less convenient than working in training iterations, where both runs can simply be specified as 200,000 iterations.
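For reference, here is a back-of-the-envelope epoch-to-iteration conversion; the 3.5 s average utterance length is purely an assumption for illustration, and the exact figures depend on the real length distribution:

```python
# Rough epoch <-> iteration conversion.
# The 3.5 s average utterance length is an assumed value, not a measured one.
def iterations_per_epoch(dataset_hours, avg_utt_seconds=3.5, batch_size=1024):
    """Approximate number of weight updates in one pass over the data."""
    n_utterances = dataset_hours * 3600 / avg_utt_seconds
    return n_utterances / batch_size

per_epoch = iterations_per_epoch(5000)
print(round(per_epoch))            # ~5,000 weight updates per epoch
print(round(200_000 / per_epoch))  # ~40 epochs for a 200k-iteration budget
```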
What I think needs to be done:
Add an iteration counter to the core Brain class.
Add a maximum number of iterations option to the YAML configs.
Exit the training loop once that maximum is reached (rough sketch below).
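A minimal sketch of how this could look; this is not the actual speechbrain.core.Brain implementation, and the names `self.iteration` and `max_iterations` are placeholders I made up for illustration:

```python
# Sketch only: a Brain-like training loop with an iteration budget.
# `self.iteration` and the `max_iterations` hparam are hypothetical names.
class Brain:
    def __init__(self, hparams):
        self.hparams = hparams
        # Global counter of weight updates, kept alongside the epoch counter.
        self.iteration = 0
        # Would come from the YAML config, e.g. `max_iterations: 200000`;
        # None keeps the current epoch-based behaviour.
        self.max_iterations = hparams.get("max_iterations")

    def fit_batch(self, batch):
        # ... forward pass, loss, backward, optimizer step ...
        self.iteration += 1

    def fit(self, epoch_counter, train_set):
        for epoch in epoch_counter:
            for batch in train_set:
                self.fit_batch(batch)
                # Stop mid-epoch once the iteration budget is exhausted.
                if self.max_iterations is not None and self.iteration >= self.max_iterations:
                    return
```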
What do you think?