Fbank and MFCC modules compute an extra frame


I am new to SpeechBrain and have a pretty basic question.
I was testing out the Fbank and MFCC modules and noticed that I get an extra frame of output. For example, for a wav input of size (16, 48000) I expect an output of size (16, 300, 80) with 80-dim filterbanks, but the actual output has size (16, 301, 80). Is there a reason for this?
The sampling rate is 16 kHz, and I have set left and right frames to zero and deltas to false.

Thanks in advance!

Hi, we are investigating this.

Thanks! It seems that even with PyTorch’s Kaldi-compliant fbank module I get 298 frames for 48k samples (16 kHz sampling rate), so neither matches the 300 I expected. Neither seems to hurt model training or performance.

Well, yeah, it shouldn’t hurt performance, as we get SotA results with SpeechBrain using these features. It might be due to some padding somewhere.
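
For what it's worth, here is a rough sketch of the frame-count arithmetic that would give both numbers. It assumes a 25 ms window (400 samples) and a 10 ms hop (160 samples) at 16 kHz, which are common defaults but are not stated in this thread. It also assumes that the 301-frame case comes from a centered STFT that pads half a window on each side, and that the 298-frame case comes from Kaldi-style framing that only keeps frames fitting entirely inside the signal. The function names are made up for illustration and are not part of any library.

```python
def frames_centered(num_samples: int, n_fft: int, hop: int) -> int:
    """Frame count when the signal is padded by n_fft // 2 on both sides
    (the behaviour of a centered STFT, assumed here)."""
    padded = num_samples + 2 * (n_fft // 2)
    return 1 + (padded - n_fft) // hop


def frames_snip_edges(num_samples: int, win: int, hop: int) -> int:
    """Frame count when only windows that fit entirely inside the
    signal are kept (Kaldi-style edge snipping, assumed here)."""
    if num_samples < win:
        return 0
    return 1 + (num_samples - win) // hop


# Assumed settings: 16 kHz audio, 25 ms window, 10 ms hop.
sample_rate = 16000
win = int(0.025 * sample_rate)   # 400 samples
hop = int(0.010 * sample_rate)   # 160 samples
num_samples = 48000              # 3 seconds, as in the question

print(frames_centered(num_samples, win, hop))    # 301
print(frames_snip_edges(num_samples, win, hop))  # 298
```

Under these assumptions, the naive estimate of 48000 / 160 = 300 sits between the two: centered padding adds one frame, while edge snipping drops the frames whose window would run past the end of the signal.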