I’m confused about the code in speechbrain/nnet/loss/transducer_loss.py, line 254:
```python
@staticmethod
def forward(ctx, log_probs, labels, T, U, blank, reduction):
    log_probs = log_probs.detach()
```
What is the reason for detaching log_probs here? In my understanding, detaching cuts the tensor out of the autograd graph, so the backward gradient could not flow into log_probs and the rest of the network would not be trained (see the small sketch below for what I mean).
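To make my concern concrete, here is a minimal toy sketch (not the SpeechBrain code, just an illustration of my understanding of plain autograd) showing that using a detached tensor blocks the gradient from reaching the original:

```python
import torch

x = torch.randn(3, requires_grad=True)

# Using x directly: the gradient flows back as expected.
(x ** 2).sum().backward()
print(x.grad)  # 2 * x

x.grad = None

# Using a detached copy: the graph is cut, so the loss no longer
# requires grad and no gradient would ever reach x.
loss = (x.detach() ** 2).sum()
print(loss.requires_grad)  # False
```

So I would naively expect the same thing to happen to log_probs inside the transducer loss.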
please give me some insight, thanks very much!