How can I evaluate my LM?

Hi. I have trained a language model on my own data with both RNN and Transformer architectures, as shown in the LibriSpeech recipe, but I am not sure how to compare these two models. As far as I can see, SpeechBrain has no metric for measuring LM accuracy other than the loss. What approach should I follow to compare the LMs I have trained? And how can I run inference to get the LM's predictions?
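
For reference, the only comparison I can think of is turning the evaluation loss into perplexity. Below is a minimal sketch, assuming the reported loss is the mean per-token negative log-likelihood in nats, and that both models use the same tokenizer and the same test set (otherwise the numbers would not be comparable). The function names are my own and are not part of the SpeechBrain API.

```python
import math


def perplexity(total_nll_nats: float, num_tokens: int) -> float:
    """Perplexity = exp(average negative log-likelihood per token)."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return math.exp(total_nll_nats / num_tokens)


def corpus_perplexity(batches) -> float:
    """Aggregate perplexity over (mean_loss, n_tokens) pairs.

    Each mean loss is weighted by its token count, so batches of
    different sizes contribute proportionally.
    """
    total_nll = 0.0
    total_tokens = 0
    for mean_loss, n_tokens in batches:
        total_nll += mean_loss * n_tokens
        total_tokens += n_tokens
    return perplexity(total_nll, total_tokens)


if __name__ == "__main__":
    # Hypothetical per-batch losses, for illustration only.
    rnn_batches = [(4.9, 1200), (5.1, 800)]
    transformer_batches = [(4.6, 1200), (4.8, 800)]
    print("RNN perplexity:", corpus_perplexity(rnn_batches))
    print("Transformer perplexity:", corpus_perplexity(transformer_batches))
```

If this is a reasonable way to compare them, the model with the lower perplexity on the same held-out text would be the better one, but I am not sure whether this is the recommended approach.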

Thanks.