Handling disfluency/non-speech markers when training the tokenizer

This question is about handling disfluency and non-speech markers such as laugh, cough, breath, and gasp. Most of these markers are enclosed in “<” and “>”, for example “<laugh>”. In Kaldi, I would usually add them to silence_phones.txt and add the corresponding mappings to lexicon.txt. However, I am not sure whether any of the token types ([“unigram”, “bpe”, “char”]) can handle such markers for E2E ASR. What would be a potential solution here?
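
To make the desired behaviour concrete, here is a minimal sketch of a “char”-style tokenizer that keeps each marker as one token instead of splitting it into “<”, “l”, “a”, and so on. It is an illustration only, not the API of any toolkit: the function names, the marker pattern, and the “<space>” symbol are all assumptions.

```python
import re

# Assumed marker format: lowercase letters or underscores between "<" and ">".
# The real inventory depends on the transcripts.
MARKER_RE = re.compile(r"<[a-z_]+>")


def _chars(segment, space_symbol):
    """Split a marker-free segment into characters, mapping spaces to a symbol."""
    return [space_symbol if ch == " " else ch for ch in segment]


def char_tokenize(text, space_symbol="<space>"):
    """Character tokenization that keeps <...> markers as single tokens."""
    tokens = []
    pos = 0
    for match in MARKER_RE.finditer(text):
        # Ordinary text before the marker is split character by character.
        tokens.extend(_chars(text[pos:match.start()], space_symbol))
        # The marker itself is emitted as one indivisible token.
        tokens.append(match.group())
        pos = match.end()
    # Remaining text after the last marker.
    tokens.extend(_chars(text[pos:], space_symbol))
    return tokens


if __name__ == "__main__":
    print(char_tokenize("hi <laugh> ok"))
```

For the “unigram” and “bpe” types the same effect would presumably require registering each marker as an indivisible symbol when the subword model is trained, so that it is never broken into pieces.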