Hi, congratulations to the team for completing such an ambitious project! The timing is great too, given the sudden rise in live audio and podcasting.
I have skimmed through the docs and recipes, and I am wondering how you are looking to tackle tasks such as VAD, Speaker Change Detection, and Overlapping Speech Detection. My understanding is that, for diarisation, you currently assume the input is a dataset of speech in which speakers are non-overlapping and pre-labelled. I am particularly interested in Speaker Change Detection. Is there any work currently in that direction?
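To clarify what I mean by Speaker Change Detection: given per-window speaker embeddings, flag the boundaries where the speaker identity changes. A minimal sketch (the embedding extractor, window size, and threshold are all placeholders, not anything from your codebase) might look like:

```python
import numpy as np

def detect_changes(embeddings, threshold=0.5):
    """Flag a speaker change between consecutive windows whose
    embeddings differ by more than `threshold` in cosine distance."""
    changes = []
    for i in range(1, len(embeddings)):
        a, b = embeddings[i - 1], embeddings[i]
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        if 1.0 - cos > threshold:
            changes.append(i)  # change point before window i
    return changes

# Toy example: two "speakers" represented as distinct directions.
spk1 = np.array([1.0, 0.0])
spk2 = np.array([0.0, 1.0])
windows = [spk1, spk1, spk1, spk2, spk2]
print(detect_changes(windows))  # prints [3]
```

In practice the embeddings would come from a trained speaker model and the change points would be refined, but this is the shape of the problem I have in mind.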