Isolating a particular person's speech in a group conversation

Hi, I am new to speech recognition.

I was wondering whether I can use the SpeechBrain API to 1) isolate a particular person's speech from a group conversation, and 2) then run speech-to-text on it, so that I get a transcript of only the person I am interested in.

May I know how I should go about it?

Should I run the WAV file through 1) speaker diarization to separate the speakers, then 2) speaker identification to find the particular individual I am interested in, and finally 3) speech recognition on that person's segments?
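
To make the question concrete, here is a rough sketch of how I picture step 2 feeding step 3. It is not the API of any library: every name below (`Segment`, `pick_target_speaker`, `target_segments`, the `threshold` value) is made up for illustration. I am assuming that diarization gives me time segments with anonymous speaker labels, that I can get one embedding vector per label from some speaker-embedding model, and that I have an enrollment embedding from a clean sample of the target person.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # anonymous label from diarization, e.g. "spk0" (assumed format)
    start: float   # segment start time in seconds
    end: float     # segment end time in seconds


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_target_speaker(speaker_embeddings, enrolled_embedding, threshold=0.5):
    """Return the diarization label closest to the enrolled voice.

    Returns None when no label scores above the threshold. The default
    threshold is an arbitrary placeholder and would need tuning on real data.
    """
    best_label, best_score = None, threshold
    for label, embedding in speaker_embeddings.items():
        score = cosine(embedding, enrolled_embedding)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def target_segments(segments, label):
    """Keep only the segments attributed to the chosen speaker."""
    return [s for s in segments if s.speaker == label]


# Toy data standing in for real diarization output and real embeddings.
segments = [
    Segment("spk0", 0.0, 2.5),
    Segment("spk1", 2.5, 6.0),
    Segment("spk0", 6.0, 7.2),
    Segment("spk1", 7.2, 9.0),
]
speaker_embeddings = {"spk0": [1.0, 0.0, 0.0], "spk1": [0.0, 1.0, 0.1]}
enrolled_embedding = [0.0, 0.9, 0.2]

label = pick_target_speaker(speaker_embeddings, enrolled_embedding)
to_transcribe = target_segments(segments, label)
```

The segments left in `to_transcribe` would then be cut out of the WAV file and passed to the speech recognizer in step 3.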

I hope you can share some advice, please.