AMI Diarization

Hello! I want to run speaker diarization on an audio file and get the speaker-wise conversation as text. Is this possible using SpeechBrain?
Continuing the question from GitHub: I downloaded the AMI manual annotations v1.6.2 from http://groups.inf.ed.ac.uk/, added them to a folder named amicorpus, and changed the paths in the YAML as follows:

data_folder: /content/drive/MyDrive/amicorpus
manual_annot_folder: /miniscratch/nauman/ami_public_manual/
output_folder: results/ami/sd_xvector/
save_folder: !ref <output_folder>/save
device: 'cuda:0'

I am still getting the same error from experiment.py:
UnboundLocalError: local variable 'out_rttm_dir' referenced before assignment
I'm not sure whether I'm proceeding correctly. Kindly help.


Hi,

You also need to download the ground-truth annotations, which are used for the oracle VAD setup and the final evaluation.

You can download the annotations (AMI manual annotations v1.6.2) from here: AMI Corpus Download

Please change your manual_annot_folder path accordingly.

Hello Nauman, and vr14:

I’m in a similar situation; I believe I’ve followed the setup instructions diligently, but I’m also getting the same error when running the experiment. I’ll try to enumerate my relevant state below:

OS: Ubuntu 20.04, with CUDA installed and verified working.
AMI dataset lives at: ~/dev/datasets/ami, with a tree structure of:

(py38) demarco@demarco-ubuntu:~/dev/datasets/ami$ tree . --filelimit 27
.
├── corpus
│   └── ES2002a
│       ├── audio
│       │   └── ES2002a.Mix-Headset.wav
│       └── video
│           ├── ES2002a.Closeup1.avi
│           ├── ES2002a.Closeup2.avi
│           ├── ES2002a.Closeup3.avi
│           ├── ES2002a.Closeup4.avi
│           ├── ES2002a.Corner.avi
│           └── ES2002a.Overhead.avi
└── manual_annot
    ├── 00README_MANUAL.txt
    ├── abstractive [142 entries exceeds filelimit, not opening dir]
    ├── AMI-metadata.xml
    ├── argumentation
    │   ├── ae [376 entries exceeds filelimit, not opening dir]
    │   ├── ar [94 entries exceeds filelimit, not opening dir]
    │   └── dis [95 entries exceeds filelimit, not opening dir]
    ├── configuration
    │   ├── amiConfig.xml
    │   └── amiSubConfig.xml
    ├── corpusdoc [73 entries exceeds filelimit, not opening dir]
    ├── corpusResources
    │   ├── meetings.xml
    │   └── participants.xml
    ├── decision
    │   └── manual [47 entries exceeds filelimit, not opening dir]
    ├── dialogueActs [695 entries exceeds filelimit, not opening dir]
    ├── disfluency [160 entries exceeds filelimit, not opening dir]
    ├── extractive [274 entries exceeds filelimit, not opening dir]
    ├── focus [56 entries exceeds filelimit, not opening dir]
    ├── handGesture [61 entries exceeds filelimit, not opening dir]
    ├── headGesture [173 entries exceeds filelimit, not opening dir]
    ├── LICENCE.txt
    ├── manifest_1.7.html
    ├── MANIFEST_MANUAL.txt
    ├── movement [498 entries exceeds filelimit, not opening dir]
    ├── namedEntities [468 entries exceeds filelimit, not opening dir]
    ├── ontologies
    │   ├── ae-types.xml
    │   ├── ap-types.xml
    │   ├── ar-types.xml
    │   ├── chunks.xml
    │   ├── chunk-types.xml
    │   ├── da-types.xml
    │   ├── default-topics.xml
    │   ├── dis-types.xml
    │   ├── dsfl-types.xml
    │   ├── floor-types.xml
    │   ├── foa-targets.xml
    │   ├── leg-targets.xml
    │   ├── ne-types.xml
    │   ├── rse-types.xml
    │   ├── rsr-types.xml
    │   ├── subj-types.xml
    │   └── you-types.xml
    ├── participantRoles
    │   ├── ES2002d.A.role.xml
    │   ├── ES2002d.B.role.xml
    │   ├── ES2002d.C.role.xml
    │   ├── ES2002d.D.role.xml
    │   ├── ES2008b.A.role.xml
    │   ├── ES2008b.B.role.xml
    │   ├── ES2008b.C.role.xml
    │   ├── ES2008b.D.role.xml
    │   ├── ES2008d.A.role.xml
    │   ├── ES2008d.B.role.xml
    │   ├── ES2008d.C.role.xml
    │   ├── ES2008d.D.role.xml
    │   ├── ES2009d.A.role.xml
    │   ├── ES2009d.B.role.xml
    │   ├── ES2009d.C.role.xml
    │   ├── ES2009d.D.role.xml
    │   ├── IS1003d.A.role.xml
    │   ├── IS1003d.B.role.xml
    │   ├── IS1003d.C.role.xml
    │   └── IS1003d.D.role.xml
    ├── participantSummaries [323 entries exceeds filelimit, not opening dir]
    ├── resource.xml
    ├── segments [687 entries exceeds filelimit, not opening dir]
    ├── topics [139 entries exceeds filelimit, not opening dir]
    ├── words [687 entries exceeds filelimit, not opening dir]
    └── youUsages [63 entries exceeds filelimit, not opening dir]

30 directories, 54 files

Note: I downloaded only the Edinburgh ES2002a meeting, selecting "Low-size DivX videos" and "Headset mix".

I installed extra_requirements.txt and updated hparams/ecapa_tdnn.yaml as follows:

seed: 1234
__set_seed: !apply:torch.manual_seed [!ref <seed>]

# Folders
# data: http://groups.inf.ed.ac.uk/ami/download/
data_folder: /home/demarco/dev/datasets/ami/corpus/
manual_annot_folder: /home/demarco/dev/datasets/ami/manual_annot/
output_folder: results/ami/sd_ecapa_tdnn
save_folder: !ref <output_folder>/save
device: 'cuda:0'

I changed only the data_folder and manual_annot_folder paths, but I also removed a trailing / from output_folder; under a debugger I noticed that full_ref_rttm_file had a double // in its path. I'm unclear whether that could negatively affect things.
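For what it's worth, on POSIX systems a doubled slash (other than a leading one) resolves the same as a single slash, so the // is almost certainly harmless. It is easy to see where it comes from and how to avoid it (the paths below are just the values from this thread):

```python
import os

# A trailing "/" in output_folder plus string concatenation yields "//":
path = "results/ami/sd_ecapa_tdnn/" + "/sys_output.rttm"
print(path)                    # results/ami/sd_ecapa_tdnn//sys_output.rttm

# normpath collapses the repeated separator after the fact:
print(os.path.normpath(path))  # results/ami/sd_ecapa_tdnn/sys_output.rttm

# os.path.join is the safer way to build the path in the first place:
print(os.path.join("results/ami/sd_ecapa_tdnn", "sys_output.rttm"))
```

So removing the trailing slash (or normalizing the path) is tidy, but it is unlikely to be the cause of the error here.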

My error output is identical:

(py38) demarco@demarco-ubuntu:~/dev/when/ml/speechbrain/recipes/AMI/Diarization$ python experiment.py hparams/ecapa_tdnn.yaml 
speechbrain.core - Beginning experiment!
speechbrain.core - Experiment folder: results/ami/sd_ecapa_tdnn
speechbrain.pretrained.fetching - Fetch embedding_model.ckpt: Using existing file/symlink in results/ami/sd_ecapa_tdnn/save/embedding_model.ckpt.
speechbrain.utils.parameter_transfer - Loading pretrained files for: embedding_model
__main__ - Tuning for nn (Multiple iterations over AMI Dev set)
__main__ - Diarizing dev set
0it [00:00, ?it/s]
speechbrain.core - Exception:
Traceback (most recent call last):
  File "experiment.py", line 443, in <module>
    best_nn = dev_nn_tuner(full_csv, "dev")
  File "experiment.py", line 292, in dev_nn_tuner
    concate_rttm_file = diarize_dataset(
  File "experiment.py", line 229, in diarize_dataset
    concate_rttm_file = out_rttm_dir + "/sys_output.rttm"
UnboundLocalError: local variable 'out_rttm_dir' referenced before assignment

I’m not sure how to proceed. Any help you can offer would be greatly appreciated. Thank you in advance for your help and your contributions to this excellent library.

@nauman I am getting the same error output. Looking for an answer to the above.

Hi,
This mainly occurs when the paths are not set properly. Maybe I can add a check for this in the code itself. Let me try to reproduce the error to confirm; I will get back in a couple of days. Meanwhile, if anyone finds the answer, please feel free to share it here or create a PR on GitHub.