@entn-at
Contributor

entn-at commented May 17, 2018

  • Add missing data prep scripts for MUSAN for callhome_diarization
  • Copy vad.scp (required by vad_to_segments.sh for SRE data) and segments (required by extract_xvectors.sh) to the *_cmn data folders after prepare_feats.sh. Without the segments file, the call to fix_data_dir.sh after prepare_feats.sh also fails.
  • Fix check before create_split_dir.pl
Add missing data prep scripts for MUSAN for callhome_diarization; Copy vad.scp and segments to *_cmn data folders after prepare_feats; Fix check before create_split_dir
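
As a rough sketch of the second item in the description above (not the code from this PR), the copy step could look something like the following. The function name `copy_cmn_extras` and the directory names in the usage line are hypothetical; only the file names vad.scp and segments come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): after feature preparation,
# copy the files that later stages expect into the matching *_cmn directory.
copy_cmn_extras() {
  local src_dir=$1   # original data directory
  local cmn_dir=$2   # directory written by the feature-preparation step
  local f
  mkdir -p "$cmn_dir"
  for f in vad.scp segments; do
    # Not every data set has both files, so copy only what exists.
    if [ -f "$src_dir/$f" ]; then
      cp "$src_dir/$f" "$cmn_dir/"
    fi
  done
}
```

Assumed usage: `copy_cmn_extras data/sre data/sre_cmn`, run before the data directory is validated, so that a missing segments file no longer trips the check.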
@david-ryan-snyder
Contributor

david-ryan-snyder commented May 18, 2018

Thanks! Is this ready to merge, @entn-at?

@entn-at
Contributor Author

entn-at commented May 18, 2018

Yes, it's ready to merge! I ran the entire recipe (DER: 7.47%, but I didn't use SWB2-Ph1)

@david-ryan-snyder
Contributor

Thanks! @danpovey, could you merge this when you get a chance?

@entn-at, is the DER: 7.47% using the tuned stopping threshold or the oracle number of speakers?

@entn-at
Contributor Author

entn-at commented May 18, 2018

The DER was 7.47% using the oracle number of speakers and 8.66% using the tuned stopping threshold. Both are higher than the numbers given in run.sh, but I assume that's because I didn't use Switchboard-2 Phase-1 as part of the training data and therefore had fewer speakers (4623) and fewer utterances for training the x-vector model.

@david-ryan-snyder
Contributor

Cool, thanks for the info.

@danpovey danpovey merged commit 2ad8d78 into kaldi-asr:master May 18, 2018
@entn-at entn-at deleted the fix-xvector-scripts branch May 18, 2018 22:09
dpriver pushed a commit to dpriver/kaldi that referenced this pull request Sep 13, 2018
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018