[egs] Add recipes for Speakers in the Wild (SITW) #2422
Merged
This PR adds an i-vector recipe and an x-vector recipe for Speakers in the Wild (SITW) (http://www.speech.sri.com/projects/sitw/). As far as I know, the x-vector results are currently state-of-the-art.
The recipe is trained on VoxCeleb1 and VoxCeleb2 (http://www.robots.ox.ac.uk/~vgg/data/voxceleb/). Sixty speakers in VoxCeleb1 overlap with SITW. We remove those from VoxCeleb1 prior to training.
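To illustrate the speaker-removal step, here is a minimal sketch in Python. It is not the code from this PR. It assumes the training list is in the Kaldi `utt2spk` format (one `<utterance-id> <speaker-id>` pair per line) and that the overlapping speaker IDs are already available as a list. The function name `remove_overlapping_speakers` and the IDs in the example are hypothetical.

```python
def remove_overlapping_speakers(utt2spk_lines, overlap_speakers):
    """Return the utt2spk lines whose speaker is not in overlap_speakers.

    utt2spk_lines: iterable of "<utterance-id> <speaker-id>" strings.
    overlap_speakers: iterable of speaker IDs to exclude (assumed to be known).
    """
    overlap = set(overlap_speakers)
    kept = []
    for line in utt2spk_lines:
        parts = line.split()
        if len(parts) != 2:
            # Skip blank or malformed lines instead of guessing their meaning.
            continue
        utt, spk = parts
        if spk not in overlap:
            kept.append(f"{utt} {spk}")
    return kept


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    example = [
        "spkA-utt1 spkA",
        "spkB-utt1 spkB",
        "spkA-utt2 spkA",
    ]
    for line in remove_overlapping_speakers(example, ["spkB"]):
        print(line)
```

Filtering at the `utt2spk` level keeps the sketch short. In a real data directory the other per-utterance files (for example `wav.scp`) would need the same filtering so that everything stays consistent.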
FYI @entn-at, @danpovey, @leibny