Code for the paper "Significance of neural phonotactic models for large-scale spoken language identification" (IJCNN 2017)

This package performs language identification over spoken utterances. The following steps explain how to set up the package, train a model, and test it on unseen utterances. Please read the paper linked above for more details.
The setup requires RNNLM and SRILM. Please download and install both before proceeding to install this package.

- RNNLM: untar `src/rnnlm-0.4b.tgz` and run `make` inside the extracted folder to build the RNNLM package. Update the RNNLM path in `src/train_phonotactics.sh` and `src/test.sh`.
- SRILM: follow the installation instructions on the SRILM website.
Finally, make sure `ngram-count`, `ngram`, and `rnnlm` are in your PATH.
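As a quick sanity check (a sketch, not part of the package), a few lines of Python can confirm that the three external binaries are visible on your PATH; `shutil.which` returns `None` for anything missing:

```python
import shutil

# Binaries named in the setup instructions above.
required = ["ngram-count", "ngram", "rnnlm"]

# Collect any tool that cannot be resolved on PATH.
missing = [tool for tool in required if shutil.which(tool) is None]

if missing:
    print("Missing from PATH: " + ", ".join(missing))
else:
    print("All required tools found.")
```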
Next, set up the Python environment and train with the following commands:
```
cd lid-convex-comb
virtualenv venv
. venv/bin/activate
pip install -r src/requirements.txt
cd src
python train.py train.txt
```

`train.py` requires `train.txt`, which has the following format:
```
audio_file_path_1 <tab> language_tag1
audio_file_path_2 <tab> language_tag1
audio_file_path_3 <tab> language_tag1
audio_file_path_4 <tab> language_tag2
audio_file_path_5 <tab> language_tag2
audio_file_path_6 <tab> language_tag2
.
.
.
```

Create another file named `test.txt` in the same format. After training finishes, make sure the `data/langs` directory contains a unique subdirectory for each language you trained on. Then run the following commands to test on unseen data.
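If your audio corpus is organized one directory per language (a hypothetical layout; adapt the paths to your own data), a short Python helper can generate `train.txt` in the tab-separated format shown above:

```python
import os

def write_manifest(corpus_dir, out_path):
    """Write one 'audio_path<tab>language_tag' line per utterance.

    Assumes a layout of corpus_dir/<language_tag>/<audio files>,
    using each subdirectory name as the language tag.
    """
    with open(out_path, "w") as out:
        for lang in sorted(os.listdir(corpus_dir)):
            lang_dir = os.path.join(corpus_dir, lang)
            if not os.path.isdir(lang_dir):
                continue  # skip stray files at the top level
            for fname in sorted(os.listdir(lang_dir)):
                out.write("%s\t%s\n" % (os.path.join(lang_dir, fname), lang))
```

For example, `write_manifest("corpus", "train.txt")` would emit one line per audio file under `corpus/`. The same helper can produce `test.txt` from a held-out directory tree.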
```
cd src
python test.py test.txt
```

Please contact me about any issues at brijmohanlal.s [at] research [dot] iiit [dot] ac [dot] in.