Adding chime5 baseline recipe #2262
Conversation
danpovey left a comment
A few comments.
@@ -0,0 +1,50 @@
#BeamformIt sample configuration file for AMI data (http://groups.inf.ed.ac.uk/ami/download/)
Can you rename it to beamformit_chime5.cfg to clarify that it relates to CHiME-5?
egs/chime5/s5/conf/decode.config Outdated
@@ -0,0 +1,2 @@
beam=11.0 # beam for decoding. Was 13.0 in the scripts.
You should probably delete this (also from the source of these scripts) unless it's being used. I believe it is not used unless an option like "--config conf/decode.config" is given to decoding scripts.
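For illustration (a hypothetical invocation, with placeholder directory names), such a config only takes effect when it is passed explicitly to a decoding script, e.g.:

  steps/decode.sh --config conf/decode.config --nj 8 --cmd "$decode_cmd" \
    exp/tri3/graph data/dev exp/tri3/decode_dev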
@@ -0,0 +1,283 @@
#!/bin/bash

# 1e is as 1d but instead of the --proportional-shrink option, using
If you don't have letters a through d in this PR, please rename it to 1a.
Done
num_targets=$(tree-info $tree_dir/tree |grep num-pdfs|awk '{print $2}')
learning_rate_factor=$(echo "print 0.5/$xent_regularize" | python)
opts="l2-regularize=0.05"
output_opts="l2-regularize=0.01"
You may find that adding "bottleneck-dim=320" (or maybe 256) to output_opts helps.
I just put "bottleneck-dim=320" in the current script for now (I used it in my old setup, but it was removed during some merge steps). It will be tuned later.
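For reference, a minimal sketch of the change being discussed, built from the lines quoted in this diff (the choice of 320 vs. 256 is still to be tuned):

  output_opts="l2-regularize=0.01 bottleneck-dim=320"   # 256 may also be worth trying
  # ... later, inside the xconfig block, output_opts is expanded into the output layers:
  output-layer name=output-xent $output_opts dim=$num_targets learning-rate-factor=$learning_rate_factor max-change=1.5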
adir=$1
jdir=$2
dir=$3
Please add some basic checks of the inputs, so that the messages are informative if the user gives the wrong inputs.
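Something along these lines would do it (a minimal sketch; the exact messages and checks are only a suggestion, assuming adir and jdir are the audio and JSON transcription directories):

  if [ $# -ne 3 ]; then
    echo "Usage: $0 <audio-dir> <json-dir> <output-dir>" >&2
    exit 1
  fi
  adir=$1
  jdir=$2
  dir=$3
  if [ ! -d "$adir" ]; then
    echo "$0: expected audio directory '$adir' to exist" >&2
    exit 1
  fi
  if [ ! -d "$jdir" ]; then
    echo "$0: expected transcription (JSON) directory '$jdir' to exist" >&2
    exit 1
  fi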
| echo "-------------------" | ||
| echo "Maxent 3grams" | ||
| echo "-------------------" | ||
| sed 's/'${oov_symbol}'/<unk>/g' $tgtdir/train.txt | \ |
Are these needed? I notice you're not treating it as an error if LIBLBFGS is not defined. It seems to me that either it's important (in which case it should be an error if not defined), or it's not (in which case this could be deleted).
If you want to have a script that tries a bunch of LMs automatically, like this, then IMO this shouldn't be in local/; it should be a generic script called from local/.
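For concreteness, the "treat it as an error" variant could look like this (a sketch only; it assumes the MaxEnt section is kept, and that LIBLBFGS points at the liblbfgs installation used to build SRILM with MaxEnt support):

  if [ -z "${LIBLBFGS}" ]; then
    echo "$0: LIBLBFGS is not set, but it is required for the SRILM MaxEnt LMs." >&2
    echo "$0: install liblbfgs and rebuild SRILM with MaxEnt support, or drop the MaxEnt section." >&2
    exit 1
  fi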
That script is fairly general because there was a lot of experimentation with the training-set size, and yet it's not general enough, as it contains some CHiME-5-specific filtering (it is probable that the training set contains duplicate transcriptions of the same utterance, via a different channel). IMO the MaxEnt models are usually the most robust way to get an ARPA-format LM -- you don't have to mess with discounting and cut-offs -- but they are not very widely accepted, so I wanted to provide a comparison against the KN and GT baselines.
@jtrmal, for these max-ent LMs, are they the ones that are actually chosen by the script?
I believe they are quite slow to estimate, and I'm not sure how they behave when pruned.
I'd prefer to get rid of the dependency if there isn't really a good reason to use them.
egs/chime5/s5/run.sh Outdated
# chime5 main directory path
# please change the path accordingly
chime5_corpus=/export/corpora4/CHiME5
json_dir=${chime5_corpus}/data/transcriptions
Maybe json_dir=${chime5_corpus}/transcriptions?
There is no data dir in the unzipped CHiME5 directory.
fixed
egs/chime5/s5/run.sh Outdated
# please change the path accordingly
chime5_corpus=/export/corpora4/CHiME5
json_dir=${chime5_corpus}/data/transcriptions
audio_dir=${chime5_corpus}/data/audio
ditto
fixed
@danpovey @ShigekiKarita @sw005320 I'm addressing the comments and will push today -- I want to run the data preparation pipeline to make sure it works with the new corpus location.
@jtrmal @ShigekiKarita, I just fixed the wrong-path issues. I have now confirmed that it works up to stage 16.
The unaddressed comments are related to the TDNN/chain script; Shinji is still waiting for his training to finish.
Don't you have to use cuda_cmd?
@kamo-naoyuki which part are you talking about?
fi

steps/nnet3/chain/train.py --stage=$train_stage \
  --cmd="$decode_cmd" \
Sorry, here. Should $decode_cmd be replaced with $cuda_cmd?
No, it shouldn't be cuda_cmd. Y.
cuda_cmd was used for training Karel's DNNs; now CUDA is used during training in a different way.
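For context, a sketch of the convention in the nnet3/chain recipes (option names as I recall them, not something this PR changes): the GPU is requested inside the training script rather than by switching the top-level wrapper to cuda_cmd.

  # train.py accepts --use-gpu and submits its training jobs with a "--gpu 1"
  # queue option, so queue.pl/slurm.pl allocate the GPU; the wrapper given via
  # --cmd only needs to be a generic submission command such as $decode_cmd.
  steps/nnet3/chain/train.py --stage=$train_stage \
    --cmd="$decode_cmd" --use-gpu=true ...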
ok, I understood.
Yes, those are typically the ones used. The gain on the WER for this dataset is around 0.2% absolute. Also, I'm using it fairly commonly (for Babel and others) and didn't run into any problems. But given that the CHiME organizers' plan is to get this merged as soon as possible, I'm OK with deleting it. y.
it's OK, you can keep it if it helps.
@jtrmal @danpovey I'm just reporting the current status. I removed stage 13 (the lexicon update) to strictly follow the challenge regulations, and also added a location tag for future scoring. Now I'm checking whether the recipe works from scratch; it is working up to the data cleaning stage, and I will move on to the chain model. I already confirmed it works in the previous setup, and I think the check will be finished smoothly over the weekend. (I hope. I believe.)
@danpovey we've finished the recipe check and updated the latest results. If there is no problem, please merge it.
Thanks!!!!
This is joint work with @sw005320 and me.
Dan, we are still working on the chain/nnet training, but we thought we would go ahead and create the PR now, so that at least the data preparation and GMM stages can go through review.