
Conversation

@DongjiGao (Contributor) commented Jan 14, 2020

This is for the ASR part of the speech translation project. It is similar to the current GALE Arabic recipe but with more data for acoustic and language model training.

@danpovey (Contributor)

Is any of the translation stuff somehow baked into this script, i.e. are there aspects that are specific to translation? If not, I might want you to call it s5c or whatever the next number is, but you should clarify how it differs. E.g. did we do something with subwords in s5b? And how do the WERs differ?

@DongjiGao (Contributor, Author) commented Jan 14, 2020

> Is any of the translation stuff somehow baked into this script, i.e. are there aspects that are specific to translation? If not, I might want you to call it s5c or whatever the next number is, but you should clarify how it differs. E.g. did we do something with subwords in s5b? And how do the WERs differ?

There's no translation-specific stuff, so I think I should call it s5d (s5b uses a graphemic lexicon, and s5c is the subword version of s5b).
s5c (subword) is slightly worse than s5b.
s5d uses more training data from GALE than s5b, and the WER decreases from 16.47 to 14.89.
I also ran a subword version of s5d; it was slightly worse than the word-based system after RNNLM rescoring, so I did not check it in.
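For context, a quick sketch of the improvement implied by the WER numbers quoted above (the variable names here are just illustrative, not from the recipe):

```python
# Gain of s5d over s5b, using the WERs reported in this thread.
wer_s5b = 16.47  # s5b (graphemic word lexicon) WER, in percent
wer_s5d = 14.89  # s5d (more GALE training data) WER, in percent

absolute_gain = wer_s5b - wer_s5d
relative_gain = 100.0 * absolute_gain / wer_s5b

print(f"absolute gain: {absolute_gain:.2f} WER points")
print(f"relative gain: {relative_gain:.1f}%")
```

So the extra training data buys roughly a 1.6-point absolute (about 10% relative) WER reduction.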

@danpovey (Contributor)

Looks OK to me. Do you think it needs testing before merge?

@DongjiGao (Contributor, Author)

> Looks OK to me. Do you think it needs testing before merge?

I have already run some tests. I think it's ready to be merged.

@danpovey (Contributor)

Thanks a lot!! Merging.

@danpovey danpovey merged commit f196922 into kaldi-asr:master Jan 14, 2020
wonkyuml pushed a commit to wonkyuml/kaldi that referenced this pull request Jan 16, 2020
