README.md (4 additions & 4 deletions)
@@ -660,10 +660,10 @@ opt_nesterov=False
 
 6. Run the experiment with:
 ```
-python run_exp.sh cfg/myDNN_exp.cfg
+python run_exp.py cfg/myDNN_exp.cfg
 ```
 
-7. To debug the model you can first take a look at the standard output. The config file is automatically parsed by the *run_exp.sh* and it raises errors in case of possible problems. You can also take a look into the *log.log* file to see additional information on the possible errors.
+7. To debug the model you can first take a look at the standard output. The config file is automatically parsed by the *run_exp.py* and it raises errors in case of possible problems. You can also take a look into the *log.log* file to see additional information on the possible errors.
 
 
 When implementing a new model, an important debug test consists of doing an overfitting experiment (to make sure that the model is able to overfit a tiny dataset). If the model is not able to overfit, it means that there is a major bug to solve.
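
The overfitting check described in the README text above can be sketched in a few lines of PyTorch. This is a minimal illustration and not PyTorch-Kaldi code: the function name `overfit_tiny_dataset`, the small MLP, the 8 random examples, and the Adam settings are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn


def overfit_tiny_dataset(n_samples=8, n_features=10, n_classes=4,
                         n_steps=500, seed=0):
    """Train a small MLP on a handful of random examples and report the fit.

    All sizes and hyperparameters are illustrative assumptions.
    """
    torch.manual_seed(seed)
    # Random inputs with random labels: there is nothing to generalize from,
    # so the only way to lower the loss is to memorize the examples.
    x = torch.randn(n_samples, n_features)
    y = torch.randint(0, n_classes, (n_samples,))

    model = nn.Sequential(
        nn.Linear(n_features, 64),
        nn.ReLU(),
        nn.Linear(64, n_classes),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()

    # Full-batch training on the tiny dataset.
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()

    # Evaluate on the same examples the model was trained on.
    with torch.no_grad():
        logits = model(x)
        final_loss = loss_fn(logits, y).item()
        accuracy = (logits.argmax(dim=1) == y).float().mean().item()
    return final_loss, accuracy


final_loss, accuracy = overfit_tiny_dataset()
print(f"final loss: {final_loss:.4f}, training accuracy: {accuracy:.2f}")
```

If a model cannot drive the training accuracy to 1.0 on a set this small, the data pipeline, the loss, or the optimizer setup are reasonable first places to look.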
@@ -688,7 +688,7 @@ PyTorch-Kaldi can be used with any speech dataset. To use your own dataset, the
 1. Run the Kaldi recipe with your dataset. Please, see the Kaldi website to have more information on how to perform data preparation.
 2. Compute the alignments on training, validation, and test data.
 3. Write a PyTorch-Kaldi config file *$cfg_file*.
-4. Run the config file with ```python run_exp.sh $cfg_file```.
+4. Run the config file with ```python run_exp.py $cfg_file```.
 
 ## How can I plug-in my own features
 The current version of PyTorch-Kaldi supports input features stored with the Kaldi ark format. If the user wants to perform experiments with customized features, the latter must be converted into the ark format. Take a look into the Kaldi-io-for-python git repository (https://github.com/vesis84/kaldi-io-for-python) for a detailed description about converting numpy arrays into ark files.
@@ -807,7 +807,7 @@ To use this model for speech recognition on TIMIT, to the following steps:
 2. Save the raw waveform into the Kaldi ark format. To do it, you can use the save_raw_fea.py utility in our repository. The script saves the input signals into a binary Kaldi archive, keeping the alignments with the pre-computed labels. You have to run it for all the data chunks (e.g., train, dev, test). It can also specify the length of the speech chunk (*sig_wlen=200 # ms*) composing each frame.
 3. Open the *cfg/TIMIT_baselines/TIMIT_SincNet_raw.cfg*, change your paths, and run:
 4. With this architecture, we have obtained a **PER(%)=17.1%**. A standard CNN fed the same features gives us a **PER(%)=18.%**. Please, see [here](https://bitbucket.org/mravanelli/pytorch-kaldi-exp-timit/src/master/) to take a look into our results. Our results on SincNet outperforms results obtained with MFCCs and FBANKs fed by standard feed-forward networks.
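
To make the role of the 200 ms chunk length in step 2 more concrete, here is a small numpy sketch that splits a waveform into overlapping chunks. It is not taken from save_raw_fea.py: the function name `frame_signal`, the 16 kHz sample rate, and the 10 ms shift between chunks are assumptions made for the illustration; only the 200 ms window length comes from the text above.

```python
import numpy as np


def frame_signal(signal, sample_rate=16000, win_ms=200, shift_ms=10):
    """Split a 1-D waveform into overlapping chunks of win_ms milliseconds.

    sample_rate and shift_ms are assumptions made for this sketch.
    """
    win = int(sample_rate * win_ms / 1000)      # samples per chunk
    shift = int(sample_rate * shift_ms / 1000)  # samples between chunk starts
    if len(signal) < win:
        # Signal shorter than one window: no complete chunk available.
        return np.empty((0, win), dtype=signal.dtype)
    n_frames = 1 + (len(signal) - win) // shift
    # Row i holds samples [i * shift, i * shift + win).
    idx = np.arange(win)[None, :] + shift * np.arange(n_frames)[:, None]
    return signal[idx]


one_second = np.random.randn(16000).astype(np.float32)  # 1 s at 16 kHz
frames = frame_signal(one_second)
print(frames.shape)  # (81, 3200)
```

Under these assumptions each row is one 200 ms chunk (3200 samples), and consecutive rows start 160 samples apart.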