
Commit 5cbd2be (parent decd133): update README.md

README.md: 32 additions & 0 deletions
@@ -3,6 +3,38 @@
This is a restructured and rewritten version of [bshall/UniversalVocoding](https://github.com/bshall/UniversalVocoding).
The main difference here is that the model is turned into a [TorchScript](https://pytorch.org/docs/stable/jit.html) module during training and can be loaded for inference anywhere without Python dependencies.
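As a hedged sketch of the TorchScript workflow this refers to (the `Toy` module below is an invented stand-in for illustration, not the actual vocoder), compiling, saving, and reloading a model looks roughly like:

```python
import torch

class Toy(torch.nn.Module):
    """Invented stand-in for the vocoder, for illustration only."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2.0

# Compile the module to TorchScript and serialize it; the saved file can
# later be loaded (even from C++) without the Python class definition.
scripted = torch.jit.script(Toy())
scripted.save("toy_vocoder.pt")
reloaded = torch.jit.load("toy_vocoder.pt")
```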

### Preprocess training data

Multiple directories containing audio files can be processed at the same time.

```bash
python preprocess.py VCTK-Corpus LibriTTS/train-clean-100 preprocessed
```
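A preprocessing script along these lines typically starts by walking every given corpus directory for audio files; a minimal sketch of that discovery step (the `.wav`/`.flac` extension set is an assumption for illustration, not taken from the repo):

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".wav", ".flac"}  # assumed extension set, for illustration

def collect_audio_files(*dirs):
    """Recursively gather audio files from several corpus directories."""
    files = []
    for d in dirs:
        files.extend(
            p for p in sorted(Path(d).rglob("*"))
            if p.suffix.lower() in AUDIO_EXTENSIONS
        )
    return files
```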
### Train from scratch

```bash
python train.py preprocessed preprocessed/metadata.json
```
### Generate waveforms

You can load a trained model anywhere and generate multiple waveforms in parallel.

```python
import torch

vocoder = torch.jit.load("vocoder.pt")
mels = [
    torch.randn(100, 80),
    torch.randn(200, 80),
    torch.randn(300, 80),
]
wavs = vocoder.generate(mels)
```

Empirically, with the default architecture, you can generate 100 samples at the same time on an NVIDIA GTX 1080 Ti.
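If you have more utterances than fit in one call, a simple chunking loop keeps each batch under that limit (the batch size of 100 comes from the note above; this helper is a sketch, not part of the repo):

```python
def batched(items, batch_size=100):
    """Yield successive fixed-size chunks from a list of mel spectrograms,
    so each vocoder call stays within the empirical batch limit."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Each chunk can then be passed to `vocoder.generate` in turn, and the resulting waveform lists concatenated.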
### References

- [Towards achieving robust universal neural vocoding](https://arxiv.org/abs/1811.06292)
