shithub: opus

ref: 153e43aa447d8167f2d70bcc4ea61b82d5b95129
parent: 7fdca7f01dc1c5974b38c0ad77e0174a2c010577
author: Jean-Marc Valin <jmvalin@amazon.com>
date: Thu Oct 21 06:13:45 EDT 2021

update instructions

--- a/dnn/README.md
+++ b/dnn/README.md
@@ -65,13 +65,13 @@
 
 1. Generate training data:
    ```
-   ./dump_data -train input.s16 features.f32 data.u8
+   ./dump_data -train input.s16 features.f32 data.s16
    ```
    where the first file contains 16 kHz 16-bit raw PCM audio (no header) and the other files are output files. This program makes several passes over the data with different filters to generate a large amount of training data.
 
 1. Now that you have your files, train with:
    ```
-   ./training_tf2/train_lpcnet.py features.f32 data.u8 model_name
+   ./training_tf2/train_lpcnet.py features.f32 data.s16 model_name
    ```
    and it will generate an h5 file for each iteration, with model\_name as prefix. If it stops with a
    "Failed to allocate RNN reserve space" message try specifying a smaller --batch-size for  train\_lpcnet.py.
@@ -79,14 +79,13 @@
 1. You can synthesise speech with Python and your GPU card (very slow):
    ```
    ./dump_data -test test_input.s16 test_features.f32
-   ./training_tf2/test_lpcnet.py test_features.f32 test.s16
+   ./training_tf2/test_lpcnet.py lpcnet_model_name.h5 test_features.f32 test.s16
    ```
-   Note the .h5 is hard coded in test_lpcnet.py, modify for your .h5 file.
 
 1. Or with C on a CPU (C inference is much faster):
    First extract the model files nnet_data.h and nnet_data.c
    ```
-   ./dump_lpcnet.py lpcnet15_384_10_G16_64.h5
+   ./dump_lpcnet.py lpcnet_model_name.h5
    ```
    and move the generated nnet_data.* files to the src/ directory.
    Then you just need to rebuild the software and use lpcnet_demo as explained above.
--
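Taken together, the commands this patch touches form the following end-to-end workflow (a sketch only: `input.s16`, `test_input.s16`, `model_name`, and `lpcnet_model_name.h5` are placeholder names from the README, and the `dump_data` and `lpcnet_demo` binaries must first be built from the LPCNet sources):

```shell
# Generate training data: input.s16 is headerless 16 kHz 16-bit PCM;
# features.f32 and data.s16 are output files (the data file is now
# raw 16-bit samples rather than the old .u8 format).
./dump_data -train input.s16 features.f32 data.s16

# Train; writes one .h5 checkpoint per iteration, prefixed with model_name.
# If it stops with "Failed to allocate RNN reserve space",
# retry with a smaller --batch-size.
./training_tf2/train_lpcnet.py features.f32 data.s16 model_name

# Slow GPU synthesis: after this patch the model file is a
# command-line argument instead of being hard-coded in test_lpcnet.py.
./dump_data -test test_input.s16 test_features.f32
./training_tf2/test_lpcnet.py lpcnet_model_name.h5 test_features.f32 test.s16

# Fast C inference: dump the model to nnet_data.c / nnet_data.h,
# move them into src/, then rebuild and run lpcnet_demo.
./dump_lpcnet.py lpcnet_model_name.h5
mv nnet_data.c nnet_data.h src/
```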