shithub: opus

Download patch

ref: ca0a43bee928fffa5057c8f9a46d584324850da5
parent: c1535c8ccfcb31008ce5528deeb2f206f95ce3c9
author: Jean-Marc Valin <jmvalin@jmvalin.ca>
date: Thu Jun 24 13:47:51 EDT 2021

Update README.md

--- a/dnn/README.md
+++ b/dnn/README.md
@@ -4,6 +4,7 @@
 
 - J.-M. Valin, J. Skoglund, [A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet](https://jmvalin.ca/papers/lpcnet_codec.pdf), *Submitted for INTERSPEECH 2019*.
 - J.-M. Valin, J. Skoglund, [LPCNet: Improving Neural Speech Synthesis Through Linear Prediction](https://jmvalin.ca/papers/lpcnet_icassp2019.pdf), *Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, arXiv:1810.11846, 2019.
+- J. Skoglund, J.-M. Valin, [Improving Opus Low Bit Rate Quality with Neural Speech Synthesis](https://jmvalin.ca/papers/opusnet.pdf), *Proc. INTERSPEECH*, arXiv:1905.04628, 2020.
 
 # Introduction
 
@@ -23,7 +24,9 @@
 make
 ```
 Note that the autogen.sh script is used when building from Git and will automatically download the latest model
-(models are too large to put in Git).
+(models are too large to put in Git). By default, LPCNet will attempt to use 8-bit dot product instructions on AVX*/NEON to
+speed up inference. To disable that (e.g. to avoid quantization effects when retraining), pass --disable-dot-product to the
+configure script.
 
 It is highly recommended to set the CFLAGS environment variable to enable AVX or NEON *prior* to running configure, otherwise
 no vectorization will take place and the code will be very slow. On a recent x86 CPU, something like
@@ -69,7 +72,7 @@
    and it will generate an lpcnet*.h5 file for each iteration. If it stops with a
    "Failed to allocate RNN reserve space" message try reducing the *batch\_size* variable in train_lpcnet.py.
 
-1. You can synthesise speech with Python and your GPU card:
+1. You can synthesise speech with Python and your GPU card (very slow):
    ```
    ./dump_data -test test_input.s16 test_features.f32
    ./src/test_lpcnet.py test_features.f32 test.s16
@@ -76,7 +79,7 @@
    ```
    Note the .h5 is hard coded in test_lpcnet.py, modify for your .h5 file.
 
-1. Or with C on a CPU:
+1. Or with C on a CPU (C inference is much faster):
    First extract the model files nnet_data.h and nnet_data.c
    ```
    ./dump_lpcnet.py lpcnet15_384_10_G16_64.h5
@@ -95,6 +98,6 @@
 # Reading Further
 
 1. [LPCNet: DSP-Boosted Neural Speech Synthesis](https://people.xiph.org/~jm/demo/lpcnet/)
-1. Sample model files:
-https://jmvalin.ca/misc_stuff/lpcnet_models/
+1. [A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet](https://people.xiph.org/~jm/demo/lpcnet_codec/)
+1. Sample model files (check compatibility): https://media.xiph.org/lpcnet/data/ 
 
--
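For reference, a from-Git build using the newly documented flag would look roughly like this sketch, combining the autogen.sh/configure/make steps and CFLAGS advice from the README with the patch's new option (the exact CFLAGS value is an assumption; adjust for your CPU):

```
# example flags only -- set before configure so vectorization is enabled
export CFLAGS='-Ofast -g -march=native'
./autogen.sh                       # from-Git builds: fetches the latest model
./configure --disable-dot-product  # skip 8-bit dot products, e.g. for retraining
make
```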