Espnet ASR Demo & Quantization Document
- This document describes how to run the Espnet (v1) ASR demo and how to quantize its model
- Test environment:
| Ubuntu | CUDA | GCC |
|---|---|---|
| 21.04 | 11.6 | 11.2 |
Installation
Note: Please follow the original installation guide provided by Espnet. The notes below cover only the points that need extra attention.
Requirements
| sox | sndfile | ffmpeg | flac |
|---|---|---|---|
| installed | installed | not installed | not installed |
Install Kaldi
Follow the installation guide exactly.
Notes:
- The Kaldi installation has two parts: 1. tools installation 2. src installation. Make sure to install both, in that order
- Once installed, many `.o` binary files can be found in directories such as: `<kaldi-root>/{featbin,fgmmbin,fstbin,etc.}`
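One way to confirm that both build stages finished is to count the `.o` files in a few of those directories. The sketch below is a minimal check and not part of Kaldi: the function name `count_kaldi_objects` is hypothetical, and it assumes the listed directories sit directly under the root path passed in (adjust the root if your tree is laid out differently).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): print the number of .o files in each
# listed subdirectory of a Kaldi tree; return non-zero if any directory has none.
#   $1:   root directory that contains the subdirectories
#   $2..: subdirectory names to inspect
count_kaldi_objects() {
    local root="$1"; shift
    local rc=0 dir count
    for dir in "$@"; do
        # find prints one line per match; a missing directory simply yields 0.
        count=$(find "${root}/${dir}" -maxdepth 1 -type f -name '*.o' 2>/dev/null | wc -l)
        count=$((count))  # normalize any whitespace padding from wc
        echo "${dir}: ${count}"
        [ "${count}" -gt 0 ] || rc=1
    done
    return "${rc}"
}

# Example call; the root path is a placeholder, so it is left commented out.
# count_kaldi_objects "/path/to/kaldi-root" featbin fgmmbin fstbin
```

A non-zero return value means at least one directory was empty or missing, which usually points to an incomplete src build.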
Install Espnet
Follow the installation guide exactly.
Notes:
- Kaldi should be linked into `<espnet>/tools` (check guide)
- `Option A) Setup Anaconda environment` is chosen in this document, so a virtual environment `espnet` is created with `python==3.8`
- The current CUDA version is 11.6, which is not compatible with pytorch 1.10.1, so `espnet` should be installed by `$ make TH_VERSION=1.10.1 CUDA_VERSION=11.3`, which specifies the versions of pytorch and CUDA
- Custom tools in `[Optional] Custom tool installation` are not installed
- Install chainer in the `espnet` conda environment by `pip install chainer==6.0.0` (`cupy` is not installed due to some errors)
Run ASR Demo
Notes:
- Prepare the audio file, e.g. the `test.wav` file in `espnet/utils`, and put the `.wav` file in `espnet/egs/tedlium2/asr1`
- Perform decoding
  a. `cd espnet/egs/tedlium2/asr1` and `source ./path.sh`
  b. `recog_wav.sh --models <downloaded-model> test.wav`
Notes: By default the model is downloaded with the `gdown` package, which can fail with a timeout error if the network connection drops. In this case, the model file, e.g. `model.streaming.v1.tar.gz`, needs to be downloaded manually from Google Drive (see the Espnet readme).
Then modify the `download_from_google_drive.sh` file in the `espnet/utils` directory as follows:
a. Create a variable `manual_download_dir` that specifies the path of the downloaded model file, e.g. `manual_download_dir="/home/glinttsd/espnet/egs/tedlium2/asr1/model.streaming.v1.tar.gz"`
b. Replace the code in lines 46-47 with the following, which skips the download when the local file exists and decompresses the model file directly:

    if [ -f "$manual_download_dir" ]
    then
        # A manually downloaded archive exists: use it and skip gdown
        echo "File download locally"
        decompress "${manual_download_dir}" "${download_dir}"
    else
        # No local archive: download from Google Drive as before
        echo "File download from url: ${share_url}"
        gdown --id "${file_id}" -O "${tmp}"
        decompress "${tmp}" "${download_dir}"
    fi
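The local-file fallback can be tried in isolation before editing the real script. The sketch below is a self-contained illustration and not the contents of `download_from_google_drive.sh`: `fetch_model` is a hypothetical wrapper, `decompress` here is a simplified stand-in that only handles `.tar.gz` archives, and the `gdown` call is the one quoted above.

```shell
#!/usr/bin/env bash
# Simplified stand-in for the script's decompress function: .tar.gz only.
decompress() {
    local archive="$1" dest="$2"
    mkdir -p "${dest}"
    tar -xzf "${archive}" -C "${dest}"
}

# Hypothetical wrapper: prefer a manually downloaded archive, else use gdown.
#   $1: path to a manually downloaded archive (may not exist)
#   $2: Google Drive file id, used only when the local archive is missing
#   $3: directory to extract into
fetch_model() {
    local manual_download_dir="$1" file_id="$2" download_dir="$3"
    if [ -f "${manual_download_dir}" ]; then
        echo "Using local file: ${manual_download_dir}"
        decompress "${manual_download_dir}" "${download_dir}"
    else
        local tmp rc=0
        tmp=$(mktemp "${TMPDIR:-/tmp}/model.XXXXXX")
        echo "Downloading file id: ${file_id}"
        if gdown --id "${file_id}" -O "${tmp}"; then
            decompress "${tmp}" "${download_dir}" || rc=$?
        else
            rc=1
        fi
        rm -f "${tmp}"  # remove the temporary download in every case
        return "${rc}"
    fi
}

# Example call; the paths and the file id are placeholders.
# fetch_model "$HOME/model.streaming.v1.tar.gz" "<file-id>" "./downloads"
```

Passing the archive path as an argument, rather than hard-coding it in the script, keeps the same function usable for other models.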
Model Quantization
Espnet provides dynamic quantization through the PyTorch API.
To enable dynamic quantization, add the following lines to the `espnet/utils/recog_wav.sh` file at lines 248-249:

    --quantize-asr-model True \
    --quantize-dtype "qint8" \

Decoding can now be performed as described in the previous section.
More usage can be found here
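To switch quantization on and off without editing the script each time, the two flags can be assembled conditionally. This is a minimal sketch under assumptions: `build_quant_args` and the `QUANT_ARGS` array are hypothetical names that do not exist in Espnet, and only the two flags shown above are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill the QUANT_ARGS array with the quantization flags
# when the first argument is "true", and leave it empty otherwise.
#   $1: "true" to enable dynamic quantization
#   $2: quantized dtype (defaults to qint8, the value used in this document)
build_quant_args() {
    local enable="$1" dtype="${2:-qint8}"
    QUANT_ARGS=()
    if [ "${enable}" = "true" ]; then
        QUANT_ARGS=(--quantize-asr-model True --quantize-dtype "${dtype}")
    fi
}

build_quant_args true
printf '%s\n' "${QUANT_ARGS[*]}"
```

The array could then be spliced into the decoding command as `"${QUANT_ARGS[@]}"`, so the same script runs with or without quantization.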
