2 Stage 1: Data preparation
if [ $stage -le 1 ]; then
  # format the data as Kaldi data directories
  for part in dev-clean-2 train-clean-5; do
    # use underscore-separated names in data directories.
    local/data_prep.sh $data/LibriSpeech/$part data/$(echo $part | sed s/-/_/g)
  done
  local/prepare_dict.sh --stage 3 --nj 30 --cmd "$train_cmd" \
    data/local/lm data/local/lm data/local/dict_nosp
  utils/prepare_lang.sh data/local/dict_nosp \
    "<UNK>" data/local/lang_tmp_nosp data/lang_nosp
  local/format_lms.sh --src-dir data/lang_nosp data/local/lm
  # Create ConstArpaLm format language model for full 3-gram and 4-gram LMs
  utils/build_const_arpa_lm.sh data/local/lm/lm_tglarge.arpa.gz \
    data/lang_nosp data/lang_nosp_test_tglarge
fi

The source code is shown above. The main shell scripts it calls are each covered in a separate sub-article.
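
As a small illustration of the loop in stage 1, the sketch below reproduces only the name mapping that turns a LibriSpeech subset name into a data directory name (hyphens become underscores via sed). The helper `to_data_dir` is a hypothetical name introduced here for illustration; it is not part of the Kaldi recipe, and the snippet does not call local/data_prep.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Kaldi recipe): maps a LibriSpeech
# subset name to the data directory name used by the loop in stage 1.
to_data_dir() {
  echo "data/$(echo "$1" | sed s/-/_/g)"
}

# Same subset names as in the stage 1 loop.
for part in dev-clean-2 train-clean-5; do
  echo "$part -> $(to_data_dir "$part")"
done
# prints:
# dev-clean-2 -> data/dev_clean_2
# train-clean-5 -> data/train_clean_5
```

So the two subsets end up in data/dev_clean_2 and data/train_clean_5, which are the directory names later stages would be expected to read from.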