Reposted from: http://blog.csdn.net/wbgxx333/article/details/41019453

Deep neural networks have become the hottest topic in speech recognition. Since 2010, a large number of DNN papers have been published in this field, and many large technology companies (Google and Microsoft) have begun to use DNNs in their production systems. (Note: for Google this is presumably Google Now; for Microsoft, the speech recognition in Windows 7/8 and its SDKs.)

However, no toolkit has supported them as well as Kaldi does. Because the state of the art advances constantly, the code has to keep pace with it and its architecture has to be rethought.

Kaldi now provides two separate deep neural network code bases. One is in the source directories nnet/ and nnetbin/, contributed by Karel Vesely. The other is in nnet-cpu/ and nnet-cpubin/, contributed by Daniel Povey (this code started as a modification of Karel's earlier version and was then rewritten). Both are officially supported and both will continue to be developed.

Example scripts for the neural networks can be found in the example directories, e.g. egs/wsj/s5/, egs/rm/s5, egs/swbd/s5 and egs/hkust/s5b. Karel's example scripts are local/run_dnn.sh or local/run_nnet.sh, while Dan's example script is local/run_nnet_cpu.sh. Before running these scripts, run.sh must be run first to build the baseline systems.
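For the Resource Management setup, for example, the sequence is roughly as follows (a minimal sketch, assuming the standard s5 directory layout):

cd egs/rm/s5
./run.sh                # build the GMM baseline systems first
local/run_dnn.sh        # Karel's GPU-based DNN recipe
local/run_nnet_cpu.sh   # Dan's CPU-based recipe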

Detailed documentation for both setups will be published soon. For now, here is a summary of their most important differences:

1. Karel's code trains with GPU-accelerated single-threaded SGD, while Dan's code uses multiple threads on multiple CPUs;

2. Karel's code supports discriminative training, while Dan's code does not.

Beyond these, there are many small architectural differences.

We hope to add more documentation for these libraries; Karel's version already has some slightly outdated documentation in Karel's DNN training implementation.

A Chinese translation can be found at: http://blog.csdn.net/wbgxx333/article/details/24438405

--------------------------------------------------------------------------------

I stumbled on this last night: thanks to a contribution by Dr. Yajie Miao of CMU, we can now use deep learning modules in Kaldi again. I have not yet managed to get a DBN working with HTK; I hope to do so soon. Below is the introduction to Kaldi+PDNN from Dr. Miao's homepage. I hope everyone will contribute their own efforts too, so that we students can learn more.

    
Kaldi+PDNN -- Implementing DNN-based ASR Systems with Kaldi and PDNN
 
Overview
     
Kaldi+PDNN contains a set of fully-fledged Kaldi ASR recipes, which realize DNN-based acoustic modeling using the PDNN toolkit. The overall pipeline has 3 stages (see the sketch after this list):
    
1. The initial GMM model is built with the existing Kaldi recipes
      
2. DNN acoustic models are trained by PDNN
      
3. The trained DNN model is ported back to Kaldi for hybrid decoding or further tandem system building
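In shell terms, the whole pipeline boils down to two commands run inside a Kaldi Switchboard setup (a rough sketch; it assumes the Kaldi+PDNN scripts from the Download section below are already linked into the setup):

./run.sh       # stage 1: the existing Kaldi recipe builds the initial GMM models
./run-dnn.sh   # stages 2-3: PDNN trains the DNN, which is then ported back for hybrid decoding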

Highlights
     
Model diversity. Deep Neural Networks (DNNs); Deep Bottleneck Features (DBNFs); Deep Convolutional Networks (DCNs)
     
PDNN toolkit. Easy and fast to implement new DNN ideas
    
Open license. All the code is released under Apache 2.0, the same license as Kaldi
    
Consistency with Kaldi. Recipes follow the Kaldi style and can be integrated seamlessly with the existing setups
     
 
Release Log  
     
Dec 2013  ---  version 1.0 (the initial release)
Feb 2014  ---  version 1.1 (clean up the scripts, add the dnn+fbank recipe run-dnn-fbank.sh, enrich PDNN) 
    
Requirements
     
1. A GPU card should be available on your computing machine.
      
2. Initial model building should be run, ideally up to train_sat and align_fmllr
     
3. Software Requirements:
     
* Theano. For information about Theano installation on Ubuntu Linux, refer to this document edited by Wonkyum Lee from CMU.
* pfile_utils. This script (that is, kaldi-trunk/tools/install_pfile_utils.sh) installs pfile_utils automatically. 
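For example, pfile_utils can be installed with the script mentioned above, and the Theano GPU setup can be sanity-checked quickly (the THEANO_FLAGS test is just an illustrative assumption using the pre-1.0 Theano device syntax, not part of the recipes):

cd kaldi-trunk/tools
./install_pfile_utils.sh    # installs pfile_utils automatically
# check that Theano picks up the GPU
THEANO_FLAGS=device=gpu,floatX=float32 python -c 'import theano; print(theano.config.device)'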
     
Download
   
Kaldi+PDNN is hosted on SourceForge. You can enter your Kaldi Switchboard setup (such as egs/swbd/s5b) and download the latest version via svn:

svn co svn://svn.code.sf.net/p/kaldipdnn/code-0/trunk/pdnn pdnn
svn co svn://svn.code.sf.net/p/kaldipdnn/code-0/trunk/steps_pdnn steps_pdnn
svn co svn://svn.code.sf.net/p/kaldipdnn/code-0/trunk/run_swbd run_swbd
ln -s run_swbd/* ./

Now the new run-*.sh scripts appear in your setup. You can run them directly.
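For example (logging via tee is just a suggestion, not part of the recipes):

./run-dnn.sh 2>&1 | tee run-dnn.log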
    

Recipes
   

run-dnn.sh: DNN hybrid system over fMLLR features
  Targets: context-dependent states from the SAT model exp/tri4a
    
Input: spliced fMLLR features 
    
Network:  360:1024:1024:1024:1024:1024:${target_num}
    
Pretraining: pre-training with stacked denoising autoencoders
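The 360-dimensional input presumably comes from 40-dimensional fMLLR features spliced with 4 frames of context on each side (40 x 9 = 360); the page does not state this explicitly, and the other input sizes below (330, 378) follow the same splice arithmetic. With standard Kaldi tools, a spliced dimension can be checked like this (the ark path is a placeholder):

splice-feats --left-context=4 --right-context=4 ark:fmllr.ark ark:- | feat-to-dim ark:- -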
       
run-dnn-fbank.sh: DNN hybrid system over filterbank features
  Targets: context-dependent states from the SAT model exp/tri4a
    
Input: spliced log-scale filterbank features with cepstral mean and variance normalization
    
Network:  330:1024:1024:1024:1024:1024:${target_num}
    
Pretraining: pre-training with stacked denoising autoencoders
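The cepstral mean and variance normalization step can be reproduced with standard Kaldi tools; below is a sketch with placeholder paths. The filterbank dimension and splice width that yield the 330-dimensional input are not specified on the page (for instance, 30 dims with 5 frames of context on each side would give 30 x 11 = 330):

compute-cmvn-stats --spk2utt=ark:data/train/spk2utt scp:data/train/feats.scp ark:cmvn.ark
apply-cmvn --norm-vars=true --utt2spk=ark:data/train/utt2spk ark:cmvn.ark \
  scp:data/train/feats.scp ark:- \
  | splice-feats --left-context=5 --right-context=5 ark:- ark:spliced.ark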
      
run-bnf-tandem.sh: GMM Tandem system over Deep Bottleneck features   [ reference paper ]
  Targets: BNF network training uses context-dependent states from the SAT model exp/tri4a
    
Input: spliced fMLLR features
    
BNF Network: 360:1024:1024:1024:1024:42:1024:${target_num}
    
Pretraining: pre-training the prior-to-bottleneck layers (360:1024:1024:1024:1024) with stacked denoising autoencoders
      
run-bnf-dnn.sh: DNN hybrid system over Deep Bottleneck features   [ reference paper ]
  BNF network: trained in the same manner as in run-bnf-tandem.sh
    
Hybrid Input: spliced BNF features
    
Hybrid Network: 378:1024:1024:1024:1024:${target_num}
    
Pretraining: pre-training with stacked denoising autoencoders
     
run-cnn.sh: Hybrid system based on deep convolutional networks (DCNs)   [ reference paper ]
  The CNN recipe is not stable. Needs more investigation. 
    
Targets: context-dependent states from the SAT model exp/tri4a
    
Input: spliced log-scale filterbank features with cepstral mean and variance normalization; each frame is taken as an input feature map
    
Network: two convolution layers followed by three fully-connected layers. See this page for how to configure the network structure.
    
Pretraining: no pre-training is performed for DCNs


Experiments & Results
    
The recipes were developed on the Kaldi 110-hour Switchboard setup. This is the standard system you get if you run egs/swbd/s5b/run.sh. Our experiments follow configurations similar to those described in this paper. We use the following data partitions. The "validation" set is used to measure frame accuracy and determine termination in DNN fine-tuning.
     
training -- train_100k_nodup (110 hours)    validation -- train_dev_nodup    testing -- eval2000 (HUB5'00)

Recipe               WER% on HUB5'00-SWB    WER% on HUB5'00
run-dnn.sh                  19.3                 25.7
run-dnn-fbank.sh            21.4                 28.4
run-bnf-tandem.sh           TBA                  TBA
run-bnf-dnn.sh              TBA                  TBA
run-cnn.sh                  TBA                  TBA

Our hybrid recipe run-dnn.sh gives WERs comparable with this paper (Table 5, fMLLR features). We are therefore confident that our recipes perform comparably with the Kaldi internal DNN setups.

Want to Contribute?
  
We look forward to your contributions. Improvements can be made in the following areas (among others):
    
1. Optimization to the above recipes
2. New recipes
3. Porting the recipes to other datasets
4. Experiments and results
5. Contributions to the PDNN toolkit
 
Contact Yajie Miao (ymiao@cs.cmu.edu) if you have any questions or suggestions.

That is Dr. Miao's introduction. For details, see: http://www.cs.cmu.edu/~ymiao/kaldipdnn.html.

It is somewhat involved; I will dig into it further when I have time.
