Utterance-Wise Recurrent Dropout And Iterative Speaker Adaptation For Robust Monaural Speech Recognition
单声道语音识别的逐句循环Dropout迭代说话人自适应
WRBN(wide residual BLSTM network,宽残差双向长短时记忆网络)
[2] J. Heymann, L. Drude, and R. Haeb-Umbach, "Wide residual blstm network with discriminative speaker adaptation for robust speech recognition," submitted to the CHiME, vol. 4, 2016.
reverberation,n. [声] 混响;反射;反响;回响
CLDNN(convolutional, long short-term memory, fully connected deep neural networks,卷积-长短时记忆-全连接深度神经网络)
[1] T.N. Sainath, O. Vinyals, A. Senior, and H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4580–4584.
speech separation,语音分离,将多说话人同时说话的语句分离为各个说话人独立说话的语句。
在LSTM训练中使用Dropout能有效缓解过拟合。
[3] G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R.R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.
在输出门、遗忘门以及输入门使用基于语句采样丢帧Mask能取得最优结果(Cheng dropout)。
[7] G. Cheng, V. Peddinti, D. Povey, V. Manohar, S. Khudanpur, and Y. Yan, "An exploration of dropout with lstms," in Proceedings of Interspeech, 2017.
基于MLLR的迭代自适应方法,使用上一次迭代的解码结果来更新高斯参数。
, vol. 2, pp. 1133–1136.
近期提出了一种batch正则化说话人自适应。
[14] P. Swietojanski, J. Li, and S. Renals, "Learning hidden unit contributions for unsupervised acoustic model adaptation," IEEE/ACMTransactionsonAudio,Speech, and Language Processing, vol. 24, no. 8, pp. 1450– 1463, 2016.
本文使用了无监督的LIN说话人自适应
[11]
使用的LIN层矩阵维数为80*80,该层被三个输入特征共享(原始、delta、delta-delta)。
本文尝试使用以下两种方式进行迭代的说话人自适应:
- 在迭代时使用上一次迭代的模型生成新标签进行训练。
- 每次迭代堆叠一个额外的线性输入层(数学上,多个线性层相当于一个隐层)
传统DNN训练方式是segment-wise
实验得出,使用RNN时,Iter(迭代方案)更优;使用tri-gram时,Stack(堆叠)方案更优
Utterance-Wise Recurrent Dropout And Iterative Speaker Adaptation For Robust Monaural Speech Recognition的更多相关文章
- A Bayesian Approach to Deep Neural Network Adaptation with Applications to Robust Automatic Speech Recognition
基于贝叶斯的深度神经网络自适应及其在鲁棒自动语音识别中的应用 直接贝叶斯DNN自适应 使用高斯先验对DNN进行MAP自适应 为何贝叶斯在模型自适应中很有用? 因为自适应问题可以视为后验估计问题 ...
- Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model
DNN声学模型说话人自适应的经验性评估 年3月27日 发表于:Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Pro ...
- (zhuan) Attention in Long Short-Term Memory Recurrent Neural Networks
Attention in Long Short-Term Memory Recurrent Neural Networks by Jason Brownlee on June 30, 2017 in ...
- 论文翻译:2020_FLGCNN: A novel fully convolutional neural network for end-to-end monaural speech enhancement with utterance-based objective functions
论文地址:FLGCNN:一种新颖的全卷积神经网络,用于基于话语的目标函数的端到端单耳语音增强 论文代码:https://github.com/LXP-Never/FLGCCRN(非官方复现) 引用格式 ...
- Recurrent Neural Network[survey]
0.引言 我们发现传统的(如前向网络等)非循环的NN都是假设样本之间无依赖关系(至少时间和顺序上是无依赖关系),而许多学习任务却都涉及到处理序列数据,如image captioning,speech ...
- (zhuan) Recurrent Neural Network
Recurrent Neural Network 2016年07月01日 Deep learning Deep learning 字数:24235 this blog from: http:/ ...
- 深度自适应增量学习(Incremental Learning Through Deep Adaptation)
深度自适应增量学习(Incremental Learning Through Deep Adaptation) 2018-05-25 18:56:00 木呆呆瓶子 阅读数 10564 收藏 更多 分 ...
- Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System :: Major Project ::: Introduction
转载自:http://ganeshtiwaridotcomdotnp.blogspot.com/2010/12/text-prompted-remote-speaker.html Biometrics ...
- 论文翻译:2020_WaveCRN: An efficient convolutional recurrent neural network for end-to-end speech enhancement
论文地址:用于端到端语音增强的卷积递归神经网络 论文代码:https://github.com/aleXiehta/WaveCRN 引用格式:Hsieh T A, Wang H M, Lu X, et ...
随机推荐
- 【CF1141F1】Same Sum Blocks
题目大意:给定一个 N 个值组成的序列,求序列中区间和相同的不相交区间段数量的最大值. 题解:设 \(dp[i][j]\) 表示到区间 [i,j] 时,与区间 [i,j] 的区间和相同的不相交区间数量 ...
- loj6045 价
题目链接 思路 从源点\(S\)向每种药连一条边权为\(-p+inf\)的边.从每种药向他所需要的药材连一条边权为\(INF\)的边.从每种药材向汇点\(T\)连一条边权为\(inf\)的边. \(I ...
- MVC过滤器处理Session过期
一.自定义一个Action过滤器 public class CheckSession: ActionFilterAttribute { public override void OnActionExe ...
- 第七篇-列表式App:ListActivity及ListView
一.新建一个empty activity的项目. 二.修改MainActivity.java: extends AppCompactActivity改为extends ListActivity.注释掉 ...
- 第三十一节,目标检测算法之 Faster R-CNN算法详解
Ren, Shaoqing, et al. “Faster R-CNN: Towards real-time object detection with region proposal network ...
- app软件遵循的规范
http://www.jianshu.com/p/a2a4c18c1900 https://wenku.baidu.com/view/ecb09b07a4e9856a561252d380eb6294d ...
- js中两种定时器,setTimeout和setInterval的区别
setTimeout只在指定时间后执行一次,代码如下: <script> //定时器 异步运行 function hello(){ alert("hello&qu ...
- mongodb安装和运行
转载来源:https://blog.csdn.net/IT_wanghe/article/details/53884229 参考教程:http://www.runoob.com/mongodb/mon ...
- okhttp 内网可以有,但外网访问数据返不回来,代码一样
:1.问题点在于 下图红框里写成 text/html了,需要改成application/json,造成的问题有:unexpected end of stream 这个是406错误:加上日志之后okh ...
- (贪心 部分背包问题)悼念512汶川大地震遇难同胞——老人是真饿了 hdu2187
悼念512汶川大地震遇难同胞——老人是真饿了 http://acm.hdu.edu.cn/showproblem.php?pid=2187 Time Limit: 1000/1000 MS (Java ...