Utterance-Wise Recurrent Dropout And Iterative Speaker Adaptation For Robust Monaural Speech Recognition

单声道语音识别的逐句循环Dropout迭代说话人自适应

WRBN（wide residual BLSTM network，宽残差双向长短时记忆网络）

[2] J. Heymann, L. Drude, and R. Haeb-Umbach, "Wide residual blstm network with discriminative speaker adaptation for robust speech recognition," submitted to the CHiME, vol. 4, 2016.

reverberation，n. [声] 混响；反射；反响；回响

CLDNN（convolutional, long short-term memory, fully connected deep neural networks，卷积-长短时记忆-全连接深度神经网络）

[1] T.N. Sainath, O. Vinyals, A. Senior, and H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4580–4584.

speech separation，语音分离，将多说话人同时说话的语句分离为各个说话人独立说话的语句。

在LSTM训练中使用Dropout能有效缓解过拟合。

[3] G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R.R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.

在输出门、遗忘门以及输入门使用基于语句采样丢帧Mask能取得最优结果（Cheng dropout）。

[7] G. Cheng, V. Peddinti, D. Povey, V. Manohar, S. Khudanpur, and Y. Yan, "An exploration of dropout with lstms," in Proceedings of Interspeech, 2017.

基于MLLR的迭代自适应方法，使用上一次迭代的解码结果来更新高斯参数。

, vol. 2, pp. 1133–1136.

近期提出了一种batch正则化说话人自适应。

[14] P. Swietojanski, J. Li, and S. Renals, "Learning hidden unit contributions for unsupervised acoustic model adaptation," IEEE/ACMTransactionsonAudio,Speech, and Language Processing, vol. 24, no. 8, pp. 1450– 1463, 2016.

本文使用了无监督的LIN说话人自适应

[11]

使用的LIN层矩阵维数为80*80，该层被三个输入特征共享（原始、delta、delta-delta）。

本文尝试使用以下两种方式进行迭代的说话人自适应：

在迭代时使用上一次迭代的模型生成新标签进行训练。
每次迭代堆叠一个额外的线性输入层（数学上，多个线性层相当于一个隐层）

传统DNN训练方式是segment-wise

实验得出，使用RNN时，Iter（迭代方案）更优；使用tri-gram时，Stack（堆叠）方案更优

Utterance-Wise Recurrent Dropout And Iterative Speaker Adaptation For Robust Monaural Speech Recognition的更多相关文章

A Bayesian Approach to Deep Neural Network Adaptation with Applications to Robust Automatic Speech Recognition
基于贝叶斯的深度神经网络自适应及其在鲁棒自动语音识别中的应用直接贝叶斯DNN自适应使用高斯先验对DNN进行MAP自适应为何贝叶斯在模型自适应中很有用? 因为自适应问题可以视为后验估计问题 ...
Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model
DNN声学模型说话人自适应的经验性评估年3月27日发表于:Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Pro ...
(zhuan) Attention in Long Short-Term Memory Recurrent Neural Networks
Attention in Long Short-Term Memory Recurrent Neural Networks by Jason Brownlee on June 30, 2017 in ...
论文翻译：2020_FLGCNN: A novel fully convolutional neural network for end-to-end monaural speech enhancement with utterance-based objective functions
论文地址:FLGCNN:一种新颖的全卷积神经网络,用于基于话语的目标函数的端到端单耳语音增强论文代码:https://github.com/LXP-Never/FLGCCRN(非官方复现) 引用格式 ...
Recurrent Neural Network[survey]
0.引言我们发现传统的(如前向网络等)非循环的NN都是假设样本之间无依赖关系(至少时间和顺序上是无依赖关系),而许多学习任务却都涉及到处理序列数据,如image captioning,speech ...
(zhuan) Recurrent Neural Network
Recurrent Neural Network 2016年07月01日 Deep learning Deep learning 字数:24235 this blog from: http:/ ...
深度自适应增量学习（Incremental Learning Through Deep Adaptation）
深度自适应增量学习(Incremental Learning Through Deep Adaptation) 2018-05-25 18:56:00 木呆呆瓶子阅读数 10564 收藏更多分 ...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System :: Major Project ::: Introduction
转载自:http://ganeshtiwaridotcomdotnp.blogspot.com/2010/12/text-prompted-remote-speaker.html Biometrics ...
论文翻译：2020_WaveCRN: An efficient convolutional recurrent neural network for end-to-end speech enhancement
论文地址:用于端到端语音增强的卷积递归神经网络论文代码:https://github.com/aleXiehta/WaveCRN 引用格式:Hsieh T A, Wang H M, Lu X, et ...

随机推荐

SRM 600 div 2 T 1
贪心+枚举 #include <bits/stdc++.h> using namespace std; class TheShuttles { public: int getLeast ...
poj2259 Team Queue
吼哇,又是水题. 我本来准备开1010个queue的,但是STL容器里好像只有vector滋磁开组,于是只好数组模拟... 然后模拟过了...... #include <cstdio> # ...
Windows android SDK环境配置及判断安装成功
python 学习笔记：python例子
廖雪峰python网站 #if els # -*- coding: utf-8 -*- #list是一种有序的集合,可以随时添加和删除其中的元素. ''' classmates=['a','b','c ...
text-overflow文本溢出隐藏“...”显示
一.文本溢出省略号显示 1.文本溢出是否“...”显示属性:text-overflow:clip(不显示省略标记)/ellipsis(文本溢出时“...”显示) 定义此属性有四个必要条件:1)须有容器 ...
HTML学习笔记Day1
一.网站建设流程 1.注册域名(网址) 2.租用空间(服务器) 3.网站建设 (1)确定网站主题(项目) (2)搜集材料(项目) (3)规划网站(UI) (4)制作页面(前端--后端) 4.网络推广 ...
异步请求 ajax的使用详解
https://blog.csdn.net/happyaliceyu/article/details/52381446 可以说是很详细了,赞
CodeForces5E 环转链,dp思想
http://codeforces.com/problemset/problem/5/E 众所周知,在很久以前,在今天的 Berland 地区,居住着 Bindian 部落.他们的首都被 n 座山所环 ...
oracle中查看所有表和字段以及表注释字段注释
获取表:select table_name from user_tables; //当前用户拥有的表 select table_name from all_tables; //所有用户的表 selec ...
a标签与js的冲突
如上图,需要做一个页面,点击左边的标题,右边就显示左边标题下的子标题的集合, html代码如下: <div id="newleft"> <ul> <l ...

Utterance-Wise Recurrent Dropout And Iterative Speaker Adaptation For Robust Monaural Speech Recognition

Utterance-Wise Recurrent Dropout And Iterative Speaker Adaptation For Robust Monaural Speech Recognition的更多相关文章

随机推荐

热门专题