Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System

转载自：http://ganeshtiwaridotcomdotnp.blogspot.com/2010/12/text-prompted-remote-speaker.html

Biometrics is, in the simplest definition, something you are. It is a physical characteristic unique to each individual such as fingerprint, retina, iris, speech. Biometrics has a very useful application in security; it can be used to authenticate a person’s identity and control access to a restricted area, based on the premise that the set of these physical characteristics can be used to uniquely identify individuals.

Speech signal conveys two important types of information, the primarily the speech content and on the secondary level, the speaker identity. Speech recognizers aim to extract the lexical information from the speech signal independently of the speaker by reducing the inter-speaker variability. On the other hand, speaker recognition is concerned with extracting the identity of the person speaking the utterance. So both speech recognition and speaker recognition system is possible from same voice input.

　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　Desired Output of the Combined System

Text Prompted Remote Speaker Authentication is a voice biometric system that authenticates a user before permitting the user to log into a system on the basis of the user’s input voice. It is a web application. Voice signal acquisition and feature extraction is done on the client. Training and Authentication task based on the voice feature obtained from client side is done on Server. The authentication task is based on text-prompted version of speaker recognition, which incorporates both speaker recognition and speech recognition. This joint implementation of speech and speaker recognition includes text-independent speaker recognition and speaker-independent speech recognition. Speaker Recognition verifies whether the speaker is claimed one or not while Speech Recognition verifies whether or not spoken word matches the prompted word.
The client side is realized in Adobe Flex whereas the server side is realized in Java. The communication between these two cross-platforms is made possible with the help of Blaze DS’s RPC remote object.

　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　System Architecture

Mel Filter Cepstral Coefficient (MFCC) is used as feature for both speech and speaker recognition task. We also combined energy features and delta and delta-delta features of energy and MFCC. The feature extraction module is same for both speech and speaker recognition. And these recognition systems are implemented independent of each other.

For speaker recognition, GMM(Gaussian Mixture Model) parameters for registered users and a universal background model (UBM) were trained using Expectation Maximization algorithm. Log likelihood ratio between claimed speaker and UBM were compared against threshold to verify the user.

For speech recognition, Codebook is created by k-means clustering of the all feature vector from training speech data. Vector Quantization (VQ) is used to get discrete observation sequence from input feature vector by applying distance metric to Codebook. Left to Right Discrete HMM(Hidden Markov Model) for each word is trained by Baum-Welch algorithm. Viterbi decoding is used for finding best match from the HMM models.

　　　　　　　　　　　　　　　　　　　　　　　　　　Combining speech and speaker recognition systems

Based on the speech model the system decides whether or not the uttered speech matches what was prompted to utter. Similarly, based on the speaker model, the system decides whether or not the speaker is claimed one.

Finally for verification, the score of both speaker and speech recognition is combined to get combined score, to accept or reject the user’s claim.

Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System :: Major Project ::: Introduction的更多相关文章

ACL2019: 《GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction》源码解析
论文地址:<GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction> G ...
Csharp: speech to text, text to speech in win
using System; using System.Collections.Generic; using System.ComponentModel; using System.Data; usin ...
remote: http basic: access denied fatal: authentication failed for '‘解决办法
问题描述由于这个项目代码使用https 进行clone,为什么?因为代码库ssh有问题!fuck! 导致在push代码的时候出现了 remote: http basic: access denied ...
十几个remote control software
5 alternatives to LogMeIn Free for remote PC access VNC VNC, or Virtual Network Computing, isn’t its ...
CMUSphinx Learn - Basic concepts of speech
Basic concepts of speech Speech is a complex phenomenon. People rarely understand how is it produced ...
C#中的System.Speech命名空间初探
本程序是口算两位数乘法,随机生成两个两位数,用语音读出来.然后开启语音识别,接受用户输入,知道答案正确关闭语音识别.用户说答案时,可以说“再说一遍”重复题目. 关键是GrammarBuilder和Ch ...
Using JAAS Authentication in Java Clients---weblogic document
The following topics are covered in this section: JAAS and WebLogic Server JAAS Authentication Devel ...
Install TightVNC Server in RHEL/CentOS and Fedora to Access Remote Desktops
Virtual Networking Computing (VNC) is a Kind of remote sharing system that makes it possible to take ...
Asp.Net Core Authentication Middleware And Generate Token
.mytitle { background: #2B6695; color: white; font-family: "微软雅黑", "宋体", "黑 ...

随机推荐

Python高级编程和异步IO并发编程（笔记）
一.魔法函数 # 例子 class Company(object): def __init__(self, employee_list): self.employee = employee_list ...
JSONObject例子
说起JSON,大家就谈不上陌生了,因为对于数据传输语言,各位只认json,即使有XML语言,但是各位很少用吧.我也是,但是之前用过的json转换工具各种各样,我记忆中有过GSON(google).fa ...
VOJ 1049送给圣诞夜的礼物——矩阵快速幂模板
题意顺次给出 $m$个置换,反复使用这 $m$ 个置换对一个长为 $n$ 初始序列进行操作,问 $k$ 次置换后的序列.$m<=10, k<2^31$. 题目链接分析对序列的置换可表 ...
织梦dedecms会员中心分类管理无法修改、删除分类名
member/mtypes.PHP 文件中添加另外,member/myfriend_group.php文件中也存在同样的问题,也要添加,不添加的话好友分组中也是同样问题
Oracle row_number() over() 分析函数--取出最新数据
语法格式:row_number() over(partition by 分组列 order by 排序列 desc) 一个很简单的例子 1,先做好准备 create table test1( id v ...
loj #6342. 跳一跳期望dp
令 $f[i]$ 表示已经到达 $i$ 点,为了到大 $n$ 点还期望需要的时间,随便转移一下就行. 由于本题卡空间,要记得开滚动数组. #include <bits/stdc++.h> ...
LibreOJ #528. 「LibreOJ β Round #4」求和
二次联通门 : LibreOJ #528. 「LibreOJ β Round #4」求和 /* LibreOJ #528. 「LibreOJ β Round #4」求和题目要求的是有多少对数满足他们 ...
IntelliJ IDEA 2017 JDK Tomcat Maven 配置步骤详解(一)
要求配置 Java基础环境(实际上应该在虚拟机linux环境下安装CentOS 7,但是我这电脑实在承受不住了) 安装开发工具 IntelliJ IDEA 2017.1 第一部分: JDK ...
Matlab中矩阵的数据结构
在Matlab中,矩阵默认的数据类型是double, 并不是integer. 而且奇怪的是,矩阵乘法默认按照浮点数类型进行, 整数矩阵相乘会报错.另外,可以用a= int16(A)这种形式实现数据类型 ...
《自制编程语言--基于C语言郑钢》学习笔记
<自制编程语言>学习笔记本仓库内容 <自制编程语言>源码 src/sparrow.tgz <自制编程语言>读书笔记 docs/* <自制编程语言>样章 ...

Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System :: Major Project ::: Introduction

Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System :: Major Project ::: Introduction的更多相关文章

随机推荐

热门专题