Introduction to CELP Coding
Speex is based on CELP, which stands for Code Excited Linear Prediction. This section attempts to introduce the principles behind CELP, so if you are already familiar with CELP, you can safely skip to section 7. The CELP technique is based on three ideas:
- The use of a linear prediction (LP) model to model the vocal tract
- The use of (adaptive and fixed) codebook entries as input (excitation) of the LP model
- The search performed in closed-loop in a "perceptually weighted domain"
This section describes the basic ideas behind CELP. Note that it's still incomplete.
Linear Prediction (LPC)
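Linear prediction is at the base of many speech coding techniques, including CELP. The idea behind it is to predict the signal $x[n]$ using a linear combination of its past samples:
$$y[n] = \sum_{i=1}^{N} a_i x[n-i]$$
where $y[n]$ is the linear prediction of $x[n]$. The prediction error is thus given by:
$$e[n] = x[n] - y[n] = x[n] - \sum_{i=1}^{N} a_i x[n-i]$$
The goal of the LPC analysis is to find the best prediction coefficients $a_i$ which minimize the quadratic error function (summed over the $L$ samples of the analysis frame):
$$E = \sum_{n=0}^{L-1} \left(e[n]\right)^2 = \sum_{n=0}^{L-1} \left(x[n] - \sum_{i=1}^{N} a_i x[n-i]\right)^2$$
That can be done by making all derivatives $\partial E / \partial a_i$ equal to zero:
$$\frac{\partial E}{\partial a_i} = \frac{\partial}{\partial a_i} \sum_{n=0}^{L-1} \left(x[n] - \sum_{j=1}^{N} a_j x[n-j]\right)^2 = 0$$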
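For reference, the differentiation can be written out as follows. This is a sketch that assumes $x[n] = 0$ outside the analysis frame and introduces the shorthand $R(m) = \sum_n x[n] x[n-m]$ for the auto-correlation at lag $m$:
$$\frac{\partial E}{\partial a_i} = -2 \sum_{n} x[n-i] \left(x[n] - \sum_{j=1}^{N} a_j x[n-j]\right) = 0 \quad\Longrightarrow\quad \sum_{j=1}^{N} a_j R(|i-j|) = R(i), \qquad i = 1, \dots, N$$
These $N$ linear equations in the $N$ unknowns $a_j$ are usually called the normal equations.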
The $a_i$ filter coefficients are computed using the Levinson-Durbin algorithm, which starts from the auto-correlation $R(m)$ of the signal $x[n]$:
$$R(m) = \sum_{n=0}^{L-1} x[n] x[n-m]$$
For an order $N$ filter, we have:
$$\mathbf{R} = \begin{bmatrix} R(0) & R(1) & \cdots & R(N-1) \\ R(1) & R(0) & \cdots & R(N-2) \\ \vdots & \vdots & \ddots & \vdots \\ R(N-1) & R(N-2) & \cdots & R(0) \end{bmatrix}$$
$$\mathbf{r} = \begin{bmatrix} R(1) \\ R(2) \\ \vdots \\ R(N) \end{bmatrix}$$
The filter coefficients $a_i$ are found by solving the system $\mathbf{R}\mathbf{a} = \mathbf{r}$. The Levinson-Durbin algorithm reduces the cost of solving this system from $\mathcal{O}(N^3)$ to $\mathcal{O}(N^2)$ by exploiting the fact that the matrix $\mathbf{R}$ is Toeplitz and Hermitian. Also, it can be proven that all the roots of $A(z)$ are within the unit circle, which means that $1/A(z)$ is always stable. This is in theory; in practice, because of finite precision, there are two commonly used techniques to make sure we have a stable filter. First, we multiply $R(0)$ by a number slightly above one (such as 1.0001), which is equivalent to adding noise to the signal. Also, we can apply a window to the auto-correlation, which is equivalent to filtering in the frequency domain, reducing sharp resonances.
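The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Speex implementation: the function names, the Gaussian shape of the lag window and its `lag_sigma` parameter are assumptions made for the example. Samples before the start of the frame are treated as zero.

```python
import math


def autocorrelation(x, order):
    """R(m) = sum_n x[n] * x[n-m] for m = 0..order (x is zero before the frame)."""
    L = len(x)
    return [sum(x[n] * x[n - m] for n in range(m, L)) for m in range(order + 1)]


def condition(R, noise_floor=1.0001, lag_sigma=0.0):
    """Apply the two stabilising tricks: lag window, then scale R(0).

    The Gaussian lag window and lag_sigma are illustrative assumptions.
    """
    out = [r * math.exp(-0.5 * (lag_sigma * m) ** 2) for m, r in enumerate(R)]
    out[0] *= noise_floor
    return out


def levinson_durbin(R, order):
    """Solve the Toeplitz system R a = r in O(N^2); returns [a_1, ..., a_N]."""
    a = [0.0] * (order + 1)      # a[0] is unused, a[i] holds a_i
    err = R[0]                   # prediction error energy at order 0
    for i in range(1, order + 1):
        if err <= 0.0:
            break                # degenerate frame (e.g. all zeros)
        # Reflection coefficient for order i
        k = (R[i] - sum(a[j] * R[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k       # error energy shrinks at every order
    return a[1:]


# Usage: order-10 analysis of a synthetic 160-sample frame
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(160)]
R = condition(autocorrelation(frame, 10), lag_sigma=0.01)
a = levinson_durbin(R, 10)
```

Each pass of the outer loop extends the order-(i-1) solution to order i using a single reflection coefficient, which is where the $\mathcal{O}(N^2)$ cost comes from.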
The linear prediction model represents each speech sample as a linear combination of past samples, plus an error signal called the excitation (or residual):
$$x[n] = \sum_{i=1}^{N} a_i x[n-i] + e[n]$$
In the z-domain, this can be expressed as
$$X(z) = \frac{1}{A(z)} E(z)$$
where $A(z)$ is defined as
$$A(z) = 1 - \sum_{i=1}^{N} a_i z^{-i}$$
We usually refer to $A(z)$ as the analysis filter and $1/A(z)$ as the synthesis filter. The whole process is called short-term prediction, as it predicts the signal $x[n]$ using only the $N$ past samples, where $N$ is usually around 10.
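As an illustration of the analysis and synthesis filters, the sketch below computes the residual from a signal and then rebuilds the signal from the residual. The function names are hypothetical, the list `a` holds $a_1, \dots, a_N$, and both filters start from a zero state, which is a simplification.

```python
def analysis_filter(x, a):
    """Apply A(z): e[n] = x[n] - sum_i a_i * x[n-i] (zero initial state)."""
    N = len(a)
    return [x[n] - sum(a[i] * x[n - 1 - i] for i in range(min(N, n)))
            for n in range(len(x))]


def synthesis_filter(e, a):
    """Apply 1/A(z): x[n] = e[n] + sum_i a_i * x[n-i] (zero initial state)."""
    N = len(a)
    x = []
    for n in range(len(e)):
        x.append(e[n] + sum(a[i] * x[n - 1 - i] for i in range(min(N, n))))
    return x


# Usage: the synthesis filter undoes the analysis filter
signal = [1.0, 0.5, -0.25, 0.75, 0.1]
coeffs = [0.9, -0.2]
residual = analysis_filter(signal, coeffs)
rebuilt = synthesis_filter(residual, coeffs)
```

Up to rounding, `rebuilt` equals `signal`, since the two filters are exact inverses of each other.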
Because LPC coefficients have very little robustness to quantization, they are converted to Line Spectral Pair (LSP) coefficients, which behave much better under quantization; in particular, it is easy to keep the filter stable.
Pitch Prediction
During voiced segments, the speech signal is approximately periodic, so it is possible to take advantage of that property by approximating the excitation signal $e[n]$ by a gain times the past of the excitation:
$$e[n] \simeq p[n] = \beta e[n-T]$$
where $T$ is the pitch period and $\beta$ is the pitch gain. We call that long-term prediction since the excitation is predicted from $e[n-T]$ with $T \gg N$.
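One simple way to estimate the pitch period and pitch gain is an open-loop least-squares search over candidate lags, sketched below. This is an assumption made for illustration and not necessarily how Speex or any given codec does it; `pitch_search` and its parameters are hypothetical names. For each lag the optimal gain is the correlation divided by the energy of the delayed excitation.

```python
def pitch_search(e, start, length, t_min, t_max):
    """Return (T, beta) minimising sum (e[n] - beta * e[n-T])^2 over one frame.

    The frame covers e[start : start + length]; lags that would reach
    before the beginning of e are skipped.
    """
    best_T, best_beta, best_score = None, 0.0, 0.0
    frame = range(start, start + length)
    for T in range(t_min, t_max + 1):
        if start - T < 0:
            break
        corr = sum(e[n] * e[n - T] for n in frame)
        energy = sum(e[n - T] ** 2 for n in frame)
        if energy <= 0.0 or corr <= 0.0:
            continue
        # Error reduction obtained with the optimal gain for this lag
        score = corr * corr / energy
        if score > best_score:
            best_T, best_beta, best_score = T, corr / energy, score
    return best_T, best_beta


# Usage: a synthetic excitation with period 5 that decays by 0.8 per period
excitation = [1.0, -2.0, 3.0, 0.5, -1.0]
for n in range(5, 30):
    excitation.append(0.8 * excitation[n - 5])
T, beta = pitch_search(excitation, start=15, length=10, t_min=3, t_max=9)
```

The lag range is kept below twice the true period here because a multiple of the pitch period predicts a perfectly periodic signal equally well.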
Innovation Codebook
The final excitation $e[n]$ will be the sum of the pitch prediction and an innovation signal $c[n]$ taken from a fixed codebook, hence the name Code Excited Linear Prediction. The final excitation is given by:
$$e[n] = p[n] + c[n] = \beta e[n-T] + c[n]$$
The quantization of $c[n]$ is where most of the bits in a CELP codec are allocated. It represents the information that couldn't be obtained either from linear prediction or pitch prediction. In the z-domain we can represent the final signal $X(z)$ as
$$X(z) = \frac{C(z)}{A(z)\left(1 - \beta z^{-T}\right)}$$
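Put together, a decoder can be sketched as two cascaded recursions: the long-term (pitch) predictor builds the excitation from the innovation, and the short-term synthesis filter turns the excitation into the signal. The code below is a minimal sketch with hypothetical function names; it assumes at least `T` samples of past excitation are available and starts the synthesis filter from a zero state.

```python
def build_excitation(past_e, c, T, beta):
    """e[n] = beta * e[n-T] + c[n], continuing from the past excitation."""
    e = list(past_e)
    start = len(e)
    for n in range(len(c)):
        e.append(beta * e[start + n - T] + c[n])
    return e[start:]


def synthesize(e, a):
    """Apply 1/A(z): x[n] = e[n] + sum_i a_i * x[n-i] (zero initial state)."""
    x = []
    for n in range(len(e)):
        x.append(e[n] + sum(a[i] * x[n - 1 - i] for i in range(min(len(a), n))))
    return x


# Usage: response to a single innovation pulse with T = 2, beta = 0.5, a_1 = 0.5
innovation = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
e = build_excitation([0.0, 0.0], innovation, T=2, beta=0.5)
x = synthesize(e, [0.5])
```

The pulse repeats every `T` samples with decreasing amplitude in `e`, and each repetition is then smeared by the synthesis filter in `x`.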
Analysis-by-Synthesis and Error Weighting
Most (if not all) modern audio codecs attempt to "shape" the noise so that it appears mostly in the frequency regions where the ear cannot detect it. For example, the ear is more tolerant to noise in parts of the spectrum that are louder and vice versa. That's why instead of minimizing the simple quadratic error
$$E = \sum_{n} \left(x[n] - \bar{x}[n]\right)^2$$
where $\bar{x}[n]$ is the encoded signal, we minimize the error for the perceptually weighted signal
$$X_w(z) = W(z) X(z)$$
where $W(z)$ is the weighting filter, usually of the form
$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)} \qquad (1)$$
with control parameters $\gamma_1 > \gamma_2$. If the noise is white in the perceptually weighted domain, then in the signal domain its spectral shape will be of the form
$$A_{noise}(z) = \frac{1}{W(z)} = \frac{A(z/\gamma_2)}{A(z/\gamma_1)}$$
If a filter $A(z)$ has (complex) poles at $p_i$ in the $z$-plane, the filter $A(z/\gamma)$ will have its poles at $p'_i = \gamma p_i$, making it a flatter version of $A(z)$.
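In terms of coefficients, replacing $z$ by $z/\gamma$ in $A(z) = 1 - \sum_i a_i z^{-i}$ simply scales each $a_i$ by $\gamma^i$. The sketch below uses that to apply the weighting filter of equation (1). The function names are hypothetical, the filter starts from a zero state, and the values 0.9 and 0.6 in the usage lines are illustrative choices rather than values taken from this section.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a_i is replaced by a_i * gamma**i."""
    return [a_i * gamma ** (i + 1) for i, a_i in enumerate(a)]


def weighting_filter(x, a, gamma1, gamma2):
    """Apply W(z) = A(z/gamma1) / A(z/gamma2) with zero initial state."""
    num = bandwidth_expand(a, gamma1)   # numerator A(z/gamma1)
    den = bandwidth_expand(a, gamma2)   # denominator A(z/gamma2)
    y = []
    for n in range(len(x)):
        fir = x[n] - sum(num[i] * x[n - 1 - i] for i in range(min(len(num), n)))
        iir = sum(den[i] * y[n - 1 - i] for i in range(min(len(den), n)))
        y.append(fir + iir)
    return y


# Usage with illustrative control parameters (gamma1 > gamma2)
lpc = [1.2, -0.6]
weighted = weighting_filter([1.0, 0.0, 0.0, 0.0], lpc, gamma1=0.9, gamma2=0.6)
```

With `gamma1 == gamma2` the numerator and denominator cancel and the filter leaves the signal unchanged, which is a quick sanity check on the implementation.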
Analysis-by-synthesis refers to the fact that when trying to find the best pitch parameters ($T$, $\beta$) and innovation signal $c[n]$, we do not work by making the excitation $e[n]$ as close as possible to the original one (which would be simpler), but apply the synthesis (and weighting) filter and try making $X_w(z)$ as close to the original as possible.
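A toy version of such a closed-loop search is sketched below. It is a simplified illustration under stated assumptions, not the search used by any particular codec: `h` stands for the (truncated) impulse response of the combined synthesis and weighting filter, the target is the weighted original with any pitch contribution already removed, the codebook is a plain list of candidate vectors, and no gain is searched.

```python
def convolve(c, h):
    """Zero-state response y[n] = sum_k h[k] * c[n-k], truncated to len(c)."""
    return [sum(h[k] * c[n - k] for k in range(min(len(h), n + 1)))
            for n in range(len(c))]


def search_codebook(target, codebook, h):
    """Return (index, error) of the entry whose filtered version best matches target."""
    best_index, best_err = -1, float("inf")
    for index, c in enumerate(codebook):
        y = convolve(c, h)                      # synthesize the candidate
        err = sum((t - v) ** 2 for t, v in zip(target, y))
        if err < best_err:
            best_index, best_err = index, err
    return best_index, best_err


# Usage: the target was produced by entry 1, so the search should recover it
h = [1.0, 0.5, 0.25]
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, -1.0]]
target = convolve(codebook[1], h)
index, err = search_codebook(target, codebook, h)
```

The error is measured after filtering every candidate, which is what makes the search analysis-by-synthesis rather than a direct match in the excitation domain.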
References:
1 Encyclopedia summary: https://zh.wikipedia.org/wiki/%E7%A0%81%E6%BF%80%E5%8A%B1%E7%BA%BF%E6%80%A7%E9%A2%84%E6%B5%8B
2 Detailed introduction: http://ntools.net/arc/Documents/speex/manual/node8.html
