Speex is based on CELP, which stands for Code Excited Linear Prediction. This section attempts to introduce the principles behind CELP, so if you are already familiar with CELP, you can safely skip to section 7. The CELP technique is based on three ideas:

  1. The use of a linear prediction (LP) model to model the vocal tract
  2. The use of (adaptive and fixed) codebook entries as input (excitation) of the LP model
  3. The search performed in closed-loop in a "perceptually weighted domain"

This section describes the basic ideas behind CELP. Note that it's still incomplete.

Linear Prediction (LPC)

Linear prediction is at the base of many speech coding techniques, including CELP. The idea behind it is to predict the signal $x[n]$ using a linear combination of its past samples:

$$y[n]=\sum_{i=1}^{N}a_{i}x[n-i]$$

where $y[n]$ is the linear prediction of $x[n]$. The prediction error is thus given by:

$$e[n]=x[n]-y[n]=x[n]-\sum_{i=1}^{N}a_{i}x[n-i]$$

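The goal of the LPC analysis is to find the best prediction coefficients $a_{i}$, which minimize the quadratic error function over a frame of $L$ samples:

$$E=\sum_{n=0}^{L-1}\left(e[n]\right)^{2}=\sum_{n=0}^{L-1}\left(x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\right)^{2}$$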

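That can be done by making all derivatives $\partial E/\partial a_{k}$ equal to zero:

$$\frac{\partial E}{\partial a_{k}}=\frac{\partial}{\partial a_{k}}\sum_{n=0}^{L-1}\left(x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\right)^{2}=0,\qquad k=1,\ldots,N$$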

The $a_{i}$ filter coefficients are computed using the Levinson-Durbin algorithm, which starts from the auto-correlation $R(m)$ of the signal $x[n]$:

$$R(m)=\sum_{n=0}^{L-1}x[n]x[n-m]$$
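
As a sketch of the missing step between the derivatives and the auto-correlation, assume the frame is windowed so that $x[n]$ is zero outside $0\le n<L$ (the usual "autocorrelation method"; this assumption is not stated in the text). Differentiating the quadratic error with respect to each $a_{k}$ then gives:

$$\frac{\partial E}{\partial a_{k}}=-2\sum_{n}x[n-k]\left(x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\right)=0$$

$$\sum_{i=1}^{N}a_{i}\sum_{n}x[n-i]x[n-k]=\sum_{n}x[n]x[n-k]$$

$$\sum_{i=1}^{N}a_{i}R(|i-k|)=R(k),\qquad k=1,\ldots,N$$

These $N$ linear equations in the $N$ unknowns $a_{i}$ depend on the signal only through its auto-correlation.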

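For an order $N$ filter, we have:

$$\mathbf{R}=\begin{bmatrix}R(0) & R(1) & \cdots & R(N-1)\\ R(1) & R(0) & \cdots & R(N-2)\\ \vdots & \vdots & \ddots & \vdots\\ R(N-1) & R(N-2) & \cdots & R(0)\end{bmatrix},\qquad\mathbf{r}=\begin{bmatrix}R(1)\\ R(2)\\ \vdots\\ R(N)\end{bmatrix}$$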

The filter coefficients $a_{i}$ are found by solving the system $\mathbf{R}\mathbf{a}=\mathbf{r}$. The Levinson-Durbin algorithm reduces the cost of solving this system from $\mathcal{O}(N^{3})$ to $\mathcal{O}(N^{2})$ by exploiting the fact that the matrix $\mathbf{R}$ is Hermitian Toeplitz. Also, it can be proven that all the roots of $A(z)$ are within the unit circle, which means that $1/A(z)$ is always stable. That holds in theory; in practice, because of finite precision, two techniques are commonly used to make sure the filter is stable. First, we multiply $R(0)$ by a number slightly above one (such as 1.0001), which is equivalent to adding noise to the signal. Also, we can apply a window to the auto-correlation, which is equivalent to filtering in the frequency domain, reducing sharp resonances.
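
The recursion can be sketched in a few lines of Python. This is a minimal illustration, not the Speex implementation: the function names are hypothetical, the Gaussian lag window and its `lag_sigma` parameter are assumptions chosen only to show the idea, and the code assumes $R(0)>0$.

```python
import math


def autocorrelation(x, order):
    """R(m) = sum_n x[n] * x[n - m] for m = 0..order.

    Samples before the start of the frame are treated as zero.
    """
    return [sum(x[n] * x[n - m] for n in range(m, len(x)))
            for m in range(order + 1)]


def condition_autocorrelation(R, noise_factor=1.0001, lag_sigma=0.01):
    """Apply the two stabilizing tricks: window the lags and scale R(0).

    The Gaussian window shape and lag_sigma are illustrative assumptions.
    """
    out = [R[m] * math.exp(-0.5 * (lag_sigma * m) ** 2) for m in range(len(R))]
    out[0] *= noise_factor
    return out


def levinson_durbin(R, order):
    """Solve the Toeplitz system R a = r in O(order^2) operations.

    Returns (a, err): a[i - 1] holds a_i of y[n] = sum_i a_i x[n - i],
    and err is the remaining prediction error energy. Assumes R[0] > 0.
    """
    a = []
    err = R[0]
    for i in range(1, order + 1):
        # Reflection coefficient of stage i
        acc = R[i] - sum(a[j] * R[i - 1 - j] for j in range(i - 1))
        k = acc / err
        # Update the coefficients of the previous stage, then append a_i = k
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


# Demo: impulse response of the all-pole filter 1 / (1 - 1.3 z^-1 + 0.6 z^-2)
x = []
for n in range(300):
    sample = 1.0 if n == 0 else 0.0
    if n >= 1:
        sample += 1.3 * x[n - 1]
    if n >= 2:
        sample -= 0.6 * x[n - 2]
    x.append(sample)

a, err = levinson_durbin(autocorrelation(x, 2), 2)
print(a, err)
```

In the demo the recovered coefficients should come out very close to 1.3 and -0.6, because the whole impulse response fits inside the frame. Passing the auto-correlation through `condition_autocorrelation` first trades a little of that accuracy for robustness.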

The linear prediction model represents each speech sample as a linear combination of past samples, plus an error signal called the excitation (or residual):

$$x[n]=\sum_{i=1}^{N}a_{i}x[n-i]+e[n]$$

In the z-domain, this can be expressed as

$$X(z)=\frac{1}{A(z)}E(z)$$

where $A(z)$ is defined as

$$A(z)=1-\sum_{i=1}^{N}a_{i}z^{-i}$$

We usually refer to $A(z)$ as the analysis filter and $1/A(z)$ as the synthesis filter. The whole process is called short-term prediction because it predicts the signal $x[n]$ using only the $N$ past samples, where $N$ is usually around 10.
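
To make the roles of the two filters concrete, the following sketch applies the analysis filter to obtain the excitation and then the synthesis filter to get the signal back. The function names are hypothetical and both filters start from a zero state, which is a simplification.

```python
def analysis_filter(x, a):
    """Filter by A(z): e[n] = x[n] - sum_i a_i x[n - i]."""
    return [x[n] - sum(a[i] * x[n - 1 - i]
                       for i in range(len(a)) if n - 1 - i >= 0)
            for n in range(len(x))]


def synthesis_filter(e, a):
    """Filter by 1/A(z): x[n] = e[n] + sum_i a_i x[n - i]."""
    x = []
    for n in range(len(e)):
        x.append(e[n] + sum(a[i] * x[n - 1 - i]
                            for i in range(len(a)) if n - 1 - i >= 0))
    return x


signal = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0]
coeffs = [0.9, -0.4]          # a_1, a_2 (arbitrary values for illustration)
residual = analysis_filter(signal, coeffs)
rebuilt = synthesis_filter(residual, coeffs)
print(residual)
print(rebuilt)
```

With the same coefficients and zero initial state, the synthesis filter undoes the analysis filter, so `rebuilt` matches `signal` up to rounding.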

Because LPC coefficients have very little robustness to quantization, they are converted to Line Spectral Pair (LSP) coefficients, which behave much better under quantization; one advantage is that it is easy to keep the filter stable.

Pitch Prediction

During voiced segments, the speech signal is periodic, so it is possible to take advantage of that property by approximating the excitation signal $e[n]$ by a gain times the past of the excitation:

$$e[n]\simeq p[n]=\beta e[n-T]$$

where $T$ is the pitch period and $\beta$ is the pitch gain. We call that long-term prediction since the excitation is predicted from $e[n-T]$ with $T\gg N$.
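
One simple way to pick the pitch period and gain is an open-loop search that minimizes the squared error between the excitation and its delayed, scaled copy. The sketch below does that; it is only an illustration (a real CELP encoder typically performs this search in closed loop, in the weighted domain), and the function name and lag range are assumptions.

```python
import math


def pitch_search(e, start, length, t_min, t_max):
    """Find (T, beta) minimizing sum_n (e[n] - beta * e[n - T])^2.

    The sum runs over n = start .. start + length - 1, and `e` must
    contain at least t_max samples of history before `start`.
    """
    best_T, best_beta, best_score = None, 0.0, -1.0
    for T in range(t_min, t_max + 1):
        num = sum(e[n] * e[n - T] for n in range(start, start + length))
        den = sum(e[n - T] ** 2 for n in range(start, start + length))
        if den <= 0.0 or num <= 0.0:
            continue
        # For a given T the best gain is num / den, and the error
        # reduction obtained with that gain is num^2 / den.
        score = num * num / den
        if score > best_score:
            best_T, best_beta, best_score = T, num / den, score
    return best_T, best_beta


# Synthetic excitation: a 40-sample pattern that decays by 0.9 per period
pattern = [math.sin(0.37 * k * k) for k in range(40)]
e = [0.9 ** (n // 40) * pattern[n % 40] for n in range(200)]
T, beta = pitch_search(e, start=80, length=40, t_min=20, t_max=60)
print(T, beta)
```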

Innovation Codebook

The final excitation $e[n]$ will be the sum of the pitch prediction and an innovation signal $c[n]$ taken from a fixed codebook, hence the name Code Excited Linear Prediction. The final excitation is given by:

$$e[n]=p[n]+c[n]=\beta e[n-T]+c[n]$$

The quantization of $c[n]$ is where most of the bits in a CELP codec are allocated. It represents the information that couldn't be obtained either from linear prediction or pitch prediction. In the z-domain we can represent the final signal $X(z)$ as

$$X(z)=\frac{C(z)}{A(z)\left(1-\beta z^{-T}\right)}$$
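
On the decoder side, building the excitation amounts to running the recursion $e[n]=\beta e[n-T]+c[n]$ on top of the stored past excitation. A minimal sketch, with a hypothetical function name and toy values:

```python
def build_excitation(past_e, c, T, beta):
    """Return e[n] = beta * e[n - T] + c[n] for one frame.

    past_e holds the previous excitation samples (at least T of them).
    Appending to the same buffer lets the recursion reuse samples from
    the current frame when T is shorter than the frame.
    """
    buf = list(past_e)
    start = len(buf)
    for n in range(len(c)):
        buf.append(beta * buf[start + n - T] + c[n])
    return buf[start:]


past = [1.0, 0.0, 0.0, 0.0]        # toy excitation history
innovation = [0.0] * 8             # all-zero codebook entry
frame = build_excitation(past, innovation, T=4, beta=0.5)
print(frame)
```

With an all-zero innovation the frame contains only the decaying repetition of the past pulse. Passing the result through the synthesis filter $1/A(z)$ would then produce the output signal.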

Analysis-by-Synthesis and Error Weighting

Most (if not all) modern audio codecs attempt to "shape" the noise so that it appears mostly in the frequency regions where the ear cannot detect it. For example, the ear is more tolerant to noise in parts of the spectrum that are louder and vice versa. That's why instead of minimizing the simple quadratic error

$$E=\sum_{n}\left(x[n]-\overline{x}[n]\right)^{2}$$

where $\overline{x}[n]$ is the encoder signal, we minimize the error for the perceptually weighted signal

$$E_{w}=\sum_{n}\left(x_{w}[n]-\overline{x}_{w}[n]\right)^{2}$$

where $W(z)$ is the weighting filter, usually of the form

$$W(z)=\frac{A(z/\gamma_{1})}{A(z/\gamma_{2})}\qquad(1)$$

with control parameters $\gamma_{1}>\gamma_{2}$. If the noise is white in the perceptually weighted domain, then in the signal domain its spectral shape will be of the form

$$A_{noise}(z)=\frac{1}{W(z)}=\frac{A(z/\gamma_{2})}{A(z/\gamma_{1})}$$

If a filter $A(z)$ has (complex) poles at $p_{i}$ in the $z$-plane, the filter $A(z/\gamma)$ will have its poles at $p'_{i}=\gamma p_{i}$, making it a flatter version of $A(z)$.
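
In terms of coefficients, replacing $z$ by $z/\gamma$ simply scales $a_{i}$ by $\gamma^{i}$, since $A(z/\gamma)=1-\sum_{i}a_{i}\gamma^{i}z^{-i}$. The sketch below uses that to apply the weighting filter of equation (1). The function names are hypothetical, and the default values 0.9 and 0.6 are only example settings satisfying $\gamma_{1}>\gamma_{2}$.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z / gamma): a_i becomes a_i * gamma^i (i = 1..N)."""
    return [a_i * gamma ** (i + 1) for i, a_i in enumerate(a)]


def weighting_filter(x, a, gamma1=0.9, gamma2=0.6):
    """Apply W(z) = A(z / gamma1) / A(z / gamma2) with zero initial state."""
    num = bandwidth_expand(a, gamma1)   # FIR part, A(z / gamma1)
    den = bandwidth_expand(a, gamma2)   # recursive part, 1 / A(z / gamma2)
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i in range(len(a)):
            if n - 1 - i >= 0:
                acc -= num[i] * x[n - 1 - i]
                acc += den[i] * y[n - 1 - i]
        y.append(acc)
    return y


coeffs = [1.3, -0.6]               # a_1, a_2 (arbitrary example values)
print(bandwidth_expand(coeffs, 0.5))
print(weighting_filter([1.0, 0.0, 0.0, 0.0], coeffs))
```

Setting both parameters to the same value makes the numerator and denominator cancel, so the filter leaves the signal unchanged; that is a quick sanity check for an implementation.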

Analysis-by-synthesis refers to the fact that when trying to find the best pitch parameters ($T$, $\beta$) and innovation signal $c[n]$, we do not work by making the excitation $e[n]$ as close as possible to the original one (which would be simpler), but apply the synthesis (and weighting) filter and try making $\overline{x}_{w}[n]$ as close to the original as possible.
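
A toy version of such a closed-loop search is sketched below: every codebook entry is passed through the weighted synthesis filter, and the entry (with its optimal gain) whose filtered version best matches the weighted target is kept. Several simplifications are assumed here and are not taken from the text: $\gamma_{1}=1$, so that the combined filter $W(z)/A(z)$ reduces to the all-pole filter $1/A(z/\gamma_{2})$; zero filter memory; no pitch contribution; and a tiny made-up codebook.

```python
def synth(e, a):
    """All-pole filter 1/A(z) with zero initial state."""
    x = []
    for n in range(len(e)):
        x.append(e[n] + sum(a[i] * x[n - 1 - i]
                            for i in range(len(a)) if n - 1 - i >= 0))
    return x


def search_codebook(target, codebook, a_w):
    """Return (index, gain) minimizing ||target - gain * synth(c_k)||^2.

    a_w holds the coefficients of the weighted synthesis filter.
    """
    best_k, best_gain, best_score = -1, 0.0, -1.0
    for k, c in enumerate(codebook):
        s = synth(c, a_w)                       # synthesize the candidate
        energy = sum(v * v for v in s)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, s))
        score = corr * corr / energy            # error reduction at best gain
        if score > best_score:
            best_k, best_gain, best_score = k, corr / energy, score
    return best_k, best_gain


codebook = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [1.0, -1.0, 0.0, 0.0]]
a_w = [0.5]                                     # toy weighted-filter coefficient
target = [2.0 * v for v in synth(codebook[2], a_w)]
index, gain = search_codebook(target, codebook, a_w)
print(index, gain)
```

The comparison happens between filtered signals, never between excitations, which is exactly what distinguishes analysis-by-synthesis from directly quantizing the excitation.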

References:

1 Encyclopedia summary: https://zh.wikipedia.org/wiki/%E7%A0%81%E6%BF%80%E5%8A%B1%E7%BA%BF%E6%80%A7%E9%A2%84%E6%B5%8B
2 Detailed introduction: http://ntools.net/arc/Documents/speex/manual/node8.html
