AC3 encoder flow

AC3 encoder flow 如下：

1.input PCM

PCM在进入encoder前会使用high pass filter来移除信号的DC部分来达到更有效的编码。

2.Transient detection

Transient detection用于决定在进行MDCT时是否需要switch到short block来减少pre-echo。

Transient detection 分为以下几个步骤：

1）High pass filter.使用二阶IIR filter (cutoff of 8kHz)。

2）Block Segmentation.

一个audio block（256 sample）通过HP filter后，segmented 到hierarchical tree的不同level上。

level 1为256 sample. level 2为2个长度为128 sample 的segment.level3 为4个长度为64 sample的segment.

3)Peak detection

在hierarchical tree的各个level的各个segment上找到最大PCM的最大幅值。

P[j][k] = max(x(n))

for n = (512 ×(k-1) / 2^j), (512 ×(k-1) / 2^j) + 1, ...(512 ×k / 2^j) - 1
and k = 1, ..., 2^(j-1) ;

其中x(n)为256 sample 的audio block 中第n个sample

j=1,2,3 为hierachical tree的level

k为level j的segment index.

4)Thread Comparison

首先用P[1][1]与“silence threshold”进行比较来check当前audio block中是否存在signifant signal level.

如果P[1][1]小于silence threshold则使用long block. silence threshold为100/32768

接下来，在每个level上比较相邻的segment的Peak.如果相邻的两个segment Peak的比值大于预设的threshold,就设置一个flag来标识当前audio block存在transient.

比较方法如下：

mag(P[j][k]) ×T[j] > mag(P[j][(k-1)])

T[j]是level j预设的threshold.T[1]=0.1,T[2]=0.075,T[3]=0.05.

3.Forward Transform

每个audio block在进行MDCT transform之前，需要乘以window function来减少transform的边界效应。

使用MDCT进行时频变换。

x[n]表示乘以window function之后的时域信号，如果使用long block，N=512.如果使用short block， N=256.

4.Coupling Strategy

1)basis encoder

basic encoder 使用static coupling strategy, couplig parameter 如下：

2）advanced encoder

更多的advanced encoder使用动态变化的coupling parameter. coupling frequecies根据psychoacoustic model 分析bit 的需求量来动态变化。

如果某个channel的信号在时间上变化剧烈，就从coupling中移除。在时间上变化缓慢的channel，其coupling coordinate传送bit少很多。

coupling band structure也会动态变化。

5.Form Coupling Channel

大部分basic encoder将所有individual channel的transform coefficients简单的相加，然后除以8（防止transform coefficients超过1溢出）形成coupling channel。

一些稍微复杂点的encoder在进行相加钱会改变individual channel的符号来避免phase cancellations.

在每个coupling band内，原始channel的能量除以coupling channel中对应coupling band内能量的比值形成coupling coordinates.

6.Rematrixing

Rematrixing只存在于2/0 mode.在每个rematrixing band内，计算L,R,L-R,L+R的能量，如果最大的能量是L or R,那当前band不设置rematrixing flag.如果最大的能量是L+R orL- R,那当前band设置rematrixing flag,并传送L+R,L-R.

7.Extract exponent

将transform coefficient表示成二进制后，leading zero的个数为初始的exponent.

每个transform coefficients计算出一个exponent,可以选择不同的exponent strategy将多个exponent group在一起。

8.Exponent Strategy

如果频谱变化平坦，则使用D25 or D45,如果变化不平坦，这是用频谱分辨率较高的exponent strategy D15 or D25.

如果频谱在一个frame的6个audio block内改变很小，则只在audio block 0传送exponents, block 1~5使用block 0的exponents.

对于basic encoder,check时间上exponents的变化，如果变化超过了一个threshold，那么就传送新的exponent。

9.Dither strategy

当transform coefficient被quantize为0bit时，在decoder端使用dither替代transform coefficients.

10.Encoder Exponents

对于exponent strategy D25,D45,一个exponent对应多个mantissa. exponent采用差分编码。两个相邻的exponent的差值被限定为+/-2内，如果相邻的exponent差值大于2，则减少较大的exponent到+/-2范围内，相应的mantissa经调整后包含leading zero.

11.Normalize mantissa

每个channel的transform coefficients根据exponents左移得到normalized mantissa.

12.Core Bit Allocation

Core bit allocation是通过不断调整corse SNR 和fine SNR offset直到一个frame内的所有可用的bit都分配完。

corse SNR offset调整时每次增加/减少3db，fine SNR offset 调整时每次增加/减少3/16db.

对于所有channel,都是从一个common bit pool来进行bit allocation。在encoder通过不断的迭代选择最优的csnroffst 和fineoffst,分配不超过frame size的最大bit 数。

对于某一次迭代，如果分配的bit数超过了bit pool,那么减小SNR offset来进行下一次迭代。如果分配的bit数小于bit pool,那么增加SNR offset来进行下一次迭代。当SNR offset已经是满足分配bit数不超过bit pool的最大值时，迭代结束。

bit allocation的最终结果是csnroffst，fineoffst和baps(bit allocation points)

13. Quantize mantissa

每个normallized mantissa使用对应bap的quantizer进行quantize.

14.Pack AC-3 Frame

将上述过程产生的side info，exponents和quantized mantissa pack成AC-3 frame

AC3 encoder flow的更多相关文章

AC3 IMDCT
AC3 encoder 在进行MDCT时,使用两种长度的block. 512 samples的block用于输入信号频谱是stationary,或者在时间上变化缓慢.在fs 为48k时,使用512 s ...
AC3 Rematrix
当L R channel highly correlated时,AC3 encoder 使用rematrix技术压缩L/R的和和差. 原始信号为left,right,使用rematrix压缩信号为le ...
AC3 overview
1.AC3 encode overview AC3 encoder的框图如下: AC3在频域采用粗量化(coarsely quantizing)来获取较高的压缩率. 1).输入PCM 经过MDCT变换 ...
ffmpeg最全的命令参数
Hyper fast Audio and Video encoderusage: ffmpeg [options] [[infile options] -i infile]... {[outfile ...
Motion control encoder extrapolation
Flying Saw debug Part1 Encoder extrapolation Machine introduction A tube cutting saw, is working for ...
光流法(optical flow)
光流分为稠密光流和稀疏光流光流(optic flow)是什么呢?名字很专业,感觉很陌生,但本质上,我们是最熟悉不过的了.因为这种视觉现象我们每天都在经历.从本质上说,光流就是你在这个运动着的世界里感 ...
Improved Variational Inference with Inverse Autoregressive Flow
目录概主要内容代码 Kingma D., Salimans T., Jozefowicz R., Chen X., Sutskever I. and Welling M. Improved Va ...
Improving Variational Auto-Encoders using Householder Flow
目录概主要内容代码 Tomczak J. and Welling M. Improving Variational Auto-Encoders using Householder Flow. N ...
Git 在团队中的最佳实践--如何正确使用Git Flow
我们已经从SVN 切换到Git很多年了,现在几乎所有的项目都在使用Github管理, 本篇文章讲一下为什么使用Git, 以及如何在团队中正确使用. Git的优点 Git的优点很多,但是这里只列出我认为 ...

随机推荐

19、vue部署路由模式
vue-router 默认 hash 模式 -- 使用 URL 的 hash 来模拟一个完整的 URL,于是当 URL 改变时,页面不会重新加载. hash模式带#号不用配置服务器如果不想要很丑的 ...
pip 更换镜像源
国内的pip源阿里云:https://mirrors.aliyun.com/pypi/simple/ 清华:https://pypi.tuna.tsinghua.edu.cn/simple 中国科技 ...
source insight增加tab标签页的方法之sihook
1.效果如下 2.方法见如下博客 http://www.cnblogs.com/Red_angelX/archive/2013/01/23/2873603.html
CSS：display:flex详解
水平居中很容易实现,但是一般垂直居中好像不是很好实现,一般我们都会用position.left等等进行定位:但是flex很好的解决了这个问题 Flex就是"弹性布局",现在应用很多 ...
overfitting &&underfitting
1.过拟合然能完美的拟合模型,但是拟合出来的模型会含有大量的参数,将会是一个含有大量参数的非常庞大的模型,因此不利于实现 1.1解决过拟合的方法 1.1.1 特征选择,通过选取特征变量来减少模型参数 ...
Excel如何快速选定所需数据区域
在使用Excel处理数据时,快速选定所需数据区域的一些小技巧. 第一种方法:(选定指定区域) Ctrl+G调出定位对话框,在[引用位置]处输入A1:E5000,点击[确定]即可. 第二种方法:(选定 ...
UTF-8与GBK的区别
中文解码提示UnicodeDecodeError,UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd6 in position 0: inv ...
熵权法（the Entropy Weight Method）以及MATLAB实现
按照信息论基本原理的解释,信息是系统有序程度的一个度量,熵是系统无序程度的一个度量:如果指标的信息熵越小,该指标提供的信息量越小,在综合评价中所起作用理当越小,权重就应该越低.因此,可利用信息熵这个工 ...
[SDOI2013] 直径 - 树形dp
对于给定的一棵树,其直径的长度是多少,以及有多少条边满足所有的直径都经过该边. Solution 有点意思先随便求一条直径(两次DFS即可),不妨设为 \(s,t\),我们知道要求的这些边一定都在这 ...
用MyEclipse远程debug
第一步编辑 tomcat下的文件startup.sh文件,我的路径是 /root/apache-tomcat-6.0.24/bin/startup.sh 命令:vim startup.sh将decl ...

AC3 encoder flow

AC3 encoder flow的更多相关文章

随机推荐

热门专题