转载

http://www.cardinalpeak.com/blog/worlds-smallest-h-264-encoder/

View from the Peak

World’s Smallest h.264 Encoder

March 19th, 2010 by Ben Mesander

Recently I have been studying the h.264 video codec and reading the ISO spec. h.264 a much more sophisticated codec than MPEG-2, which means that a well-implemented h.264 encoder has more compression tools at its disposal than the equivalent MPEG-2 encoder. But all that sophistication comes at a price: h.264 also has a big, complicated specification with a plethora of options, many of which are not commonly used, and it takes expertise to understand which parts are important to solve a given problem.

As a bit of a parlor trick, I decided to write the simplest possible h.264 encoder. I was able to do it in about 30 lines of code—although truth in advertising compels me to admit that it doesn’t actually compress the video at all!

While I don’t want to balloon this blog post with a detailed description of h.264, a little background is in order. An h.264 stream contains the encoded video data along with various parameters needed by a decoder in order to decode the video data. To structure this data, the bitstream consists of a sequence of Network Abstraction Layer (NAL) units.

Previous MPEG specifications allowed pictures to be coded as I-frames, P-frames, or B-frames. h.264 is more complex and wonderful. It allows individual frames to be coded as multiple slices, each of which can be of type I, P, or B, or even more esoteric types. This feature can be used in creative ways to achieve different video coding goals. In our encoder we will use one slice per frame for simplicity, and we will use all I-frames.

As with previous MPEG specifications, in h.264 each slice consists of one or more 16×16 macroblocks. Each macroblock in our 4:2:0 sampling scheme contains 16×16 luma samples, and two 8×8 blocks of chroma samples. For this simple encoder, I won’t be compressing the video data at all, so the samples will be directly copied into the h.264 output.

With that background in mind, for our simplest possible encoder, there are three NALs we have to emit:

  1. Sequence Parameter Set (SPS): Once per stream
  2. Picture Parameter Set (PPS): Once per stream
  3. Slice Header: Once per video frame
    1. Slice Header information
    2. Macroblock Header: Once per macroblock
    3. Coded Macroblock Data: The actual coded video for the macroblock

Since the SPS, the PPS, and the slice header are static for this application, I was able to hand-code them and include them in my encoder as a sequence of magic bits.

Putting it all together, I came up with the following code for what I call “hello264”:

#include <stdio.h>
#include <stdlib.h>
/* SQCIF */
#define LUMA_WIDTH 128
#define LUMA_HEIGHT 96
#define CHROMA_WIDTH LUMA_WIDTH / 2
#define CHROMA_HEIGHT LUMA_HEIGHT / 2
/* YUV planar data, as written by ffmpeg */
typedef struct
{
uint8_t Y[LUMA_HEIGHT][LUMA_WIDTH];
uint8_t Cb[CHROMA_HEIGHT][CHROMA_WIDTH];
uint8_t Cr[CHROMA_HEIGHT][CHROMA_WIDTH];
} __attribute__((__packed__)) frame_t;
frame_t frame;
/* H.264 bitstreams */
const uint8_t sps[] = { 0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00,
0x0a, 0xf8, 0x41, 0xa2 };
const uint8_t pps[] = { 0x00, 0x00, 0x00, 0x01, 0x68, 0xce,
0x38, 0x80 };
const uint8_t slice_header[] = { 0x00, 0x00, 0x00, 0x01, 0x05, 0x88,
0x84, 0x21, 0xa0 };
const uint8_t macroblock_header[] = { 0x0d, 0x00 };
/* Write a macroblock's worth of YUV data in I_PCM mode */
void macroblock(const int i, const int j)
{
int x, y;
if (! ((i == 0) && (j == 0)))
{
fwrite(&macroblock_header, 1, sizeof(macroblock_header),
stdout);
}
for(x = i*16; x < (i+1)*16; x++)
for (y = j*16; y < (j+1)*16; y++)
fwrite(&frame.Y[x][y], 1, 1, stdout);
for (x = i*8; x < (i+1)*8; x++)
for (y = j*8; y < (j+1)*8; y++)
fwrite(&frame.Cb[x][y], 1, 1, stdout);
for (x = i*8; x < (i+1)*8; x++)
for (y = j*8; y < (j+1)*8; y++)
fwrite(&frame.Cr[x][y], 1, 1, stdout);
}
/* Write out PPS, SPS, and loop over input, writing out I slices */
int main(int argc, char **argv)
{
int i, j;
fwrite(sps, 1, sizeof(sps), stdout);
fwrite(pps, 1, sizeof(pps), stdout);
while (! feof(stdin))
{
fread(&frame, 1, sizeof(frame), stdin);
fwrite(slice_header, 1, sizeof(slice_header), stdout);
for (i = 0; i < LUMA_HEIGHT/16 ; i++)
for (j = 0; j < LUMA_WIDTH/16; j++)
macroblock(i, j);
fputc(0x80, stdout); /* slice stop bit */
}
return 0;
}

(This source code is available as a single file here.)

In main(), the encoder writes out the SPS and PPS. Then it reads YUV data from standard input, stores it in a frame buffer, and then writes out a h.264 slice header. It then loops over each macroblock in the frame and calls the macroblock() function to output a macroblock header indicating the macroblock is coded as I_PCM, and inserts the YUV data.

To use the code, you will need some uncompressed video. To generate this, I used the ffmpeg package to convert a QuickTime movie from my Kodak Zi8 video camera from h.264 to SQCIF (128×96) planar YUV format sampled at 4:2:0:

ffmpeg.exe -i angel.mov -s sqcif -pix_fmt yuv420p angel.yuv

I compile the h.264 encoder:

gcc –Wall –ansi hello264.c –o hello264

And run it:

hello264 <angel.yuv >angel.264

Finally, I use ffmpeg to copy the raw h.264 NAL units into an MP4 file:

ffmpeg.exe -f h264 -i angel.264 -vcodec copy angel.mp4

Here is the resulting output:

There you have it—a complete h.264 encoder that uses minimal CPU cycles, with output larger than its input!

The next thing to add to this encoder would be CAVLC coding of macroblocks and intra prediction. The encoder would still be lossless at this point, but there would start to be compression of data. After that, the next logical step would be quantization to allow lossy compression, and then I would add P slices. As a development methodology, I prefer to bring up a simplistic version of an application, get it running, and then add refinements iteratively.

UPDATE 4/20/11: I’ve written more about the Sequence Parameter Set (SPS) here.

Ben Mesander has more than 18 years of experience leading software development teams and implementing software. His strengths include Linux, C, C++, numerical methods, control systems and digital signal processing. His experience includes embedded software, scientific software and enterprise software development environments.

World’s Smallest h.264 Encoder的更多相关文章

  1. The h.264 Sequence Parameter Set

    转债:  http://www.cardinalpeak.com/blog/the-h-264-sequence-parameter-set/ View from the Peak The h.264 ...

  2. H.264开源解码器评测

    转自:http://wmnmtm.blog.163.com/blog/static/38245714201142883032575/ 要播放HDTV,就首先要正确地解开封装,然后进行视频音频解码.所以 ...

  3. 【图像处理】H.264开源解码器评测

    转自:http://wmnmtm.blog.163.com/blog/static/38245714201142883032575/ 要播放HDTV,就首先要正确地解开封装,然后进行视频音频解码.所以 ...

  4. H.264 Profile、Level、Encoder三张简图 (fps = AVCodecContext->time_base.den / AVCodecContext->time_base.num)

    H.264 Profiles Profiles are sets of capabilities. If your black box only supports the Baseline profi ...

  5. H.264视频的RTP荷载格式

    Status of This Memo This document specifies an Internet standards track protocol for the   Internet ...

  6. 使用VideoToolbox硬编码H.264<转>

    文/落影loyinglin(简书作者)原文链接:http://www.jianshu.com/p/37784e363b8a著作权归作者所有,转载请联系作者获得授权,并标注“简书作者”. ======= ...

  7. H.264 / MPEG-4 Part 10 White Paper-翻译

    1. Introduction Broadcast(广播) television and home entertainment(娱乐) have been revolutionised(彻底改变) b ...

  8. 转:MediaCoder H.264格式编码参数设置及详解

    转: http://mediacoder.com.cn/node/81 由于现在大部分视频转码都选择H.264格式进行编码,同时CUDA编码的画质还达不到x264软编码的质量(如果你对画质无要求,可以 ...

  9. C++实现RTMP协议发送H.264编码及AAC编码的音视频

    http://www.cnblogs.com/haibindev/archive/2011/12/29/2305712.html C++实现RTMP协议发送H.264编码及AAC编码的音视频 RTMP ...

随机推荐

  1. Nodejs实现web静态服务器对多媒体文件的支持

    前几天,一个同事说他写的web静态服务器不支持音视频的播放,现简单实现一下. 原理:实现http1.1协议的range部分. 其实这一点都不神秘,我们常用的下载工具,如迅雷,下载很快,还支持断点续传, ...

  2. Linux使用者管理(1)---用户账号

    linux很重要的应用就是作为服务器的操作系统.服务器的作用是给多用户提供各种“服务”(可能是读服务器上的文件,或者是利用服务器进行数值计算)那么如果多用户共同拥有一台服务器,就需要对服务器上的用户进 ...

  3. Android 学习(一)

    这几天被一些功能折磨的要死了,于是放下了这个,看点其它的东西,算是转移一下焦点.床头放了不少书籍,也都被翻阅过,翻阅过,却不曾细细的品味过,俗话说,书可借而不可买也,这话用到自己的身上丝毫不错.因为是 ...

  4. openfire的smack和asmack

    smack你可以看成是一套封装好了的用于实现XMPP协议传输的API,它是一个非常简单并且功能强大的类库,给用户发送消息只需要三行代码.下载地址:http://www.igniterealtime.o ...

  5. VS2012 开发SharePoint 2013 声明式workflow action(activity)之 HelloWorld

    本文讲述VS2012 开发SharePoint 2013 声明式workflow action 之 HelloWorld. 使用VS2012开发客户化的workflow action是SharePoi ...

  6. MySQL 5.7 SYS scheme解析

    sys 库是MySQL 5.7其中的一个系统库,里面有很多很好用的跟性能相关的视图.函数和存储过程, 增强MySQL的易用性 例如:哪些语句使用了临时表,哪个用户请求了最多的io,哪个线程占用了最多的 ...

  7. Asterisk服务安装配置和启动

    Asterisk服务安装配置和启动 2014年11月4日 11:36 注意: 更新源的重要性 源的地址: http://fffo.blog.163.com/blog/static/2119130682 ...

  8. [POJ2828]Buy Tickets(线段树,单点更新,二分,逆序)

    题目链接:http://poj.org/problem?id=2828 由于最后一个人的位置一定是不会变的,所以我们倒着做,先插入最后一个人. 我们每次处理的时候,由于已经知道了这个人的位置k,这个位 ...

  9. android截屏:保存一个view的内容为图片并存放到SD卡

    项目中偶尔会用到截屏分享,于是就有了下面这个截屏的方法~ 下面得saveImage()方法就是保存当前Activity对应的屏幕所有内容的截屏保存. private void saveImage() ...

  10. UVa 1625 Color Length

    思路还算明白,不过要落实到代码上还真敲不出来. 题意: 有两个由大写字母组成的颜色序列,将它们合并成一个序列:每次可以把其中一个序列开头的颜色放到新序列的尾部. 对于每种颜色,其跨度定义为合并后的序列 ...