转债:  http://www.cardinalpeak.com/blog/the-h-264-sequence-parameter-set/

View from the Peak

The h.264 Sequence Parameter Set

April 20th, 2011 by Ben Mesander

This is a follow-up to my World’s Smallest h.264 Encoder post. I’ve received several emails asking about precise details of things in two entities in the h.264 bitstream: the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS). Both entities contain information that an h.264 decoder needs to decode the video data, for example the resolution and frame rate of the video.

Recall that an h.264 bitstream contains a sequence of Network Abstraction Layer (NAL) units. The SPS and PPS are both types of NAL units. The SPS NAL unit contains parameters that apply to a series of consecutive coded video pictures, referred to as a “coded video sequence” in the h.264 standard. The PPS NAL unit contains parameters that apply to the decoding of one or more individual pictures inside a coded video sequence.

In the case of my simple encoder, we emitted a single SPS and PPS at the start of the video data stream, but in the case of a more complex encoder, it would not be uncommon to see them inserted periodically in the data for two reasons—first, often a decoder will need to start decoding mid-stream, and second, because the encoder may wish to vary parameters for different parts of the stream in order to achieve better compression or quality goals.

In my trivial encoder, the h.264 SPS and PPS were hardcoded in hex as:

/* h.264 bitstreams */
const uint8_t sps[] =
{0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00, 0x0a, 0xf8, 0x41, 0xa2};
const uint8_t pps[] =
{0x00, 0x00, 0x00, 0x01, 0x68, 0xce, 0x38, 0x80};

Let’s decode this into something readable from the spec. The first thing I did was to look at section 7 of the h.264 specification. I saw that at a minimum I had to choose how to fill in the SPS parameters in the table below. In the table, as in the standard, the type u(n) indicates an unsigned integer of n bits, and ue(v) indicates an unsigned exponential-golomb coded value of a variable number of bits. The spec doesn’t seem to define the maximum number of bits anywhere, but the reference encoder software uses 32. (People wishing to explore the security of decoder software may find it interesting to violate this assumption!)

Parameter Name Type Value Comments
forbidden_zero_bit u(1) 0 Despite being forbidden, it must be set to 0!
nal_ref_idc u(2) 3 3 means it is “important” (this is an SPS)
nal_unit_type u(5) 7 Indicates this is a sequence parameter set
profile_idc u(8) 66 Baseline profile
constraint_set0_flag u(1) 0 We’re not going to honor constraints
constraint_set1_flag u(1) 0 We’re not going to honor constraints
constraint_set2_flag u(1) 0 We’re not going to honor constraints
constraint_set3_flag u(1) 0 We’re not going to honor constraints
reserved_zero_4bits u(4) 0 Better set them to zero
level_idc u(8) 10 Level 1, sec A.3.1
seq_parameter_set_id ue(v) 0 We’ll just use id 0.
log2_max_frame_num_minus4 ue(v) 0 Let’s have as few frame numbers as possible
pic_order_cnt_type ue(v) 0 Keep things simple
log2_max_pic_order_cnt_lsb_minus4 ue(v) 0 Fewer is better.
num_ref_frames ue(v) 0 We will only send I slices
gaps_in_frame_num_value_allowed_flag u(1) 0 We will have no gaps
pic_width_in_mbs_minus_1 ue(v) 7 SQCIF is 8 macroblocks wide
pic_height_in_map_units_minus_1 ue(v) 5 SQCIF is 6 macroblocks high
frame_mbs_only_flag u(1) 1 We will not to field/frame encoding
direct_8x8_inference_flag u(1) 0 Used for B slices. We will not send B slices
frame_cropping_flag u(1) 0 We will not do frame cropping
vui_prameters_present_flag u(1) 0 We will not send VUI data
rbsp_stop_one_bit u(1) 1 Stop bit. I missed this at first and it caused me much trouble.

Some key things here are the profile (profile_idc) and level (level_idc) that I chose, and the picture width and height. If you encode the above table in hex, you will get the values in the SPS array declared above.

A question I got a couple of times in email was about the width and height parameters—specifically, what to do if the picture width or height is not an integer multiple of macroblock size. Recall that, for the 4:2:0 sampling scheme in my encoder, a macroblock consists of 16×16 luma samples. In this case, you would set the frame_cropping_flag to 1, and reduce the number of pixels in the horizontal and vertical direction with the frame_crop_left_offset, frame_crop_right_offset, frame_crop_top_offset, and frame_crop_bottom_offset parameters, which are conditionally present in the bitstream only if the frame_cropping_flag is set to one.

One interesting problem that we see fairly often with h.264 is when the container format (MP4, MOV, etc.) contains different values for some of these parameters than the SPS and PPS. In this case, we find different video players handle the streams differently.

A handy tool for decoding h.264 bitstreams, including the SPS, is the h264bitstream tool. It comes with a command line program that decodes a bitstream to the parameter names defined in the h.264 specification. Let’s look at its output for a sample mp4 file I downloaded from youtube. First, I extract the h.264 NAL units from the file using ffmpeg:

ffmpeg.exe -i Old Faithful.mp4 -vcodec copy -vbsf h264_mp4toannexb -an of.h264

The NAL units now reside in the file of.h264. I then run the h264_analyze command from the h264bitstream package to produce the following output:

h264_analyze of.h264
!! Found NAL at offset 4 (0x0004), size 25 (0x0019)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 3
nal_unit_type : 7 ( Sequence parameter set )
======= SPS =======
profile_idc : 100
constraint_set0_flag : 0
constraint_set1_flag : 0
constraint_set2_flag : 0
constraint_set3_flag : 0
reserved_zero_4bits : 0
level_idc : 31
seq_parameter_set_id : 0
chroma_format_idc : 1
residual_colour_transform_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 0
log2_max_frame_num_minus4 : 3
pic_order_cnt_type : 0
log2_max_pic_order_cnt_lsb_minus4 : 3
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 1
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 1
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 1
=== VUI ===
aspect_ratio_info_present_flag : 1
aspect_ratio_idc : 1
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 0
video_signal_type_present_flag : 0
video_format : 0
video_full_range_flag : 0
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 1
num_units_in_tick : 100
time_scale : 5994
fixed_frame_rate_flag : 1
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 1
motion_vectors_over_pic_boundaries_flag : 1
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 11
log2_max_mv_length_vertical : 11
num_reorder_frames : 0
max_dec_frame_buffering : 1
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0

The only additional thing I’d like to point out here is that this particular SPS also contains information about the frame rate of the video (see timing_info_present_flag). These parameters must be closely checked when you generate bitstreams to ensure they agree with the container format that the h.264 will eventually be muxed into. Even a small error, such as 29.97 fps in one place and 30 fps in another, can result in severe audio/video synchronization problems.

Next time I will write about the h.264 Picture Parameter Set (PPS).

The h.264 Sequence Parameter Set的更多相关文章

  1. 获得H.264视频分辨率的方法

    转自:http://www.cnblogs.com/likwo/p/3531241.html 在使用ffmpeg解码播放TS流的时候(例如之前写过的UDP组播流),在连接时往往需要耗费大量时间.经过d ...

  2. iOS VideoToolbox硬编H.265(HEVC)H.264(AVC):2 H264数据写入文件

    本文档为iOS VideoToolbox硬编H.265(HEVC)H.264(AVC):1 概述续篇,主要描述: CMSampleBufferRef读取实际数据 序列参数集(Sequence Para ...

  3. H.264中NALU、RBSP、SODB的关系 (弄清码流结构)

    NALU:Coded H.264 data is stored or transmitted as a series of packets known as NetworkAbstraction La ...

  4. World’s Smallest h.264 Encoder

    转载 http://www.cardinalpeak.com/blog/worlds-smallest-h-264-encoder/ View from the Peak World’s Smalle ...

  5. h.264语法结构分析

    NAL Unit Stream Network Abstraction Layer,简称NAL. h.264把原始的yuv文件编码成码流文件,生成的码流文件就是NAL单元流(NAL unit Stre ...

  6. H.264视频的RTP荷载格式

    Status of This Memo This document specifies an Internet standards track protocol for the   Internet ...

  7. 直播一:H.264编码基础知识详解

    一.编码基础概念 1.为什么要进行视频编码? 视频是由一帧帧图像组成,就如常见的gif图片,如果打开一张gif图片,可以发现里面是由很多张图片组成.一般视频为了不让观众感觉到卡顿,一秒钟至少需要16帧 ...

  8. H.264流媒体协议格式中的Annex B格式和AVCC格式深度解析

    版权声明:本文为博主原创文章,未经博主允许不得转载. https://blog.csdn.net/Romantic_Energy/article/details/50508332本文需要读者对H.26 ...

  9. 一步一步解析H.264码流的NALU(SPS,PSS,IDR)获取宽高和帧率

    分析H.264码流的工具 CodecVisa,StreamEye以及VM Analyzer NALU是由NALU头和RBSP数据组成,而RBSP可能是SPS,PPS,Slice或SEI 而且SPS位于 ...

随机推荐

  1. UVa 11922 - Permutation Transformer 伸展树

    第一棵伸展树,各种调试模板……TVT 对于 1 n 这种查询我处理的不太好,之前序列前后没有添加冗余节点,一直Runtime Error. 后来加上冗余节点之后又出了别的状况,因为多了 0 和 n+1 ...

  2. 车牌识别LPR(四)-- 车牌定位

    第四篇:车牌定位 车牌定位就是采用一系列图像处理或者数学的方法从一幅图像中将车牌准确地定位出来.车牌定位提取出的车牌是整个车牌识别系统的数据来源,它的效果的好坏直接影响到整个系统的表现,只有准确地定位 ...

  3. Java基础——关键字

    volatile 用volatile修饰的变量,线程在每次使用变量的时候,都会读取变量修改后的最的值.volatile很容易被误用,用来进行原子性操作. 对于volatile修饰的变量,jvm虚拟机只 ...

  4. 利用Java自带的MD5加密java.security.MessageDigest;

    MD5加密算法,即"Message-Digest Algorithm 5(信息-摘要算法)",它由MD2.MD3.MD4发展而来的一种单向函数算法(也就是HASH算法),它是国际著 ...

  5. c#(.net) 导出 word表格

    做了差不多一周的导出Word,现在把代码贴出来   : ExportWord.cs using System; using System.Collections.Generic; using Syst ...

  6. BZOJ_1024_[SHOI2008]_生日快乐_(dfs)

    描述 http://www.lydsy.com/JudgeOnline/problem.php?id=1024 给出一个\(x*y\)的距形,要求平行于边切,最终切成\(n\)个面积相等的小距形,求长 ...

  7. ajax上传图片 jquery插件 jquery.form.js 的方法 ajaxSubmit; AjaxForm与AjaxSubmit的差异

    先引入脚本  这里最好是把jquery的脚本升级到1.7 <script src="js/jquery-1.7.js" type="text/javascript& ...

  8. hdu 4614 Vases and Flowers(线段树:成段更新)

    线段树裸题.自己写复杂了,准确说是没想清楚就敲了. 先是建点为已插花之和,其实和未插花是一个道理,可是开始是小绕,后来滚雪球了,跪了. 重新建图,分解询问1为:找出真正插画的开始点和终止点,做成段更新 ...

  9. I.MX6 隐藏电池图标

    /********************************************************************** * I.MX6 隐藏电池图标 * 声明: * 有些时候设 ...

  10. UVA 575 Skew Binary (水)

    题意:根据这种进制的算法,例如,给你一个左式,要求推出右式.(其实右式就是一个十进制数,根据这种进位的方法来转成特殊进制的数.) 思路:观察转换特点,有点类似于二进制,但是其在后面还减一了.比如25- ...