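Let's start with the MP4 container format. All data in an MP4 file is stored in boxes. Box fields are stored in network byte order, that is, big-endian. Each box consists of a header and a body: the header gives the box's size and type, and the body holds the content that corresponds to that type.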

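The box size field has three possible cases:

The first 4 bytes of a box are the box size. This value covers the whole box, header included, which is what lets us locate each box in the file.

A box size of 1 means the actual size is stored as a 64-bit large size right after the box type; this is typically needed for a large mdat box.

A box size of 0 means this is the last box in the file and it extends to the end of the file.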

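The box size is followed by the 32-bit box type, normally four characters such as ftyp or moov, so a regular box header is 8 bytes (16 bytes when the large size is present). Here are the more important box types: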
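The three size cases above can be sketched as a small header parser. This is a minimal illustration that works on an in-memory buffer; `BoxHeader`, `readBE`, and `parseBoxHeader` are hypothetical names made up for this sketch and are not part of MPEG4Extractor.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type for this sketch.
struct BoxHeader {
    uint64_t size;        // total box size including the header; 0 = extends to end of file
    std::string type;     // four-character code such as "ftyp" or "moov"
    size_t headerLength;  // 8, or 16 when the 64-bit large size is present
    bool valid;           // false if the buffer is too short or the size is impossible
};

// Reads an n-byte big-endian unsigned integer.
static uint64_t readBE(const uint8_t *p, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i) {
        v = (v << 8) | p[i];
    }
    return v;
}

BoxHeader parseBoxHeader(const uint8_t *data, size_t len) {
    BoxHeader h{0, "", 0, false};
    if (len < 8) return h;                       // need size + type
    h.size = readBE(data, 4);
    h.type.assign(reinterpret_cast<const char *>(data + 4), 4);
    h.headerLength = 8;
    if (h.size == 1) {                           // 64-bit large size follows the type
        if (len < 16) return h;
        h.size = readBE(data + 8, 8);
        h.headerLength = 16;
        if (h.size < 16) return h;               // smaller than its own header
    } else if (h.size != 0 && h.size < 8) {      // 0 is legal: box runs to end of file
        return h;
    }
    h.valid = true;
    return h;
}
```

A caller would advance by `size` bytes to reach the next sibling box, or by `headerLength` bytes to reach the first child of a container box.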

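ftyp box: file type. There can be only one, it should be placed at the very beginning of the file, and it cannot be contained in another box. It identifies the brands (specifications) the MP4 file conforms to.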

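moov box: a container box, meaning its content is other boxes. It holds the metadata of the media in the file, and the details are obtained by parsing its child boxes. There is only one moov box and it cannot be contained in another box. It normally contains one mvhd child box and several trak child boxes. It is the most important box when parsing an MP4 file: it describes the codec of the audio and video data, the samples, chunk sizes, storage positions (offsets, i.e. where each audio or video frame sits in the mdat box), DTS, PTS, and so on.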
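To make the idea of a container box concrete, here is a minimal sketch of a recursive walk that records the path of every box it meets. It assumes the file is already in memory and only handles 32-bit box sizes; `walkBoxes`, `readU32`, and the short container list are hypothetical and far less complete than what MPEG4Extractor does.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

static uint32_t readU32(const uint8_t *p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Appends the path ("moov/trak/...") of every box in data[begin, end) to *paths.
// Returns false when a box header is truncated or a size is impossible.
bool walkBoxes(const std::vector<uint8_t> &data, size_t begin, size_t end,
               const std::string &parent, int depth,
               std::vector<std::string> *paths) {
    // Assumption: a small subset of container types, enough for illustration.
    static const std::set<std::string> kContainers = {
        "moov", "trak", "mdia", "minf", "stbl"};
    if (depth > 100) return false;               // guard against runaway nesting
    size_t pos = begin;
    while (pos < end) {
        if (end - pos < 8) return false;
        uint32_t size = readU32(&data[pos]);
        if (size < 8 || size > end - pos) return false;
        std::string type(reinterpret_cast<const char *>(&data[pos + 4]), 4);
        std::string path = parent.empty() ? type : parent + "/" + type;
        paths->push_back(path);
        // Children of a container start right after its 8-byte header.
        if (kContainers.count(type) &&
            !walkBoxes(data, pos + 8, pos + size, path, depth + 1, paths)) {
            return false;
        }
        pos += size;                             // jump to the next sibling
    }
    return true;
}
```

Calling it with `begin = 0`, `end = data.size()`, an empty parent, and depth 0 walks the whole buffer.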

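mvhd box: movie header box. It describes file-level information that is independent of any particular audio or video stream; duration is the media duration and timescale is the number of time units per second in which that duration is expressed.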

trak box: track box. It is a container box that holds the references to and descriptions of the track's media data. A trak box must contain one tkhd and one mdia child box.

tkhd box: track header box. It describes the track; for a video track it carries the width and height.

elst box: edit list box. It records the start time of the stream, which can be used when computing PTS and DTS.

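mdia box: the track's media structure. It describes the main information about the media samples of this audio or video track and is very important. It is also a container box, holding the mdhd, hdlr, and minf boxes.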

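mdhd box: stores the timescale and duration of the current track. These are not the same values as in the mvhd box: they are the ones used to compute the media duration of this track. The real duration in seconds is duration divided by timescale.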
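As a small worked example of the duration / timescale relationship, the sketch below converts a tick count to microseconds. `ticksToMicroseconds` is a hypothetical helper written for this article; returning -1 for an unusable input is a choice made here, not something taken from the Android source.

```cpp
#include <cstdint>

// Converts a duration expressed in timescale ticks to microseconds.
// Returns -1 when the timescale is 0 or the result does not fit in int64_t.
int64_t ticksToMicroseconds(uint64_t durationTicks, uint32_t timescale) {
    if (timescale == 0) return -1;               // a zero timescale is malformed
    long double us =
        (static_cast<long double>(durationTicks) * 1000000.0L) / timescale;
    if (us > static_cast<long double>(INT64_MAX)) return -1;
    return static_cast<int64_t>(us);
}
```

For instance, a track with timescale 90000 and duration 450000 lasts 5 seconds, so the function yields 5000000 microseconds.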

hdlr box: stores the stream type of the current track, video or audio. MPEG4Extractor, however, does not seem to rely on this information to tell audio from video.

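stbl box: sample table box. Its child boxes store the codec type and related information, plus the position of each video frame in the file, its PTS, and so on.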

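stsd box: its child boxes store the coding type of the current track. For AVC, the avc1 sample entry contains an avcC box that carries the SPS, PPS, and related information.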
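To show what "carries the SPS and PPS" means in practice, here is a minimal sketch that splits an avcC payload into its parameter sets. The field layout follows the AVCDecoderConfigurationRecord as I understand it from ISO/IEC 14496-15; `AvcConfig`, `readParameterSets`, and `parseAvcC` are hypothetical names. MPEG4Extractor itself does not do this split in parseChunk, it stores the whole payload as the CSD buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the fields this sketch extracts.
struct AvcConfig {
    uint8_t profile = 0;
    uint8_t level = 0;
    uint8_t nalLengthSize = 0;                   // bytes of each NAL length prefix
    std::vector<std::vector<uint8_t>> sps;
    std::vector<std::vector<uint8_t>> pps;
};

// Reads `count` parameter sets, each prefixed by a 16-bit big-endian length.
static bool readParameterSets(const std::vector<uint8_t> &d, size_t *pos,
                              size_t count,
                              std::vector<std::vector<uint8_t>> *out) {
    for (size_t i = 0; i < count; ++i) {
        if (d.size() - *pos < 2) return false;
        size_t len = (size_t(d[*pos]) << 8) | d[*pos + 1];
        *pos += 2;
        if (d.size() - *pos < len) return false;
        out->emplace_back(d.begin() + *pos, d.begin() + *pos + len);
        *pos += len;
    }
    return true;
}

bool parseAvcC(const std::vector<uint8_t> &d, AvcConfig *cfg) {
    if (d.size() < 7 || d[0] != 1) return false; // configurationVersion must be 1
    cfg->profile = d[1];
    cfg->level = d[3];
    cfg->nalLengthSize = (d[4] & 0x03) + 1;      // lengthSizeMinusOne + 1
    size_t pos = 6;
    if (!readParameterSets(d, &pos, d[5] & 0x1f, &cfg->sps)) return false;
    if (pos >= d.size()) return false;           // PPS count byte is missing
    size_t numPps = d[pos++];
    return readParameterSets(d, &pos, numPps, &cfg->pps);
}
```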

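stts box: decoding time to sample box. It stores pairs of sample_count and sample_delta. sample_delta can be read as the duration of a sample; dividing it by the timescale from mdhd gives the real time in seconds, so the frame rate is 1 / (sample_delta / timescale), that is, timescale / sample_delta.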
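The frame rate formula can be generalized to several stts entries by averaging over all samples. The sketch below assumes the entries have already been read into memory; `SttsEntry` and `averageFrameRate` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// One (sample_count, sample_delta) pair from the stts box.
struct SttsEntry {
    uint32_t sampleCount;
    uint32_t sampleDelta;   // duration of each sample, in timescale ticks
};

// Average frame rate = total samples / total seconds
//                    = total samples * timescale / total ticks.
double averageFrameRate(const std::vector<SttsEntry> &entries,
                        uint32_t timescale) {
    uint64_t samples = 0;
    uint64_t ticks = 0;
    for (const auto &e : entries) {
        samples += e.sampleCount;
        ticks += static_cast<uint64_t>(e.sampleCount) * e.sampleDelta;
    }
    if (ticks == 0) return 0.0;                  // no timing information
    return static_cast<double>(samples) * timescale /
           static_cast<double>(ticks);
}
```

With a single entry this reduces to timescale / sample_delta; for example a delta of 512 with a timescale of 15360 gives 30 frames per second.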

stss box: sync sample box. It stores the sample numbers of the key frames; a seek has to start decoding from a key frame. Its entry count gives the number of key frames.
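Seeking with the stss table boils down to finding the closest key frame at or before the target sample. A minimal sketch, assuming the sample numbers are 1-based and sorted in ascending order as the format specifies; `findSyncSampleAtOrBefore` is a hypothetical helper, and real players may also choose the next or the nearest key frame.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns the largest sync sample number that is <= target,
// or 0 when no sync sample precedes the target.
uint32_t findSyncSampleAtOrBefore(const std::vector<uint32_t> &syncSamples,
                                  uint32_t target) {
    // upper_bound returns the first entry strictly greater than target.
    auto it = std::upper_bound(syncSamples.begin(), syncSamples.end(), target);
    if (it == syncSamples.begin()) return 0;
    return *(it - 1);
}
```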

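ctts box: composition time to sample box. It gives the difference between PTS and DTS. If the box is absent there are no B-frames and PTS equals DTS. DTS is computed as sample_delta * sample_cnt - start_time, that is, the sum of the deltas of all preceding samples minus the start time; with B-frames, PTS is DTS + composition_offset.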
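The DTS and PTS rules can be sketched by expanding the stts and ctts tables sample by sample. All names below (`SttsEntry`, `CttsEntry`, `SampleTime`, `computeTimestamps`) are hypothetical, the result is in timescale ticks, and `startTime` stands for the start time taken from elst.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct SttsEntry { uint32_t sampleCount; uint32_t sampleDelta; };
struct CttsEntry { uint32_t sampleCount; int32_t sampleOffset; };
struct SampleTime { int64_t dts; int64_t pts; };

// Returns DTS and PTS for every sample, in decode order.
std::vector<SampleTime> computeTimestamps(const std::vector<SttsEntry> &stts,
                                          const std::vector<CttsEntry> &ctts,
                                          int64_t startTime) {
    // Expand ctts into one composition offset per sample.
    std::vector<int64_t> offsets;
    for (const auto &e : ctts) {
        offsets.insert(offsets.end(), e.sampleCount,
                       static_cast<int64_t>(e.sampleOffset));
    }
    std::vector<SampleTime> out;
    int64_t dts = -startTime;                    // sum of earlier deltas - start time
    for (const auto &e : stts) {
        for (uint32_t i = 0; i < e.sampleCount; ++i) {
            size_t index = out.size();
            // No ctts entry for this sample: PTS equals DTS.
            int64_t offset = index < offsets.size() ? offsets[index] : 0;
            out.push_back({dts, dts + offset});
            dts += e.sampleDelta;
        }
    }
    return out;
}
```

For four samples in decode order I, P, B, B with a delta of 1000 and offsets 1000, 3000, 0, 0, the PTS values come out as 1000, 4000, 2000, 3000, which is the I, B, B, P presentation order.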

stsc box: sample to chunk box. Media samples are packed into chunks, and neither chunks nor samples have a fixed size; this box describes which samples belong to which chunk.

stsz box: sample size box. It records the size of every sample.

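stco box: chunk offset box. It gives the offset of each chunk relative to the start of the file; the offset of each individual sample has to be computed from it together with the information in stsc and the sample sizes in stsz.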
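Putting stsc, stsz, and stco together, the file offset of every sample can be derived as sketched below. This is an illustration with the tables already loaded into vectors; `StscEntry` and `computeSampleOffsets` are hypothetical, and the sample_description_index field of stsc is left out because it is not needed for offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One stsc run: starting at chunk `firstChunk` (1-based), every chunk
// holds `samplesPerChunk` samples until the next run begins.
struct StscEntry {
    uint32_t firstChunk;
    uint32_t samplesPerChunk;
};

// Returns the file offset of every sample, in sample order.
std::vector<uint64_t> computeSampleOffsets(
        const std::vector<StscEntry> &stsc,
        const std::vector<uint64_t> &chunkOffsets,   // from stco / co64
        const std::vector<uint32_t> &sampleSizes) {  // from stsz
    std::vector<uint64_t> offsets;
    size_t sample = 0;
    size_t run = 0;
    for (size_t chunk = 0;
         chunk < chunkOffsets.size() && sample < sampleSizes.size(); ++chunk) {
        // Move to the stsc run that covers this chunk (chunk numbers are 1-based).
        while (run + 1 < stsc.size() && stsc[run + 1].firstChunk <= chunk + 1) {
            ++run;
        }
        uint64_t pos = chunkOffsets[chunk];
        uint32_t n = stsc.empty() ? 0 : stsc[run].samplesPerChunk;
        // Samples are stored back to back inside a chunk.
        for (uint32_t i = 0; i < n && sample < sampleSizes.size(); ++i) {
            offsets.push_back(pos);
            pos += sampleSizes[sample++];
        }
    }
    return offsets;
}
```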

Reference: mp4封装格式各box类型讲解及IBP帧计算 - 知乎 (zhihu.com)

Reference: 视频解码研究之PTS(2)Mp4格式,AVI格式和MKV格式_面海烹鲜的博客-CSDN博客_avi pts

Online MP4 parsing: Online Mp4 Parser

Next, let's look at how MPEG4Extractor parses the file.

status_t MPEG4Extractor::parseChunk(off64_t *offset, int depth) {
ALOGV("entering parseChunk %lld/%d", (long long)*offset, depth);

if (*offset < 0) {
ALOGE("b/23540914");
return ERROR_MALFORMED;
}
if (depth > 100) {
ALOGE("b/27456299");
return ERROR_MALFORMED;
}

// Read 8 bytes first: the first 4 are the box size, the next 4 the box type.
uint32_t hdr[2];
if (mDataSource->readAt(*offset, hdr, 8) < 8) {
return ERROR_IO;
}
uint64_t chunk_size = ntohl(hdr[0]);
int32_t chunk_type = ntohl(hdr[1]);
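off64_t data_offset = *offset + 8;

// A size of 1 means the real size is in the 64-bit large size field that follows
// (typical for a large mdat box); such a box must be at least 16 bytes long.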
if (chunk_size == 1) {
if (mDataSource->readAt(*offset + 8, &chunk_size, 8) < 8) {
return ERROR_IO;
}
chunk_size = ntoh64(chunk_size);
data_offset += 8;

if (chunk_size < 16) {
// The smallest valid chunk is 16 bytes long in this case.
return ERROR_MALFORMED;
}
} else if (chunk_size == 0) { // a size of 0 means this is the last box and it extends to the end of the file
if (depth == 0) {
// atom extends to end of file
off64_t sourceSize;
if (mDataSource->getSize(&sourceSize) == OK) {
chunk_size = (sourceSize - *offset); // the size of the last box is derived from the file size
} else {
// XXX could we just pick a "sufficiently large" value here?
ALOGE("atom size is 0, and data source has no size");
return ERROR_MALFORMED;
}
} else {
// not allowed for non-toplevel atoms, skip it
*offset += 4;
return OK;
}
} else if (chunk_size < 8) {
// The smallest valid chunk is 8 bytes long.
ALOGE("invalid chunk size: %" PRIu64, chunk_size);
return ERROR_MALFORMED;
}

char chunk[5];
// Convert the type into a printable ASCII four-character string.
MakeFourCCString(chunk_type, chunk);
ALOGV("chunk: %s @ %lld, %d", chunk, (long long)*offset, depth);

if (kUseHexDump) {
static const char kWhitespace[] = " ";
const char *indent = &kWhitespace[sizeof(kWhitespace) - 1 - 2 * depth];
printf("%sfound chunk '%s' of size %" PRIu64 "\n", indent, chunk, chunk_size); char buffer[256];
size_t n = chunk_size;
if (n > sizeof(buffer)) {
n = sizeof(buffer);
}
if (mDataSource->readAt(*offset, buffer, n)
< (ssize_t)n) {
return ERROR_IO;
} hexdump(buffer, n);
}

PathAdder autoAdder(&mPath, chunk_type);

// (data_offset - *offset) is either 8 or 16
// Compute the length of the box payload: data_offset is where the payload starts,
// *offset is where the box starts.
off64_t chunk_data_size = chunk_size - (data_offset - *offset);
if (chunk_data_size < 0) {
ALOGE("b/23540914");
return ERROR_MALFORMED;
}

// Sanity-check the box size: any box other than mdat whose payload exceeds
// kMaxAtomSize is treated as malformed.
if (chunk_type != FOURCC("mdat") && chunk_data_size > kMaxAtomSize) {
char errMsg[100];
sprintf(errMsg, "%s atom has size %" PRId64, chunk, chunk_data_size);
ALOGE("%s (b/28615448)", errMsg);
android_errorWriteWithInfoLog(0x534e4554, "28615448", -1, errMsg, strlen(errMsg));
return ERROR_MALFORMED;
}

// Children under the metadata path are parsed recursively; this branch is not examined here.
if (chunk_type != FOURCC("cprt")
&& chunk_type != FOURCC("covr")
&& mPath.size() == 5 && underMetaDataPath(mPath)) {
off64_t stop_offset = *offset + chunk_size;
*offset = data_offset;
while (*offset < stop_offset) {
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
} return OK;
} switch(chunk_type) {
case FOURCC("moov"):
case FOURCC("trak"):
case FOURCC("mdia"):
case FOURCC("minf"):
case FOURCC("dinf"):
case FOURCC("stbl"):
case FOURCC("mvex"):
case FOURCC("moof"):
case FOURCC("traf"):
case FOURCC("mfra"):
case FOURCC("udta"):
case FOURCC("ilst"):
case FOURCC("sinf"):
case FOURCC("schi"):
case FOURCC("edts"):
case FOURCC("wave"):
{
// A moov box whose depth is not 0 is nested inside another container box, which is an error.
if (chunk_type == FOURCC("moov") && depth != 0) {
ALOGE("moov: depth %d", depth);
return ERROR_MALFORMED;
}
// A moov box seen after initialization has completed means a moov was already parsed, which is also an error.
if (chunk_type == FOURCC("moov") && mInitCheck == OK) {
ALOGE("duplicate moov");
return ERROR_MALFORMED;
} if (chunk_type == FOURCC("moof") && !mMoofFound) {
// store the offset of the first segment
mMoofFound = true;
mMoofOffset = *offset;
} if (chunk_type == FOURCC("stbl")) {
ALOGV("sampleTable chunk is %" PRIu64 " bytes long.", chunk_size); if (mDataSource->flags()
& (DataSourceBase::kWantsPrefetching
| DataSourceBase::kIsCachingDataSource)) {
CachedRangedDataSource *cachedSource =
new CachedRangedDataSource(mDataSource); if (cachedSource->setCachedRange(
*offset, chunk_size,
true /* assume ownership on success */) == OK) {
mDataSource = cachedSource;
} else {
delete cachedSource;
}
} if (mLastTrack == NULL) {
return ERROR_MALFORMED;
}
// Once stbl is reached, create a SampleTable for the track; what the SampleTable is for is covered later.
mLastTrack->sampleTable = new SampleTable(mDataSource);
}

bool isTrack = false;
if (chunk_type == FOURCC("trak")) {
if (depth != 1) {
ALOGE("trak: depth %d", depth);
return ERROR_MALFORMED;
}
isTrack = true;
// On a trak box, append a new node to the Track linked list.
ALOGV("adding new track");
Track *track = new Track;
if (mLastTrack) {
mLastTrack->next = track;
} else {
mFirstTrack = track;
}
mLastTrack = track;

track->meta = AMediaFormat_new();
// Give the track a default MIME type.
AMediaFormat_setString(track->meta,
AMEDIAFORMAT_KEY_MIME, "application/octet-stream");
}

// All the box types above are container boxes, so their child boxes are parsed recursively here.
off64_t stop_offset = *offset + chunk_size;
*offset = data_offset; // child boxes start right after the box header (8 or 16 bytes)
while (*offset < stop_offset) { // pass udata terminate
if (mIsQT && stop_offset - *offset == 4 && chunk_type == FOURCC("udta")) {
// handle the case that udta terminates with terminate code x00000000
// note that 0 terminator is optional and we just handle this case.
uint32_t terminate_code = 1;
mDataSource->readAt(*offset, &terminate_code, 4);
if (0 == terminate_code) {
*offset += 4;
ALOGD("Terminal code for udta");
continue;
} else {
ALOGW("invalid udta Terminal code");
}
}
// Recurse into the child box.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
if (isTrack) {
mLastTrack->skipTrack = true;
break;
}
return err;
}
}

if (*offset != stop_offset) {
return ERROR_MALFORMED;
}

// After the recursion finishes, if this was a trak box, consolidate what was parsed into the Track.
if (isTrack) {
int32_t trackId;
// There must be exactly one track header per track.
// If the track has no track ID, mark the current track to be skipped.
if (!AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_TRACK_ID, &trackId)) {
mLastTrack->skipTrack = true;
}

status_t err = verifyTrack(mLastTrack);
if (err != OK) {
mLastTrack->skipTrack = true;
}

// skipTrack == true means the track is invalid, so it is removed from the linked list.
if (mLastTrack->skipTrack) {
ALOGV("skipping this track...");
Track *cur = mFirstTrack; if (cur == mLastTrack) {
delete cur;
mFirstTrack = mLastTrack = NULL;
} else {
while (cur && cur->next != mLastTrack) {
cur = cur->next;
}
if (cur) {
cur->next = NULL;
}
delete mLastTrack;
mLastTrack = cur;
} return OK;
} // place things we built elsewhere into their final locations // put aggregated tx3g data into the metadata
if (mLastTrack->mTx3gFilled > 0) {
ALOGV("Putting %zu bytes of tx3g data into meta data",
mLastTrack->mTx3gFilled);
AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_TEXT_FORMAT_DATA,
mLastTrack->mTx3gBuffer, mLastTrack->mTx3gFilled);
// drop it now to reduce our footprint
free(mLastTrack->mTx3gBuffer);
mLastTrack->mTx3gBuffer = NULL;
mLastTrack->mTx3gFilled = 0;
mLastTrack->mTx3gSize = 0;
} const char *mime;
AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime);
// Check whether the MIME type is Dolby Vision video; the rest of this branch is not covered here.
if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_DOLBY_VISION)) {
void *data;
size_t size; if (AMediaFormat_getBuffer(mLastTrack->meta, AMEDIAFORMAT_KEY_CSD_2, &data, &size)) {
const uint8_t *ptr = (const uint8_t *)data;
const uint8_t profile = ptr[2] >> 1;
const uint8_t bl_compatibility_id = (ptr[4]) >> 4;
bool create_two_tracks = false; if (bl_compatibility_id && bl_compatibility_id != 15) {
create_two_tracks = true;
} if (4 == profile || 7 == profile ||
(profile >= 8 && profile < 11 && create_two_tracks)) {
// we need a backward compatible track
ALOGV("Adding new backward compatible track");
Track *track_b = new Track; track_b->timescale = mLastTrack->timescale;
track_b->sampleTable = mLastTrack->sampleTable;
track_b->includes_expensive_metadata = mLastTrack->includes_expensive_metadata;
track_b->skipTrack = mLastTrack->skipTrack;
track_b->elst_needs_processing = mLastTrack->elst_needs_processing;
track_b->elst_media_time = mLastTrack->elst_media_time;
track_b->elst_segment_duration = mLastTrack->elst_segment_duration;
track_b->elst_shift_start_ticks = mLastTrack->elst_shift_start_ticks;
track_b->elst_initial_empty_edit_ticks = mLastTrack->elst_initial_empty_edit_ticks;
track_b->subsample_encryption = mLastTrack->subsample_encryption; track_b->mTx3gBuffer = mLastTrack->mTx3gBuffer;
track_b->mTx3gSize = mLastTrack->mTx3gSize;
track_b->mTx3gFilled = mLastTrack->mTx3gFilled; track_b->meta = AMediaFormat_new();
AMediaFormat_copy(track_b->meta, mLastTrack->meta); mLastTrack->next = track_b;
track_b->next = NULL; auto id = track_b->meta->mFormat->findEntryByName(AMEDIAFORMAT_KEY_CSD_2);
track_b->meta->mFormat->removeEntryAt(id); if (4 == profile || 7 == profile || 8 == profile ) {
AMediaFormat_setString(track_b->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_VIDEO_HEVC);
} else if (9 == profile) {
AMediaFormat_setString(track_b->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_VIDEO_AVC);
} else if (10 == profile) {
AMediaFormat_setString(track_b->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_VIDEO_AV1);
} // Should never get to else part

mLastTrack = track_b;
}
}
}
} else if (chunk_type == FOURCC("moov")) {
// Once the moov box has been parsed recursively, set mInitCheck to OK.
mInitCheck = OK;

return UNKNOWN_ERROR; // Return a dummy error.
}
break;
}
// Not examined for now; this box is presumably used for playback of encrypted video.
case FOURCC("schm"):
{ *offset += chunk_size;
if (!mLastTrack) {
return ERROR_MALFORMED;
} uint32_t scheme_type;
if (mDataSource->readAt(data_offset + 4, &scheme_type, 4) < 4) {
return ERROR_IO;
}
scheme_type = ntohl(scheme_type);
int32_t mode = kCryptoModeUnencrypted;
switch(scheme_type) {
case FOURCC("cbc1"):
{
mode = kCryptoModeAesCbc;
break;
}
case FOURCC("cbcs"):
{
mode = kCryptoModeAesCbc;
mLastTrack->subsample_encryption = true;
break;
}
case FOURCC("cenc"):
{
mode = kCryptoModeAesCtr;
break;
}
case FOURCC("cens"):
{
mode = kCryptoModeAesCtr;
mLastTrack->subsample_encryption = true;
break;
}
}
if (mode != kCryptoModeUnencrypted) {
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_CRYPTO_MODE, mode);
}
break;
}

// The elst box stores the start time of the track.
case FOURCC("elst"):
{
*offset += chunk_size; if (!mLastTrack) {
return ERROR_MALFORMED;
}

// Read the version.
// See 14496-12 8.6.6
uint8_t version;
if (mDataSource->readAt(data_offset, &version, 1) < 1) {
return ERROR_IO;
}

// Read the number of entries in the box.
uint32_t entry_count;
if (!mDataSource->getUInt32(data_offset + 4, &entry_count)) {
return ERROR_IO;
} if (entry_count > 2) {
/* We support a single entry for gapless playback or negating offset for
* reordering B frames, two entries (empty edit) for start offset at the moment.
*/
ALOGW("ignoring edit list with %d entries", entry_count);
} else {
off64_t entriesoffset = data_offset + 8;
uint64_t segment_duration;
int64_t media_time;
bool empty_edit_present = false;
for (int i = 0; i < entry_count; ++i) {
switch (version) {
// Only the version 0 case is examined here.
case 0: {
uint32_t sd;
int32_t mt;
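// Read segment_duration, the duration of this edit segment (in the movie header timescale).
// Read media_time, the start time within the media, used when computing DTS and PTS.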
if (!mDataSource->getUInt32(entriesoffset, &sd) ||
!mDataSource->getUInt32(entriesoffset + 4, (uint32_t*)&mt)) {
return ERROR_IO;
}
segment_duration = sd;
media_time = mt;
// 4(segment duration) + 4(media time) + 4(media rate)
entriesoffset += 12;
break;
}
case 1: {
if (!mDataSource->getUInt64(entriesoffset, &segment_duration) ||
!mDataSource->getUInt64(entriesoffset + 8, (uint64_t*)&media_time)) {
return ERROR_IO;
}
// 8(segment duration) + 8(media time) + 4(media rate)
entriesoffset += 20;
break;
}
default:
return ERROR_IO;
break;
}
// Empty edit entry would have to be first entry.
if (media_time == -1 && i == 0) {
empty_edit_present = true;
ALOGV("initial empty edit ticks: %" PRIu64, segment_duration);
/* In movie header timescale, and needs to be converted to media timescale
* after we get that from a track's 'mdhd' atom,
* which at times come after 'elst'.
*/
mLastTrack->elst_initial_empty_edit_ticks = segment_duration;
} else if (media_time >= 0 && i == 0) {
ALOGV("first edit list entry - from gapless playback files");
// Save the elst information in the Track.
mLastTrack->elst_media_time = media_time;
mLastTrack->elst_segment_duration = segment_duration;
ALOGV("segment_duration: %" PRIu64 " media_time: %" PRId64,
segment_duration, media_time);
// media_time is in media timescale as are STTS/CTTS entries.
mLastTrack->elst_shift_start_ticks = media_time;
} else if (empty_edit_present && i == 1) {
// Process second entry only when the first entry was an empty edit entry.
ALOGV("second edit list entry");
mLastTrack->elst_shift_start_ticks = media_time;
} else {
ALOGW("for now, unsupported entry in edit list %" PRIu32, entry_count);
}
}
// save these for later, because the elst atom might precede
// the atoms that actually gives us the duration and sample rate
// needed to calculate the padding and delay values
mLastTrack->elst_needs_processing = true;
}
break;
}
// Handle the frma box, if present.
case FOURCC("frma"):
{
*offset += chunk_size; uint32_t original_fourcc;
if (mDataSource->readAt(data_offset, &original_fourcc, 4) < 4) {
return ERROR_IO;
}
original_fourcc = ntohl(original_fourcc);
ALOGV("read original format: %d", original_fourcc); if (mLastTrack == NULL) {
return ERROR_MALFORMED;
}
// Set the track's MIME type.
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, FourCC2MIME(original_fourcc));
uint32_t num_channels = 0;
uint32_t sample_rate = 0;
if (AdjustChannelsAndRate(original_fourcc, &num_channels, &sample_rate)) {
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_CHANNEL_COUNT, num_channels);
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_SAMPLE_RATE, sample_rate);
} if (!mIsQT && original_fourcc == FOURCC("alac")) {
off64_t tmpOffset = *offset;
status_t err = parseALACSampleEntry(&tmpOffset);
if (err != OK) {
ALOGE("parseALACSampleEntry err:%d Line:%d", err, __LINE__);
return err;
}
*offset = tmpOffset + 8;
} break;
}

// ......

// Parse the track header.
case FOURCC("tkhd"):
{
*offset += chunk_size; status_t err;
// Mainly parses the track ID and, for a video track, the width and height, and stores them in the metadata.
if ((err = parseTrackHeader(data_offset, chunk_data_size)) != OK) {
return err;
} break;
}

// ......

// Parse mdhd.
case FOURCC("mdhd"):
{
*offset += chunk_size; if (chunk_data_size < 4 || mLastTrack == NULL) {
return ERROR_MALFORMED;
} uint8_t version;
if (mDataSource->readAt(
data_offset, &version, sizeof(version))
< (ssize_t)sizeof(version)) {
return ERROR_IO;
} off64_t timescale_offset; if (version == 1) {
timescale_offset = data_offset + 4 + 16;
} else if (version == 0) {
timescale_offset = data_offset + 4 + 8;
} else {
return ERROR_IO;
}

// Read the timescale.
uint32_t timescale;
if (mDataSource->readAt(
timescale_offset, &timescale, sizeof(timescale))
< (ssize_t)sizeof(timescale)) {
return ERROR_IO;
} if (!timescale) {
ALOGE("timescale should not be ZERO.");
return ERROR_MALFORMED;
}

// Store the timescale in the track.
mLastTrack->timescale = ntohl(timescale);

// 14496-12 says all ones means indeterminate, but some files seem to use
// 0 instead. We treat both the same.
int64_t duration = 0;
if (version == 1) {
if (mDataSource->readAt(
timescale_offset + 4, &duration, sizeof(duration))
< (ssize_t)sizeof(duration)) {
return ERROR_IO;
}
if (duration != -1) {
duration = ntoh64(duration);
}
} else {
// Only the version 0 case is examined here.
uint32_t duration32;
// Read the duration of the current track.
if (mDataSource->readAt(
timescale_offset + 4, &duration32, sizeof(duration32))
< (ssize_t)sizeof(duration32)) {
return ERROR_IO;
}
if (duration32 != 0xffffffff) {
duration = ntohl(duration32);
}
}
if (duration != 0 && mLastTrack->timescale != 0) {
// The real duration is the duration read here divided by the timescale.
long double durationUs = ((long double)duration * 1000000) / mLastTrack->timescale;
if (durationUs < 0 || durationUs > INT64_MAX) {
ALOGE("cannot represent %lld * 1000000 / %lld in 64 bits",
(long long) duration, (long long) mLastTrack->timescale);
return ERROR_MALFORMED;
}
// The duration stored in the metadata is in microseconds.
AMediaFormat_setInt64(mLastTrack->meta, AMEDIAFORMAT_KEY_DURATION, durationUs);
} uint8_t lang[2];
off64_t lang_offset;
if (version == 1) {
lang_offset = timescale_offset + 4 + 8;
} else if (version == 0) {
lang_offset = timescale_offset + 4 + 4;
} else {
return ERROR_IO;
} if (mDataSource->readAt(lang_offset, &lang, sizeof(lang))
< (ssize_t)sizeof(lang)) {
return ERROR_IO;
} // To get the ISO-639-2/T three character language code
// 1 bit pad followed by 3 5-bits characters. Each character
// is packed as the difference between its ASCII value and 0x60.
char lang_code[4];
lang_code[0] = ((lang[0] >> 2) & 0x1f) + 0x60;
lang_code[1] = ((lang[0] & 0x3) << 3 | (lang[1] >> 5)) + 0x60;
lang_code[2] = (lang[1] & 0x1f) + 0x60;
lang_code[3] = '\0';
// Store the language code in the metadata under the language key.
AMediaFormat_setString(mLastTrack->meta, AMEDIAFORMAT_KEY_LANGUAGE, lang_code);

break;
}

// A very important box: its child boxes yield the MIME type.
case FOURCC("stsd"):
{
uint8_t buffer[8];
if (chunk_data_size < (off64_t)sizeof(buffer)) {
return ERROR_MALFORMED;
} if (mDataSource->readAt(
data_offset, buffer, 8) < 8) {
return ERROR_IO;
} if (U32_AT(buffer) != 0) {
// Should be version 0, flags 0.
return ERROR_MALFORMED;
} uint32_t entry_count = U32_AT(&buffer[4]); if (entry_count > 1) {
// For 3GPP timed text, there could be multiple tx3g boxes contain
// multiple text display formats. These formats will be used to
// display the timed text.
// For encrypted files, there may also be more than one entry.
const char *mime; if (mLastTrack == NULL)
return ERROR_MALFORMED; CHECK(AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime));
if (strcasecmp(mime, MEDIA_MIMETYPE_TEXT_3GPP) &&
strcasecmp(mime, "application/octet-stream")) {
// For now we only support a single type of media per track.
mLastTrack->skipTrack = true;
*offset += chunk_size;
break;
}
}
off64_t stop_offset = *offset + chunk_size;
*offset = data_offset + 8;
for (uint32_t i = 0; i < entry_count; ++i) {
// Recursively parse the child boxes, which yield the MIME type.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
break;
}

// If the child box of stsd has one of the following types, this is an audio track.
case FOURCC("mp4a"):
case FOURCC("enca"):
case FOURCC("samr"):
case FOURCC("sawb"):
case FOURCC("Opus"):
case FOURCC("twos"):
case FOURCC("sowt"):
case FOURCC("alac"):
case FOURCC("fLaC"):
case FOURCC(".mp3"):
case 0x6D730055: // "ms U" mp3 audio
{
if (mIsQT && depth >= 1 && mPath[depth - 1] == FOURCC("wave")) { if (chunk_type == FOURCC("alac")) {
off64_t offsetTmp = *offset;
status_t err = parseALACSampleEntry(&offsetTmp);
if (err != OK) {
ALOGE("parseALACSampleEntry err:%d Line:%d", err, __LINE__);
return err;
}
} // Ignore all atoms embedded in QT wave atom
ALOGV("Ignore all atoms embedded in QT wave atom");
*offset += chunk_size;
break;
} uint8_t buffer[8 + 20];
if (chunk_data_size < (ssize_t)sizeof(buffer)) {
// Basic AudioSampleEntry size.
return ERROR_MALFORMED;
} if (mDataSource->readAt(
data_offset, buffer, sizeof(buffer)) < (ssize_t)sizeof(buffer)) {
return ERROR_IO;
} uint16_t data_ref_index __unused = U16_AT(&buffer[6]);
uint16_t version = U16_AT(&buffer[8]);
uint32_t num_channels = U16_AT(&buffer[16]); uint16_t sample_size = U16_AT(&buffer[18]);
uint32_t sample_rate = U32_AT(&buffer[24]) >> 16; if (mLastTrack == NULL)
return ERROR_MALFORMED; off64_t stop_offset = *offset + chunk_size;
*offset = data_offset + sizeof(buffer); if (mIsQT) {
if (version == 1) {
if (mDataSource->readAt(*offset, buffer, 16) < 16) {
return ERROR_IO;
}

#if 0
U32_AT(buffer); // samples per packet
U32_AT(&buffer[4]); // bytes per packet
U32_AT(&buffer[8]); // bytes per frame
U32_AT(&buffer[12]); // bytes per sample
#endif
*offset += 16;
} else if (version == 2) {
uint8_t v2buffer[36];
if (mDataSource->readAt(*offset, v2buffer, 36) < 36) {
return ERROR_IO;
}

#if 0
U32_AT(v2buffer); // size of struct only
sample_rate = (uint32_t)U64_AT(&v2buffer[4]); // audio sample rate
num_channels = U32_AT(&v2buffer[12]); // num audio channels
U32_AT(&v2buffer[16]); // always 0x7f000000
sample_size = (uint16_t)U32_AT(&v2buffer[20]); // const bits per channel
U32_AT(&v2buffer[24]); // format specifc flags
U32_AT(&v2buffer[28]); // const bytes per audio packet
U32_AT(&v2buffer[32]); // const LPCM frames per audio packet
#endif
*offset += 36;
}
} if (chunk_type != FOURCC("enca")) {
// if the chunk type is enca, we'll get the type from the frma box later
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, FourCC2MIME(chunk_type));
AdjustChannelsAndRate(chunk_type, &num_channels, &sample_rate); if (!strcasecmp(MEDIA_MIMETYPE_AUDIO_RAW, FourCC2MIME(chunk_type))) {
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_BITS_PER_SAMPLE, sample_size);
if (chunk_type == FOURCC("twos")) {
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_PCM_BIG_ENDIAN, 1);
}
}
}

// Store the parsed channel count and sample rate in the metadata.
ALOGV("*** coding='%s' %d channels, size %d, rate %d\n",
chunk, num_channels, sample_size, sample_rate);
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_CHANNEL_COUNT, num_channels);
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_SAMPLE_RATE, sample_rate);

// ......

if (!mIsQT && chunk_type == FOURCC("alac")) {
data_offset += sizeof(buffer); status_t err = parseALACSampleEntry(&data_offset);
if (err != OK) {
ALOGE("parseALACSampleEntry err:%d Line:%d", err, __LINE__);
return err;
}
*offset = data_offset;
CHECK_EQ(*offset, stop_offset);
} if (chunk_type == FOURCC("fLaC")) { // From https://github.com/xiph/flac/blob/master/doc/isoflac.txt
// 4 for mime, 4 for blockType and BlockLen, 34 for metadata
uint8_t flacInfo[4 + 4 + 34];
// skipping dFla, version
data_offset += sizeof(buffer) + 12;
size_t flacOffset = 4;
// Add flaC header mime type to CSD
strncpy((char *)flacInfo, "fLaC", 4);
if (mDataSource->readAt(
data_offset, flacInfo + flacOffset, sizeof(flacInfo) - flacOffset) <
(ssize_t)sizeof(flacInfo) - flacOffset) {
return ERROR_IO;
}
data_offset += sizeof(flacInfo) - flacOffset; AMediaFormat_setBuffer(mLastTrack->meta, AMEDIAFORMAT_KEY_CSD_0, flacInfo,
sizeof(flacInfo));
*offset = data_offset;
CHECK_EQ(*offset, stop_offset);
} while (*offset < stop_offset) {
// Keep recursing into the child boxes.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
break;
}

// If the box type is one of the following, the current track is a video track.
case FOURCC("mp4v"):
case FOURCC("encv"):
case FOURCC("s263"):
case FOURCC("H263"):
case FOURCC("h263"):
case FOURCC("avc1"):
case FOURCC("hvc1"):
case FOURCC("hev1"):
case FOURCC("dvav"):
case FOURCC("dva1"):
case FOURCC("dvhe"):
case FOURCC("dvh1"):
case FOURCC("dav1"):
case FOURCC("av01"):
{
uint8_t buffer[78];
if (chunk_data_size < (ssize_t)sizeof(buffer)) {
// Basic VideoSampleEntry size.
return ERROR_MALFORMED;
} if (mDataSource->readAt(
data_offset, buffer, sizeof(buffer)) < (ssize_t)sizeof(buffer)) {
return ERROR_IO;
} uint16_t data_ref_index __unused = U16_AT(&buffer[6]);
uint16_t width = U16_AT(&buffer[6 + 18]);
uint16_t height = U16_AT(&buffer[6 + 20]); // The video sample is not standard-compliant if it has invalid dimension.
// Use some default width and height value, and
// let the decoder figure out the actual width and height (and thus
// be prepared for INFO_FOMRAT_CHANGED event).
if (width == 0) width = 352;
if (height == 0) height = 288;

// printf("*** coding='%s' width=%d height=%d\n",
// chunk, width, height);

if (mLastTrack == NULL)
return ERROR_MALFORMED;

if (chunk_type != FOURCC("encv")) {
// if the chunk type is encv, we'll get the type from the frma box later
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, FourCC2MIME(chunk_type));
}
// The video width and height can be parsed here as well and are stored in the metadata.
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_WIDTH, width);
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_HEIGHT, height); off64_t stop_offset = *offset + chunk_size;
*offset = data_offset + sizeof(buffer);
while (*offset < stop_offset) {
// Keep parsing the child boxes.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
break;
}

// Parse stco/co64, which store the offset of each chunk in the file (the chunks live in mdat).
case FOURCC("stco"):
case FOURCC("co64"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL)) {
return ERROR_MALFORMED;
}

// Hand the location of the chunk offset table to the SampleTable. The SampleTable was
// created with the data source, so it only needs the offset and size of this box's data.
status_t err =
mLastTrack->sampleTable->setChunkOffsetParams(
chunk_type, data_offset, chunk_data_size); *offset += chunk_size; if (err != OK) {
return err;
} break;
} case FOURCC("stsc"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED;

// Hand the location of the stsc data to the SampleTable.
status_t err =
mLastTrack->sampleTable->setSampleToChunkParams(
data_offset, chunk_data_size); *offset += chunk_size; if (err != OK) {
return err;
} break;
} case FOURCC("stsz"):
case FOURCC("stz2"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL)) {
return ERROR_MALFORMED;
}
// Hand the location of the stsz data to the SampleTable.
status_t err =
mLastTrack->sampleTable->setSampleSizeParams(
chunk_type, data_offset, chunk_data_size); *offset += chunk_size; if (err != OK) {
return err;
} adjustRawDefaultFrameSize(); size_t max_size;
err = mLastTrack->sampleTable->getMaxSampleSize(&max_size); if (err != OK) {
return err;
} if (max_size != 0) {
// Assume that a given buffer only contains at most 10 chunks,
// each chunk originally prefixed with a 2 byte length will
// have a 4 byte header (0x00 0x00 0x00 0x01) after conversion,
// and thus will grow by 2 bytes per chunk.
if (max_size > SIZE_MAX - 10 * 2) {
ALOGE("max sample size too big: %zu", max_size);
return ERROR_MALFORMED;
}
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_MAX_INPUT_SIZE, max_size + 10 * 2);
} else {
// No size was specified. Pick a conservatively large size.
uint32_t width, height;
if (!AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_WIDTH, (int32_t*)&width) ||
!AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_HEIGHT,(int32_t*) &height)) {
ALOGE("No width or height, assuming worst case 1080p");
width = 1920;
height = 1080;
} else {
// A resolution was specified, check that it's not too big. The values below
// were chosen so that the calculations below don't cause overflows, they're
// not indicating that resolutions up to 32kx32k are actually supported.
if (width > 32768 || height > 32768) {
ALOGE("can't support %u x %u video", width, height);
return ERROR_MALFORMED;
}
} const char *mime;
CHECK(AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime));
if (!strncmp(mime, "audio/", 6)) {
// for audio, use 128KB
max_size = 1024 * 128;
} else if (!strcmp(mime, MEDIA_MIMETYPE_VIDEO_AVC)
|| !strcmp(mime, MEDIA_MIMETYPE_VIDEO_HEVC)
|| !strcmp(mime, MEDIA_MIMETYPE_VIDEO_DOLBY_VISION)) {
// AVC & HEVC requires compression ratio of at least 2, and uses
// macroblocks
max_size = ((width + 15) / 16) * ((height + 15) / 16) * 192;
} else {
// For all other formats there is no minimum compression
// ratio. Use compression ratio of 1.
max_size = width * height * 3 / 2;
}
// HACK: allow 10% overhead
// TODO: read sample size from traf atom for fragmented MPEG4.
max_size += max_size / 10;
// Set the maximum input buffer size.
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_MAX_INPUT_SIZE, max_size);
} // NOTE: setting another piece of metadata invalidates any pointers (such as the
// mimetype) previously obtained, so don't cache them.
const char *mime;
CHECK(AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime));
// Calculate average frame rate.
if (!strncasecmp("video/", mime, 6)) {
size_t nSamples = mLastTrack->sampleTable->countSamples();
if (nSamples == 0) {
int32_t trackId;
if (AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_TRACK_ID, &trackId)) {
for (size_t i = 0; i < mTrex.size(); i++) {
Trex *t = &mTrex.editItemAt(i);
if (t->track_ID == (uint32_t) trackId) {
if (t->default_sample_duration > 0) {
int32_t frameRate =
mLastTrack->timescale / t->default_sample_duration;
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_FRAME_RATE, frameRate);
}
break;
}
}
}
} else {
int64_t durationUs;
if (AMediaFormat_getInt64(mLastTrack->meta,
AMEDIAFORMAT_KEY_DURATION, &durationUs)) {
if (durationUs > 0) {
int32_t frameRate = (nSamples * 1000000LL +
(durationUs >> 1)) / durationUs;
// Store the frame rate in the metadata.
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_FRAME_RATE, frameRate);
}
}
ALOGV("setting frame count %zu", nSamples);
// Store the frame count in the metadata.
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_FRAME_COUNT, nSamples);
}
} break;
} case FOURCC("stts"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED; *offset += chunk_size; if (depth >= 1 && mPath[depth - 1] != FOURCC("stbl")) {
char chunk[5];
MakeFourCCString(mPath[depth - 1], chunk);
ALOGW("stts's parent box (%s) is not stbl, skip it.", chunk);
break;
} status_t err =
mLastTrack->sampleTable->setTimeToSampleParams(
data_offset, chunk_data_size); if (err != OK) {
return err;
} break;
} case FOURCC("ctts"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED; *offset += chunk_size; status_t err =
mLastTrack->sampleTable->setCompositionTimeToSampleParams(
data_offset, chunk_data_size); if (err != OK) {
return err;
} break;
} case FOURCC("stss"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED; *offset += chunk_size; status_t err =
mLastTrack->sampleTable->setSyncSampleParams(
data_offset, chunk_data_size); if (err != OK) {
return err;
} break;
}

// ......

// If the child box of avc1 is avcC, the SPS and PPS can be parsed from it.
case FOURCC("avcC"):
{
*offset += chunk_size; auto buffer = heapbuffer<uint8_t>(chunk_data_size); if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
} if (mDataSource->readAt(
data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
} if (mLastTrack == NULL)
return ERROR_MALFORMED;

// Store the buffer that was read as the codec-specific data (CSD).
AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_CSD_AVC, buffer.get(), chunk_data_size); break;
}
case FOURCC("hvcC"):
{
auto buffer = heapbuffer<uint8_t>(chunk_data_size);

if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
}

if (mDataSource->readAt(
data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
}

if (mLastTrack == NULL)
return ERROR_MALFORMED;

// Likewise for HEVC: read the VPS/SPS/PPS configuration and store it in meta as the CSD buffer.
AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_CSD_HEVC, buffer.get(), chunk_data_size);

*offset += chunk_size;
break;
}
case FOURCC("av1C"):
{
auto buffer = heapbuffer<uint8_t>(chunk_data_size);

if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
}

if (mDataSource->readAt(
data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
}

if (mLastTrack == NULL)
return ERROR_MALFORMED;

AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_CSD_0, buffer.get(), chunk_data_size);

*offset += chunk_size;
break;
}
// Dolby Vision configuration boxes
case FOURCC("dvcC"):
case FOURCC("dvvC"): {

CHECK_EQ(chunk_data_size, 24);

auto buffer = heapbuffer<uint8_t>(chunk_data_size);

if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
}

if (mDataSource->readAt(data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
}

if (mLastTrack == NULL)
return ERROR_MALFORMED;

AMediaFormat_setBuffer(mLastTrack->meta, AMEDIAFORMAT_KEY_CSD_2,
buffer.get(), chunk_data_size);
AMediaFormat_setString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME,
MEDIA_MIMETYPE_VIDEO_DOLBY_VISION);

*offset += chunk_size;
break;
}

// ......

// mvhd carries file-level metadata: creation time, timescale, and duration.
case FOURCC("mvhd"):
{
*offset += chunk_size;

if (depth != 1) {
ALOGE("mvhd: depth %d", depth);
return ERROR_MALFORMED;
}
if (chunk_data_size < 32) {
return ERROR_MALFORMED;
}

uint8_t header[32];
if (mDataSource->readAt(
data_offset, header, sizeof(header))
< (ssize_t)sizeof(header)) {
return ERROR_IO;
}

uint64_t creationTime;
uint64_t duration = 0;
if (header[0] == 1) {
creationTime = U64_AT(&header[4]);
mHeaderTimescale = U32_AT(&header[20]);
duration = U64_AT(&header[24]);
if (duration == 0xffffffffffffffff) {
duration = 0;
}
} else if (header[0] != 0) {
return ERROR_MALFORMED;
} else {
creationTime = U32_AT(&header[4]);
mHeaderTimescale = U32_AT(&header[12]);
uint32_t d32 = U32_AT(&header[16]);
if (d32 == 0xffffffff) {
d32 = 0;
}
duration = d32;
}
if (duration != 0 && mHeaderTimescale != 0 && duration < UINT64_MAX / 1000000) {
AMediaFormat_setInt64(mFileMetaData,
AMEDIAFORMAT_KEY_DURATION, duration * 1000000 / mHeaderTimescale);
}

String8 s;
if (convertTimeToDate(creationTime, &s)) {
AMediaFormat_setString(mFileMetaData, AMEDIAFORMAT_KEY_DATE, s.string());
}

break;
}

// Record that mdat was seen (mMdatFound = true) and advance *offset past the box.
case FOURCC("mdat"):
{
mMdatFound = true;

*offset += chunk_size;
break;
}

// The handler_type in hdlr is not used as the MIME type here (text tracks excepted),
// although it should be usable to tell audio from video.
case FOURCC("hdlr"):
{
*offset += chunk_size;

if (underQTMetaPath(mPath, 3)) {
break;
}

uint32_t buffer;
if (mDataSource->readAt(
data_offset + 8, &buffer, 4) < 4) {
return ERROR_IO;
}

uint32_t type = ntohl(buffer);
// For the 3GPP file format, the handler-type within the 'hdlr' box
// shall be 'text'. We also want to support 'sbtl' handler type
// for a practical reason as various MPEG4 containers use it.
if (type == FOURCC("text") || type == FOURCC("sbtl")) {
if (mLastTrack != NULL) {
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_TEXT_3GPP);
}
}

break;
}

// ......

// tx3g is the 3GPP timed-text (subtitle) sample entry; its data is accumulated in mTx3gBuffer.
case FOURCC("tx3g"):
{
if (mLastTrack == NULL)
return ERROR_MALFORMED;

// complain about ridiculous chunks
if (chunk_size > kMaxAtomSize) {
return ERROR_MALFORMED;
}

// complain about empty atoms
if (chunk_data_size <= 0) {
ALOGE("b/124330204");
android_errorWriteLog(0x534e4554, "124330204");
return ERROR_MALFORMED;
}

// should fill buffer based on "data_offset" and "chunk_data_size"
// instead of *offset and chunk_size;
// but we've been feeding the extra data to consumers for multiple releases and
// if those apps are compensating for it, we'd break them with such a change
//

if (mLastTrack->mTx3gBuffer == NULL) {
mLastTrack->mTx3gSize = 0;
mLastTrack->mTx3gFilled = 0;
}
if (mLastTrack->mTx3gSize - mLastTrack->mTx3gFilled < chunk_size) {
size_t growth = kTx3gGrowth;
if (growth < chunk_size) {
growth = chunk_size;
}
// although this disallows 2 tx3g atoms of nearly kMaxAtomSize...
if ((uint64_t) mLastTrack->mTx3gSize + growth > kMaxAtomSize) {
ALOGE("b/124330204 - too much space");
android_errorWriteLog(0x534e4554, "124330204");
return ERROR_MALFORMED;
}
uint8_t *updated = (uint8_t *)realloc(mLastTrack->mTx3gBuffer,
mLastTrack->mTx3gSize + growth);
if (updated == NULL) {
return ERROR_MALFORMED;
}
mLastTrack->mTx3gBuffer = updated;
mLastTrack->mTx3gSize += growth;
}

if ((size_t)(mDataSource->readAt(*offset,
mLastTrack->mTx3gBuffer + mLastTrack->mTx3gFilled,
chunk_size))
< chunk_size) {

// advance read pointer so we don't end up reading this again
*offset += chunk_size;
return ERROR_IO;
}

mLastTrack->mTx3gFilled += chunk_size;
*offset += chunk_size;
break;
}

case FOURCC("ac-3"):
{
*offset += chunk_size;
// bypass ac-3 if parse fail
if (parseAC3SpecificBox(data_offset) != OK) {
if (mLastTrack != NULL) {
ALOGW("Fail to parse ac-3");
mLastTrack->skipTrack = true;
}
}
return OK;
}

case FOURCC("ec-3"):
{
*offset += chunk_size;
// bypass ec-3 if parse fail
if (parseEAC3SpecificBox(data_offset) != OK) {
if (mLastTrack != NULL) {
ALOGW("Fail to parse ec-3");
mLastTrack->skipTrack = true;
}
}
return OK;
}

case FOURCC("ac-4"):
{
*offset += chunk_size;
// bypass ac-4 if parse fail
if (parseAC4SpecificBox(data_offset) != OK) {
if (mLastTrack != NULL) {
ALOGW("Fail to parse ac-4");
mLastTrack->skipTrack = true;
}
}
return OK;
}

case FOURCC("ftyp"):
{
if (chunk_data_size < 8 || depth != 0) {
return ERROR_MALFORMED;
}

off64_t stop_offset = *offset + chunk_size;
uint32_t numCompatibleBrands = (chunk_data_size - 8) / 4;
std::set<uint32_t> brandSet;
for (size_t i = 0; i < numCompatibleBrands + 2; ++i) {
if (i == 1) {
// Skip this index, it refers to the minorVersion,
// not a brand.
continue;
}

uint32_t brand;
if (mDataSource->readAt(data_offset + 4 * i, &brand, 4) < 4) {
return ERROR_MALFORMED;
}

brand = ntohl(brand);
brandSet.insert(brand);
}

// The QuickTime brand is four characters: 'q', 't', and two spaces.
if (brandSet.count(FOURCC("qt  ")) > 0) {
mIsQT = true;
} else {
if (brandSet.count(FOURCC("mif1")) > 0
&& brandSet.count(FOURCC("heic")) > 0) {
ALOGV("identified HEIF image");

mIsHeif = true;
brandSet.erase(FOURCC("mif1"));
brandSet.erase(FOURCC("heic"));
}

if (!brandSet.empty()) {
// This means that the file should have moov box.
// It could be any iso files (mp4, heifs, etc.)
mHasMoovBox = true;
if (mIsHeif) {
ALOGV("identified HEIF image with other tracks");
}
}
}

*offset = stop_offset;

break;
}

default:
{
// check if we're parsing 'ilst' for meta keys
// if so, treat type as a number (key-id).
if (underQTMetaPath(mPath, 3)) {
status_t err = parseQTMetaVal(chunk_type, data_offset, chunk_data_size);
if (err != OK) {
return err;
}
}

*offset += chunk_size;
break;
}
}

return OK;
}
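
The mvhd branch above reads the version byte, picks the 32-bit or 64-bit field layout, and then converts the duration to microseconds. Below is a minimal standalone sketch of that logic, assuming the 32 header bytes are already in memory. `MvhdInfo`, `readU32`, `readU64`, `parseMvhd`, and `mvhdDurationUs` are names made up for this illustration and are not part of MPEG4Extractor.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch (not an AOSP type).
struct MvhdInfo {
    uint64_t creationTime;  // creation_time field as stored in the box
    uint32_t timescale;     // ticks per second for the whole movie
    uint64_t duration;      // in timescale ticks; 0 means unknown
};

// Big-endian readers, playing the role of U32_AT / U64_AT in the listing.
static uint32_t readU32(const uint8_t *p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

static uint64_t readU64(const uint8_t *p) {
    return (uint64_t(readU32(p)) << 32) | readU32(p + 4);
}

// Parse the first 32 bytes of an mvhd payload (version, flags, then fields).
// Returns false when the buffer is too short or the version is not 0 or 1.
bool parseMvhd(const uint8_t *h, size_t size, MvhdInfo *out) {
    if (size < 32) return false;
    if (h[0] == 1) {
        // Version 1: 64-bit creation/modification times and duration.
        out->creationTime = readU64(h + 4);
        out->timescale = readU32(h + 20);
        out->duration = readU64(h + 24);
        if (out->duration == UINT64_MAX) out->duration = 0;  // all ones = unknown
    } else if (h[0] == 0) {
        // Version 0: every field is 32 bits wide.
        out->creationTime = readU32(h + 4);
        out->timescale = readU32(h + 12);
        uint32_t d32 = readU32(h + 16);
        out->duration = (d32 == UINT32_MAX) ? 0 : d32;
    } else {
        return false;
    }
    return true;
}

// Convert the duration to microseconds with the same overflow guard as the listing.
bool mvhdDurationUs(const MvhdInfo &m, int64_t *us) {
    if (m.duration == 0 || m.timescale == 0 || m.duration >= UINT64_MAX / 1000000) {
        return false;
    }
    *us = int64_t(m.duration * 1000000 / m.timescale);
    return true;
}
```

For example, a version 0 header with timescale 1000 and duration 5000 yields 5,000,000 microseconds, that is, 5 seconds.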

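SampleTable holds a DataSource. When the stts, ctts, stss, and similar boxes are parsed, the extractor passes each box's data offset and data size to the matching SampleTable setter (for example setTimeToSampleParams), and SampleTable is initialized from them.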
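
To make the role of the stts data concrete, here is a small sketch of how (sample_count, sample_delta) pairs map a sample index to a decode timestamp. It is an illustration under simplifying assumptions and not the real SampleTable: the entries are assumed to be already loaded into a vector, and `SttsEntry`, `decodeTimeOfSample`, and `ticksToUs` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// One stts entry: `count` consecutive samples, each lasting `delta` ticks.
struct SttsEntry {
    uint32_t count;
    uint32_t delta;
};

// Decode timestamp (in media timescale ticks) of sample `index`:
// the sum of the durations of all samples that come before it.
// Returns false when `index` is past the last sample.
bool decodeTimeOfSample(const std::vector<SttsEntry> &stts, uint32_t index,
                        uint64_t *dtsTicks) {
    uint64_t time = 0;
    for (const SttsEntry &e : stts) {
        if (index < e.count) {
            *dtsTicks = time + uint64_t(index) * e.delta;
            return true;
        }
        time += uint64_t(e.count) * e.delta;
        index -= e.count;
    }
    return false;
}

// Convert ticks to microseconds using the track timescale from mdhd.
int64_t ticksToUs(uint64_t ticks, uint32_t timescale) {
    if (timescale == 0) return 0;
    return int64_t(((long double)ticks * 1000000) / timescale);
}
```

With the entries {3, 1000} and {2, 500} and a timescale of 1000, sample 4 starts at 3 * 1000 + 1 * 500 = 3500 ticks, which is 3.5 seconds. When a ctts box is present, the PTS is this value plus the sample's composition offset.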

MPEG4Extractor::getTrack

MediaTrackHelper *MPEG4Extractor::getTrack(size_t index) {
status_t err;
if ((err = readMetaData()) != OK) {
return NULL;
}
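// Walk the linked list until we reach the track at position index.
Track *track = mFirstTrack;
while (index > 0) {
if (track == NULL) {
return NULL;
}

track = track->next;
--index;
}

if (track == NULL) {
return NULL;
}

// Check the track ID and look up the matching Trex entry, if there is one.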
Trex *trex = NULL;
int32_t trackId;
if (AMediaFormat_getInt32(track->meta, AMEDIAFORMAT_KEY_TRACK_ID, &trackId)) {
for (size_t i = 0; i < mTrex.size(); i++) {
Trex *t = &mTrex.editItemAt(i);
if (t->track_ID == (uint32_t) trackId) {
trex = t;
break;
}
}
} else {
ALOGE("b/21657957");
return NULL;
}

ALOGV("getTrack called, pssh: %zu", mPssh.size());

// The track must have a MIME type.
const char *mime;
if (!AMediaFormat_getString(track->meta, AMEDIAFORMAT_KEY_MIME, &mime)) {
return NULL;
}

sp<ItemTable> itemTable;

// For AVC, validate the CSD buffer.
if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_AVC)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_AVC, &data, &size)) {
return NULL;
}

const uint8_t *ptr = (const uint8_t *)data;

// Inspect the CSD buffer: it must be at least 7 bytes long and start with configurationVersion 1.
if (size < 7 || ptr[0] != 1) { // configurationVersion == 1
return NULL;
}
} else if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_HEVC)
|| !strcasecmp(mime, MEDIA_MIMETYPE_IMAGE_ANDROID_HEIC)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_HEVC, &data, &size)) {
return NULL;
}

const uint8_t *ptr = (const uint8_t *)data;

if (size < 22 || ptr[0] != 1) { // configurationVersion == 1
return NULL;
}
if (!strcasecmp(mime, MEDIA_MIMETYPE_IMAGE_ANDROID_HEIC)) {
itemTable = mItemTable;
}
} else if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_DOLBY_VISION)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_2, &data, &size)) {
return NULL;
}

const uint8_t *ptr = (const uint8_t *)data;

// dv_major.dv_minor Should be 1.0 or 2.1
if (size != 24 || ((ptr[0] != 1 || ptr[1] != 0) && (ptr[0] != 2 || ptr[1] != 1))) {
return NULL;
}
} else if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_AV1)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_0, &data, &size)) {
return NULL;
}
const uint8_t *ptr = (const uint8_t *)data;

if (size < 5 || ptr[0] != 0x81) { // configurationVersion == 1
return NULL;
}
}

ALOGV("track->elst_shift_start_ticks :%" PRIu64, track->elst_shift_start_ticks);

uint64_t elst_initial_empty_edit_ticks = 0;
if (mHeaderTimescale != 0) {
// Convert empty_edit_ticks from movie timescale to media timescale.
uint64_t elst_initial_empty_edit_ticks_mul = 0, elst_initial_empty_edit_ticks_add = 0;
if (__builtin_mul_overflow(track->elst_initial_empty_edit_ticks, track->timescale,
&elst_initial_empty_edit_ticks_mul) ||
__builtin_add_overflow(elst_initial_empty_edit_ticks_mul, (mHeaderTimescale / 2),
&elst_initial_empty_edit_ticks_add)) {
ALOGE("track->elst_initial_empty_edit_ticks overflow");
return nullptr;
}
elst_initial_empty_edit_ticks = elst_initial_empty_edit_ticks_add / mHeaderTimescale;
}
ALOGV("elst_initial_empty_edit_ticks in MediaTimeScale :%" PRIu64,
elst_initial_empty_edit_ticks);

// Create the MPEG4Source for this track and return it.
MPEG4Source* source =
new MPEG4Source(track->meta, mDataSource, track->timescale, track->sampleTable,
mSidxEntries, trex, mMoofOffset, itemTable,
track->elst_shift_start_ticks, elst_initial_empty_edit_ticks);
if (source->init() != OK) {
delete source;
return NULL;
}
return source;
}
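
The edit-list handling in getTrack converts the initial empty edit from the movie timescale (mvhd) to the media timescale (mdhd), rounding to nearest and rejecting overflow. The conversion can be sketched on its own as follows. `movieTicksToMediaTicks` is a hypothetical helper name, and `__builtin_mul_overflow` / `__builtin_add_overflow` are GCC/Clang builtins, so this assumes one of those compilers.

```cpp
#include <cstdint>

// Convert `movieTicks` (movie timescale) to the media timescale:
//   mediaTicks = round(movieTicks * mediaTimescale / movieTimescale)
// Returns false on 64-bit overflow. As in the listing, a zero movie
// timescale is not an error: the result is simply left at 0.
bool movieTicksToMediaTicks(uint64_t movieTicks, uint32_t mediaTimescale,
                            uint32_t movieTimescale, uint64_t *mediaTicks) {
    *mediaTicks = 0;
    if (movieTimescale == 0) {
        return true;
    }
    uint64_t product = 0;
    uint64_t rounded = 0;
    if (__builtin_mul_overflow(movieTicks, uint64_t(mediaTimescale), &product) ||
        __builtin_add_overflow(product, uint64_t(movieTimescale / 2), &rounded)) {
        return false;  // the intermediate value does not fit in 64 bits
    }
    *mediaTicks = rounded / movieTimescale;
    return true;
}
```

For instance, an empty edit of 1500 ticks at a movie timescale of 1000 (1.5 seconds) becomes 72000 ticks on a 48000 Hz audio track. Adding half the divisor before dividing is what turns the truncating integer division into round-to-nearest.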

MPEG4Source::read

media_status_t MPEG4Source::read(
MediaBufferHelper **out, const ReadOptions *options) {
Mutex::Autolock autoLock(mLock);

CHECK(mStarted);

if (options != nullptr && options->getNonBlocking() && !mBufferGroup->has_buffers()) {
*out = nullptr;
return AMEDIA_ERROR_WOULD_BLOCK;
}

if (mFirstMoofOffset > 0) {
return fragmentedRead(out, options);
}

*out = NULL;

int64_t targetSampleTimeUs = -1;

int64_t seekTimeUs;
ReadOptions::SeekMode mode;

// Handle a seek request, if the caller supplied one.
if (options && options->getSeekTo(&seekTimeUs, &mode)) {
ALOGV("seekTimeUs:%" PRId64, seekTimeUs);
if (mIsHeif) {
CHECK(mSampleTable == NULL);
CHECK(mItemTable != NULL);
int32_t imageIndex;
if (!AMediaFormat_getInt32(mFormat, AMEDIAFORMAT_KEY_TRACK_ID, &imageIndex)) {
return AMEDIA_ERROR_MALFORMED;
}

status_t err;
if (seekTimeUs >= 0) {
err = mItemTable->findImageItem(imageIndex, &mCurrentSampleIndex);
} else {
err = mItemTable->findThumbnailItem(imageIndex, &mCurrentSampleIndex);
}
if (err != OK) {
return AMEDIA_ERROR_UNKNOWN;
}
} else {
// Map the seek mode to SampleTable search flags.
uint32_t findFlags = 0;
switch (mode) {
case ReadOptions::SEEK_PREVIOUS_SYNC:
findFlags = SampleTable::kFlagBefore;
break;
case ReadOptions::SEEK_NEXT_SYNC:
findFlags = SampleTable::kFlagAfter;
break;
case ReadOptions::SEEK_CLOSEST_SYNC:
case ReadOptions::SEEK_CLOSEST:
findFlags = SampleTable::kFlagClosest;
break;
case ReadOptions::SEEK_FRAME_INDEX:
findFlags = SampleTable::kFlagFrameIndex;
break;
default:
CHECK(!"Should not be here.");
break;
}
if( mode != ReadOptions::SEEK_FRAME_INDEX) {
int64_t elstInitialEmptyEditUs = 0, elstShiftStartUs = 0;
if (mElstInitialEmptyEditTicks > 0) {
elstInitialEmptyEditUs = ((long double)mElstInitialEmptyEditTicks * 1000000) /
mTimescale;
/* Sample's composition time from ctts/stts entries are non-negative(>=0).
* Hence, lower bound on seekTimeUs is 0.
*/
seekTimeUs = std::max(seekTimeUs - elstInitialEmptyEditUs, (int64_t)0);
}
if (mElstShiftStartTicks > 0) {
elstShiftStartUs = ((long double)mElstShiftStartTicks * 1000000) / mTimescale;
seekTimeUs += elstShiftStartUs;
}
ALOGV("shifted seekTimeUs:%" PRId64 ", elstInitialEmptyEditUs:%" PRIu64
", elstShiftStartUs:%" PRIu64, seekTimeUs, elstInitialEmptyEditUs,
elstShiftStartUs);
}

uint32_t sampleIndex;
// Call SampleTable::findSampleAtTime to find the sample index for the seek time, according to the seek mode.
status_t err = mSampleTable->findSampleAtTime(
seekTimeUs, 1000000, mTimescale,
&sampleIndex, findFlags);

if (mode == ReadOptions::SEEK_CLOSEST
|| mode == ReadOptions::SEEK_FRAME_INDEX) {
// We found the closest sample already, now we want the sync
// sample preceding it (or the sample itself of course), even
// if the subsequent sync sample is closer.
findFlags = SampleTable::kFlagBefore;
}

uint32_t syncSampleIndex = sampleIndex;
// assume every non-USAC audio sample is a sync sample. This works around
// seek issues with files that were incorrectly written with an
// empty or single-sample stss block for the audio track
if (err == OK && (!mIsAudio || mIsUsac)) {
err = mSampleTable->findSyncSampleNear(
sampleIndex, &syncSampleIndex, findFlags);
}

// Fetch the composition time of the sample that was found (offset and size are not requested here).
uint64_t sampleTime;
if (err == OK) {
err = mSampleTable->getMetaDataForSample(
sampleIndex, NULL, NULL, &sampleTime);
}

if (err != OK) {
if (err == ERROR_OUT_OF_RANGE) {
// An attempt to seek past the end of the stream would
// normally cause this ERROR_OUT_OF_RANGE error. Propagating
// this all the way to the MediaPlayer would cause abnormal
// termination. Legacy behaviour appears to be to behave as if
// we had seeked to the end of stream, ending normally.
return AMEDIA_ERROR_END_OF_STREAM;
}
ALOGV("end of stream");
return AMEDIA_ERROR_UNKNOWN;
}

if (mode == ReadOptions::SEEK_CLOSEST
|| mode == ReadOptions::SEEK_FRAME_INDEX) {
if (mElstInitialEmptyEditTicks > 0) {
sampleTime += mElstInitialEmptyEditTicks;
}
if (mElstShiftStartTicks > 0){
if (sampleTime > mElstShiftStartTicks) {
sampleTime -= mElstShiftStartTicks;
} else {
sampleTime = 0;
}
}
targetSampleTimeUs = (sampleTime * 1000000ll) / mTimescale;
}
// Record the sample index to read from next (the sync sample).
mCurrentSampleIndex = syncSampleIndex;
}

if (mBuffer != NULL) {
mBuffer->release();
mBuffer = NULL;
}

// fall through
}

off64_t offset = 0;
size_t size = 0;
int64_t cts;
uint64_t stts;
bool isSyncSample;
bool newBuffer = false;
if (mBuffer == NULL) {
newBuffer = true;

status_t err;
if (!mIsHeif) {
// Look up the offset, size, composition time, sync flag, and duration of the current sample.
err = mSampleTable->getMetaDataForSample(mCurrentSampleIndex, &offset, &size,
(uint64_t*)&cts, &isSyncSample, &stts);
if(err == OK) {
if (mElstInitialEmptyEditTicks > 0) {
cts += mElstInitialEmptyEditTicks;
}
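// Apply the edit-list start shift to the composition time (this adjusts the presentation time, not the DTS).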
if (mElstShiftStartTicks > 0) {
// cts can be negative. for example, initial audio samples for gapless playback.
cts -= (int64_t)mElstShiftStartTicks;
}
}
} else {
err = mItemTable->getImageOffsetAndSize(
options && options->getSeekTo(&seekTimeUs, &mode) ?
&mCurrentSampleIndex : NULL, &offset, &size);

cts = stts = 0;
isSyncSample = 0;
ALOGV("image offset %lld, size %zu", (long long)offset, size);
}

if (err != OK) {
if (err == ERROR_END_OF_STREAM) {
return AMEDIA_ERROR_END_OF_STREAM;
}
return AMEDIA_ERROR_UNKNOWN;
}

// Presumably this acquires a buffer from the buffer pool.
err = mBufferGroup->acquire_buffer(&mBuffer);

if (err != OK) {
CHECK(mBuffer == NULL);
return AMEDIA_ERROR_UNKNOWN;
}
if (size > mBuffer->size()) {
ALOGE("buffer too small: %zu > %zu", size, mBuffer->size());
mBuffer->release();
mBuffer = NULL;
return AMEDIA_ERROR_UNKNOWN; // ERROR_BUFFER_TOO_SMALL
}
}

// ......

// AVC/HEVC data: process the sample and return it to the caller.
else {
// Whole NAL units are returned but each fragment is prefixed by
// the start code (0x00 00 00 01).
ssize_t num_bytes_read = 0;
bool mSrcBufferFitsDataToRead = size <= mSrcBufferSize;
if (mSrcBufferFitsDataToRead) {
// Read the whole sample into mSrcBuffer.
num_bytes_read = mDataSource->readAt(offset, mSrcBuffer, size);
} else {
// We are trying to read a sample larger than the expected max sample size.
// Fall through and let the failure be handled by the following if.
android_errorWriteLog(0x534e4554, "188893559");
}

if (num_bytes_read < (ssize_t)size) {
mBuffer->release();
mBuffer = NULL;

return mSrcBufferFitsDataToRead ? AMEDIA_ERROR_IO : AMEDIA_ERROR_MALFORMED;
}

uint8_t *dstData = (uint8_t *)mBuffer->data();
size_t srcOffset = 0;
size_t dstOffset = 0;

// A video frame (one sample) can consist of several NAL units. Walk each one, validate
// its length, and prefix it with the NAL start code.
while (srcOffset < size) {
bool isMalFormed = !isInRange((size_t)0u, size, srcOffset, mNALLengthSize);
size_t nalLength = 0;
if (!isMalFormed) {
nalLength = parseNALSize(&mSrcBuffer[srcOffset]);
srcOffset += mNALLengthSize;
isMalFormed = !isInRange((size_t)0u, size, srcOffset, nalLength);
}

if (isMalFormed) {
//if nallength abnormal,ignore it.
ALOGW("abnormal nallength, ignore this NAL");
srcOffset = size;
break;
}

if (nalLength == 0) {
continue;
}

if (dstOffset > SIZE_MAX - 4 ||
dstOffset + 4 > SIZE_MAX - nalLength ||
dstOffset + 4 + nalLength > mBuffer->size()) {
ALOGE("b/27208621 : %zu %zu", dstOffset, mBuffer->size());
android_errorWriteLog(0x534e4554, "27208621");
mBuffer->release();
mBuffer = NULL;
return AMEDIA_ERROR_MALFORMED;
}

// Prefix each AVC/HEVC NAL unit with the 4-byte start code 00 00 00 01.
dstData[dstOffset++] = 0;
dstData[dstOffset++] = 0;
dstData[dstOffset++] = 0;
dstData[dstOffset++] = 1;
memcpy(&dstData[dstOffset], &mSrcBuffer[srcOffset], nalLength);
srcOffset += nalLength;
dstOffset += nalLength;
}
CHECK_EQ(srcOffset, size);
CHECK(mBuffer != NULL);
mBuffer->set_range(0, dstOffset);

// Set the PTS and duration of the frame just read.
AMediaFormat *meta = mBuffer->meta_data();
AMediaFormat_clear(meta);
AMediaFormat_setInt64(
meta, AMEDIAFORMAT_KEY_TIME_US, ((long double)cts * 1000000) / mTimescale);
AMediaFormat_setInt64(
meta, AMEDIAFORMAT_KEY_DURATION, ((long double)stts * 1000000) / mTimescale);

if (targetSampleTimeUs >= 0) {
AMediaFormat_setInt64(
meta, AMEDIAFORMAT_KEY_TARGET_TIME, targetSampleTimeUs);
}

if (mIsAVC) {
uint32_t layerId = FindAVCLayerId(
(const uint8_t *)mBuffer->data(), mBuffer->range_length());
AMediaFormat_setInt32(meta, AMEDIAFORMAT_KEY_TEMPORAL_LAYER_ID, layerId);
} else if (mIsHEVC) {
int32_t layerId = parseHEVCLayerId(
(const uint8_t *)mBuffer->data(), mBuffer->range_length());
if (layerId >= 0) {
AMediaFormat_setInt32(meta, AMEDIAFORMAT_KEY_TEMPORAL_LAYER_ID, layerId);
}
}

if (isSyncSample) {
AMediaFormat_setInt32(meta, AMEDIAFORMAT_KEY_IS_SYNC_FRAME, 1);
}

// Advance to the next sample.
++mCurrentSampleIndex;
// Hand the buffer to the caller.
*out = mBuffer;
mBuffer = NULL;

return AMEDIA_OK;
}
}
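
The NAL loop in read() rewrites an MP4 sample, in which every NAL unit is preceded by a big-endian length field, into Annex B form, in which every NAL unit is preceded by the start code 00 00 00 01. Here is a self-contained sketch of that conversion using std::vector instead of MediaBuffer. `lengthPrefixedToAnnexB` is a name invented for this example. One deliberate difference: the listing logs a warning and stops at a malformed length, keeping what it has converted so far, whereas this sketch reports failure.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert one sample from length-prefixed NAL units to Annex B.
// `nalLengthSize` is the size in bytes of each length field (1 to 4),
// which for AVC comes from lengthSizeMinusOne + 1 in avcC.
// Returns false if a length field or NAL payload runs past the end of `src`.
bool lengthPrefixedToAnnexB(const std::vector<uint8_t> &src, size_t nalLengthSize,
                            std::vector<uint8_t> *dst) {
    static const uint8_t kStartCode[4] = {0, 0, 0, 1};
    if (nalLengthSize < 1 || nalLengthSize > 4) return false;
    dst->clear();
    size_t pos = 0;
    while (pos < src.size()) {
        // The length field itself must fit in the remaining bytes.
        if (src.size() - pos < nalLengthSize) return false;
        size_t nalLength = 0;
        for (size_t i = 0; i < nalLengthSize; ++i) {
            nalLength = (nalLength << 8) | src[pos + i];
        }
        pos += nalLengthSize;
        // The NAL payload must fit as well.
        if (src.size() - pos < nalLength) return false;
        if (nalLength == 0) continue;  // skip empty NAL units, as the listing does
        dst->insert(dst->end(), kStartCode, kStartCode + 4);
        dst->insert(dst->end(), src.data() + pos, src.data() + pos + nalLength);
        pos += nalLength;
    }
    return true;
}
```

Because a 4-byte start code replaces a length field of at most 4 bytes, the output can be larger than the input when nalLengthSize is smaller than 4, which is why the listing checks dstOffset against mBuffer->size() before every copy.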
