matroskadec: verify seekhead IDs

Some files have SeekHead entries with broken IDs that do not match the
ID of the element they point to. These files are written by
"IDMmkvlib0.1" (as identified by the MuxingApp and WritingApp elements),
and the SeekHead IDs are actually endian-swapped.

This confuses the SeekHead logic of the demuxer, which uses the SeekHead
ID to identify and remember elements it has already read, so some
elements are read twice. With the file at hand, the stream list was
duplicated because the Tracks element was read twice.

Fix this by rejecting invalid EBML IDs in SeekHead entries. (This fix is
relatively specific to the broken file at hand, and doesn't protect
against some other cases of broken SeekHead, such as valid but
mismatching target element IDs.)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
wm4 2015-06-12 13:11:41 +02:00 committed by Michael Niedermayer
parent cfe8a89b00
commit 7e240f9581

@@ -995,6 +995,15 @@ static int ebml_parse_nest(MatroskaDemuxContext *matroska, EbmlSyntax *syntax,
     return res;
 }
 
+static int is_ebml_id_valid(uint32_t id)
+{
+    // Due to endian nonsense in Matroska, the highest byte with any bits set
+    // will contain the leading length bit. This bit in turn identifies the
+    // total byte length of the element by its position within the byte.
+    unsigned int bits = av_log2(id);
+    return id && (bits + 7) / 8 == (8 - bits % 8);
+}
+
 /*
  * Allocate and return the entry for the level1 element with the given ID. If
  * an entry already exists, return the existing entry.
@@ -1005,6 +1014,9 @@ static MatroskaLevel1Element *matroska_find_level1_elem(MatroskaDemuxContext *ma
     int i;
     MatroskaLevel1Element *elem;
 
+    if (!is_ebml_id_valid(id))
+        return NULL;
+
     // Some files link to all clusters; useless.
     if (id == MATROSKA_ID_CLUSTER)
         return NULL;