The scan currently uses a Boyer-Moore-style search and performs well,
but its implementation can still be optimized.
The original scan code walks a byte array using index-based access.
In _scan_for_start_code(), the index increases from start to end, and the
base address of the byte array is used to compute the return value.
In this case, index-based access can be replaced by pointer access, which
improves performance by removing the index-related operations.
Performance improves by approximately 8% on ARM-based embedded devices.
Although the change seems trivial, it can affect overall performance because
_scan_for_start_code() is called very often when H.264/H.265 video is
played.
In addition, the technique applies to all architectures and is good for
readability and maintainability.
https://bugzilla.gnome.org/show_bug.cgi?id=731442
static inline gint
_scan_for_start_code (const guint8 * data, guint offset, guint size)
{
- guint i = 0;
-
- while (i <= (size - 4)) {
- if (data[i + 2] > 1) {
- i += 3;
- } else if (data[i + 1]) {
- i += 2;
- } else if (data[i] || data[i + 2] != 1) {
- i++;
+ guint8 *pdata = (guint8 *) data;
+ guint8 *pend = (guint8 *) (data + size - 4);
+
+ while (pdata <= pend) {
+ if (pdata[2] > 1) {
+ pdata += 3;
+ } else if (pdata[1]) {
+ pdata += 2;
+ } else if (*pdata || pdata[2] != 1) {
+ pdata++;
} else {
- break;
+ return (pdata - data + offset);
}
}
- if (i <= (size - 4))
- return i + offset;
-
/* nothing found */
return -1;
}
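
To check that the pointer-based loop returns the same offsets as the
index-based one, the two variants can be compared side by side. The following
is a minimal standalone sketch, not the code from the patch: it uses plain C
types (uint8_t, unsigned) instead of the GLib typedefs, the names
scan_indexed(), scan_pointer() and check_scan_variants() are hypothetical, and
the size < 4 guard is an added assumption that the patch itself does not
contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index-based variant, modelled on the lines removed by the patch. */
static int
scan_indexed (const uint8_t *data, unsigned offset, unsigned size)
{
  unsigned i = 0;

  if (size < 4)                 /* assumption: avoid unsigned wrap of size - 4 */
    return -1;

  while (i <= size - 4) {
    if (data[i + 2] > 1)
      i += 3;
    else if (data[i + 1])
      i += 2;
    else if (data[i] || data[i + 2] != 1)
      i++;
    else
      return (int) (i + offset);        /* found 00 00 01 at index i */
  }
  return -1;                    /* nothing found */
}

/* Pointer-based variant, modelled on the lines added by the patch. */
static int
scan_pointer (const uint8_t *data, unsigned offset, unsigned size)
{
  const uint8_t *p = data;
  const uint8_t *end;

  if (size < 4)                 /* assumption: same guard as above */
    return -1;
  end = data + size - 4;        /* last position the loop may inspect */

  while (p <= end) {
    if (p[2] > 1)
      p += 3;
    else if (p[1])
      p += 2;
    else if (*p || p[2] != 1)
      p++;
    else
      return (int) (p - data + offset); /* found 00 00 01 at p */
  }
  return -1;                    /* nothing found */
}

/* Compare both variants on a buffer with and without a start code. */
static void
check_scan_variants (void)
{
  static const uint8_t with_code[] = { 0xff, 0x00, 0x00, 0x01, 0x65, 0x88 };
  static const uint8_t without_code[] = { 0x00, 0x00, 0x02, 0x00, 0x01, 0x00 };
  unsigned size;

  for (size = 0; size <= sizeof (with_code); size++) {
    assert (scan_indexed (with_code, 0, size) ==
        scan_pointer (with_code, 0, size));
    assert (scan_indexed (without_code, 0, size) ==
        scan_pointer (without_code, 0, size));
  }
}
```

Calling check_scan_variants() from a test program exercises both variants on
every prefix length of the two sample buffers. Measuring the speed difference
would need a separate benchmark on the target device.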