performance gain on SSE2 is approx. 20-25%. altivec
is not tested. performance for unsigned long arithmetic
is unchanged.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
long d0, d1, d2, d3;
const long * const data = buf;
+ /* use vector optimized zero check if possible */
+ if (can_use_buffer_find_nonzero_offset(buf, len)) {
+ return buffer_find_nonzero_offset(buf, len) == len;
+ }
+
assert(len % (4 * sizeof(long)) == 0);
len /= sizeof(long);