I will not otherwise mention that stealing .c code from the core is a
dangerous practice.
This is actually a bug in the module, which had been masked until now.
The first two parameters to utf8_to_uvchr_buf() are both U8*. But both
's' and PL_bufend are char*. The 's' has a cast to U8* in the failing
line, but not PL_bufend.
Interestingly, the line in the official toke.c (introduced in 4b88fb76)
has always been right, so the stealer didn't copy it correctly.
What de69f3af3 did was turn this former function call into a macro that
manipulates the parameters and calls another function, thereby removing
a layer of function call overhead. The manipulation involves
subtracting 's' from PL_bufend, and this fails to compile due to the
missing cast on the latter parameter.
The problem goes away if the macro casts both parameters to U8*, and
that is what this commit does.
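The failure mode and the fix can be reproduced outside of perl with a
minimal sketch. Everything named demo_* or DEMO_* below is hypothetical
and exists only for illustration; demo_len() merely stands in for
utf8n_to_uvchr() and returns the length it was handed, and U8 is
assumed to be unsigned char. The uncast macro is left commented out
because subtracting a U8* from a char* is rejected by the compiler.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char U8;   /* assumed stand-in for perl's U8 */

/* Hypothetical stand-in for utf8n_to_uvchr(): reports the length given. */
static size_t demo_len(const U8 *s, size_t curlen)
{
    (void)s;
    return curlen;
}

/* Uncast form, as the macro looked before this commit.  With 'e' a char*
 * and 's' a U8*, (e) - (s) subtracts pointers to incompatible types,
 * which does not compile:
 *
 * #define DEMO_BUF_UNCAST(s, e)  demo_len((s), (e) - (s))
 */

/* Fixed form: both operands are cast to U8* before the subtraction. */
#define DEMO_BUF(s, e)  demo_len((s), (size_t)((U8*)(e) - (U8*)(s)))

/* Mirrors the module's call site: 's' is cast by the caller, the end
 * pointer is passed as a plain char*. */
static size_t demo_call(void)
{
    char buf[] = "abc";
    char *s = buf;
    char *bufend = buf + 3;

    assert(bufend >= s);
    return DEMO_BUF((U8*)s, bufend);
}
```

With the casts inside the macro, a caller that forgot one of its own
casts still compiles, which is the situation the module was in.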
 #define uvchr_to_utf8_flags(d,uv,flags) \
     uvoffuni_to_utf8_flags(d,NATIVE_TO_UNI(uv),flags)
 #define utf8_to_uvchr_buf(s, e, lenp) \
-    utf8n_to_uvchr(s, (e) - (s), lenp, \
+    utf8n_to_uvchr(s, (U8*)(e) - (U8*)(s), lenp, \
         ckWARN_d(WARN_UTF8) ? 0 : UTF8_ALLOW_ANY)
 #define to_uni_fold(c, p, lenp) _to_uni_fold_flags(c, p, lenp, FOLD_FLAGS_FULL)