Accept non-characters when validating Unicode
author Simon McVittie <simon.mcvittie@collabora.co.uk>
Mon, 22 Apr 2013 14:36:32 +0000 (15:36 +0100)
committer Simon McVittie <simon.mcvittie@collabora.co.uk>
Mon, 22 Apr 2013 14:36:32 +0000 (15:36 +0100)
commit 6b2add5e70252c513f506f84cc386f47953df48d
tree cb5390549936a81565de69ff5ce5039511a99db8
parent 540e5692e07d48fb41a4e977e0c9078fa19bd677

Unicode Corrigendum #9 clarifies that the non-characters U+nFFFE
(for n in the range 0 to 0x10), U+nFFFF (for n in the same range),
and U+FDD0..U+FDEF are valid for interchange, and their presence
does not make a string ill-formed.
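
The ranges above can be sketched in C as follows. This is only an
illustration, not the code changed in dbus/dbus-string.c: the function
names is_noncharacter() and is_valid_code_point() are hypothetical, and
the check assumes the only code points still rejected are surrogates and
values above U+10FFFF.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true for the 66 Unicode non-characters, i.e.
 * U+FDD0..U+FDEF, plus U+nFFFE and U+nFFFF for n in 0..0x10. */
static bool
is_noncharacter (uint32_t c)
{
  if (c >= 0xFDD0 && c <= 0xFDEF)
    return true;

  /* The low 16 bits are FFFE or FFFF, in any of the 17 planes. */
  return c <= 0x10FFFF && (c & 0xFFFE) == 0xFFFE;
}

/* Hypothetical validity check in the spirit of this commit:
 * non-characters are accepted, so only out-of-range values and
 * surrogates are rejected. */
static bool
is_valid_code_point (uint32_t c)
{
  if (c > 0x10FFFF)
    return false;               /* beyond the Unicode code space */

  if (c >= 0xD800 && c <= 0xDFFF)
    return false;               /* UTF-16 surrogates */

  return true;                  /* includes every non-character */
}
```

With this sketch, U+FFFE and U+FDD0 are classified as non-characters
yet still pass is_valid_code_point(), which is the behaviour the
corrigendum describes.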

GLib 2.36 made the corresponding change in its definition of UTF-8
as used by g_utf8_validate() and similar functions.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=63072
Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk>
dbus/dbus-string.c
test/syntax.c