From: Matthias Clasen
Date: Mon, 9 Aug 2010 03:43:29 +0000 (-0400)
Subject: Refer to GUnicodeScript docs instead of listing scripts explicitly
X-Git-Tag: 2.25.14~36
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=4e42893369c8b8092de7feedb447ca538f8dccf2;p=platform%2Fupstream%2Fglib.git

Refer to GUnicodeScript docs instead of listing scripts explicitly
---

diff --git a/docs/reference/glib/regex-syntax.sgml b/docs/reference/glib/regex-syntax.sgml
index 901b447..5a86a74 100644
--- a/docs/reference/glib/regex-syntax.sgml
+++ b/docs/reference/glib/regex-syntax.sgml
@@ -188,6 +188,12 @@ as part of the pattern.
+Note that the C compiler interprets backslash in strings itself, therefore
+you need to duplicate all \ characters when you put a regular expression
+in a C string, like "\\d{3}".
+
+
 If you want to remove the special meaning from a sequence of characters,
 you can do so by putting them between \Q and \E. The \Q...\E sequence is
 recognized both inside and outside character
@@ -524,78 +530,12 @@ example, \p{Greek} or \P{Han}.
 Those that are not part of an identified script are lumped together as
-"Common". The current list of scripts is:
+"Common". The current list of scripts can be found in the documentation for
+the #GUnicodeScript enumeration. Script names for use with \p{} can be
+found by replacing all spaces with underscores, e.g. for Linear B use
+\p{Linear_B}.
-
-Arabic
-Armenian
-Balinese
-Bengali
-Bopomofo
-Braille
-Buginese
-Buhid
-Canadian_Aboriginal
-Cherokee
-Common
-Coptic
-Cuneiform
-Cypriot
-Cyrillic
-Deseret
-Devanagari
-Ethiopic
-Georgian
-Glagolitic
-Gothic
-Greek
-Gujarati
-Gurmukhi
-Han
-Hangul
-Hanunoo
-Hebrew
-Hiragana
-Inherited
-Kannada
-Katakana
-Kharoshthi
-Khmer
-Lao
-Latin
-Limbu
-Linear_B
-Malayalam
-Mongolian
-Myanmar
-New_Tai_Lue
-Nko
-Ogham
-Old_Italic
-Old_Persian
-Oriya
-Osmanya
-Phags_Pa
-Phoenician
-Runic
-Shavian
-Sinhala
-Syloti_Nagri
-Syriac
-Tagalog
-Tagbanwa
-Tai_Le
-Tamil
-Telugu
-Thaana
-Thai
-Tibetan
-Tifinagh
-Ugaritic
-Yi
-
-
 Each character has exactly one general category property, specified by a
 two-letter abbreviation. For compatibility with Perl, negation can be specified
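
The two documentation changes in this patch (doubling backslashes in C string literals, and deriving \p{} script names by replacing spaces with underscores) can be illustrated with a small sketch. It deliberately uses only the C standard library so it compiles without GLib. The names three_digits_pattern and script_property_pattern are hypothetical, chosen for this illustration, and are not part of GLib or of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* The C compiler consumes one level of backslashes, so the literal "\\d{3}"
 * is stored as the 5-character pattern \d{3}, which is what a regex engine
 * would receive. */
const char three_digits_pattern[] = "\\d{3}";

/* Hypothetical helper (not a GLib function): build a \p{...} item from a
 * human-readable script name by replacing every space with an underscore,
 * as the updated documentation describes.
 * Returns 0 on success, -1 if the result does not fit in out. */
int script_property_pattern(const char *script_name, char *out, size_t out_size)
{
    assert(script_name != NULL && out != NULL);

    /* "\\p{" in C source is the three characters \p{ in memory. */
    int written = snprintf(out, out_size, "\\p{%s}", script_name);
    if (written < 0 || (size_t)written >= out_size)
        return -1;

    for (char *c = out; *c != '\0'; c++) {
        if (*c == ' ')
            *c = '_';
    }
    return 0;
}
```

Calling script_property_pattern("Linear B", buf, sizeof buf) leaves the text \p{Linear_B} in buf. In a program that links against GLib, either string could presumably be passed as the pattern argument of g_regex_new() or g_regex_match_simple(); that step is left out here to keep the sketch dependency-free.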