WoWInterface - View Single Post

ravagernl · 05-25-13, 02:04 AM

Originally Posted by Rainrider

I also don't understand what exactly .[\128-\191]* is supposed to mean, but it makes it work for non-English characters (I haven't tested Chinese and Korean, but I'm not looking for that functionality).

Sorry for jumping in.

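I think the range 128-191 matches the continuation bytes of a multi-byte UTF-8 character (the first 128 values are basically the ASCII characters), so .[\128-\191]* matches one whole character: any byte followed by all the continuation bytes that belong to it.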

EDIT: Found this on http://lua-users.org/wiki/LuaUnicode:

Happily UTF-8 is designed so that it is relatively easy to count the number of unicode symbols in a string: simply count the number of octets that are in the ranges 0x00 to 0x7f (inclusive) or 0xC2 to 0xF4 (inclusive). (In decimal, 0-127 and 194-244.) These are the codes which can start a UTF-8 character code. Octets 0xC0, 0xC1 and 0xF5 to 0xFF (192, 193 and 245-255) cannot appear in a conforming UTF-8 sequence; octets in the range 0x80 to 0xBF (128-191) can only appear in the second and subsequent octets of a multi-octet encoding. Remember that you cannot use \0 in a Lua pattern.
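
To make that concrete, here is a rough sketch of how the pattern could be used in Lua 5.1. The helper names (utf8chars, utf8len, utf8truncate) are made up for illustration and are not part of any library, and the code assumes the input is valid UTF-8. The length helper counts every byte that is not a continuation byte, which is equivalent to the wiki's rule for valid input and avoids putting \0 in the pattern.

```lua
-- Hypothetical helpers for illustration; assumes valid UTF-8 input.

-- Split a string into characters with the pattern from the post:
-- "." takes the lead byte, "[\128-\191]*" takes its continuation bytes.
local function utf8chars(s)
  local chars = {}
  for ch in s:gmatch(".[\128-\191]*") do
    chars[#chars + 1] = ch
  end
  return chars
end

-- Count characters by counting every byte that is NOT a continuation byte.
local function utf8len(s)
  local _, count = s:gsub("[^\128-\191]", "")
  return count
end

-- Keep only the first n characters without cutting a multi-byte sequence.
local function utf8truncate(s, n)
  local chars = utf8chars(s)
  return table.concat(chars, "", 1, math.min(n, #chars))
end

local name = "\195\132rger"   -- "Ärger": \195\132 is the two-byte encoding of Ä
print(#name)                  -- 6 (bytes)
print(utf8len(name))          -- 5 (characters)
print(utf8truncate(name, 2))  -- the first two characters, "Är"
```

A plain name:sub(1, 1) would return only the first byte of Ä here, which is why cutting by bytes breaks non-English names.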