05-27-13, 12:36 AM   #10
Phanx
Cat.
 
Join Date: Mar 2006
Posts: 5,617
You're getting that ? character because you're breaking the Russian character \208\176 (а) in half and keeping only \208, which is not a valid UTF-8 sequence on its own. The string functions in WoW are not Unicode-aware; they only look at bytes. If you want to support languages with multi-byte characters, you can either use the UTF8 library, which provides UTF-8-aware versions of some string functions, or split the string up and count bytes yourself.
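To see the problem in isolation, here's a minimal sketch in plain Lua (nothing WoW-specific, and the variable names are just for illustration). It shows that the length operator and string.sub both work on bytes, so taking "the first character" of а really takes half of it:

```lua
-- Cyrillic "а" (U+0430) is two bytes in UTF-8: 208 and 176.
local a = "\208\176"

-- The # operator (like string.len) counts bytes, not characters.
local byteCount = #a

-- string.sub indexes by byte too, so this keeps only the lead byte 208,
-- which is not a complete UTF-8 sequence on its own.
local firstByte = string.sub(a, 1, 1)

-- Taking both bytes keeps the character intact.
local wholeChar = string.sub(a, 1, 2)

print(byteCount, string.byte(firstByte), wholeChar == a)
```

Any code that cuts a string at a fixed byte offset has this problem as soon as the text contains anything outside ASCII.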

Either way, it's going to take more than a simple gsub. You'd probably want to split it into a "Russian clients use this code path" branch and an "everyone else uses this code path" branch, since Korean and Chinese (the other WoW locales with multi-byte characters) generally don't use spaces between words and cannot be meaningfully abbreviated anyway.

Code:
local old, new = "Echo of a Pandaren Monk"
if GetLocale() == "ruRU" then
    -- complicated version
    new = ""
    for word in string.gmatch(old, "(%S+)%s") do
        new = new .. string.utf8sub(word, 1, 1) .. ". " -- uses UTF8 lib function; ". " matches the simple version
    end
    new = new .. strmatch(old, "%S+$")
else
    -- simple version
    new = gsub(old, "(%S[\128-\191]*)%S+%s", "%1. ") -- \128-\191 are UTF-8 continuation bytes
end
-- do something with new here
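
If you'd rather not pull in the UTF8 library, the "count bytes yourself" route could look roughly like the sketch below. The helper names firstChar and abbreviate are made up for this example, and it assumes the input is valid UTF-8. It relies on continuation bytes always falling in the range 128-191, so one non-continuation byte plus any continuation bytes after it makes up one character. It only uses the standard string.* functions, so it also runs outside of WoW:

```lua
-- Hypothetical helper: returns the first UTF-8 character of s, i.e. one
-- non-continuation byte plus any continuation bytes (128-191) after it.
-- Returns nil if s is empty or starts with a stray continuation byte.
local function firstChar(s)
    return string.match(s, "^[^\128-\191][\128-\191]*")
end

-- Hypothetical helper: shortens every word except the last one to its
-- first character followed by ". ".
local function abbreviate(name)
    local out = {}
    for word in string.gmatch(name, "(%S+)%s") do
        -- Fall back to the whole word if no valid first character is found.
        out[#out + 1] = (firstChar(word) or word) .. ". "
    end
    out[#out + 1] = string.match(name, "%S+$") or ""
    return table.concat(out)
end

print(abbreviate("Echo of a Pandaren Monk"))
```

Unlike the gsub version, this also abbreviates one-letter words like "a", so you may want to special-case those.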
__________________
Retired author of too many addons.
Message me if you're interested in taking over one of my addons.
Don’t message me about addon bugs or programming questions.