Best Unicode strategy [message #34325]
Sat, 12 November 2011 09:34
conrad
Messages: 12 Registered: November 2011 Location: Norway
Case:
Database access via a managed C++ wrapper package to .NET assemblies. In the .NET universe, strings are UTF-16.
As I understand it, U++ is by default "Unicode aware" by virtue of its widgets understanding UTF-8.
A simple test revealed that I can feed a mixed English and Chinese char* to PromptOK like:
PromptOK( "Chinese (汉语/漢語, Pinyin: Hànyǔ; 华语/華語, Huáyǔ; or 中文, Zhōngwén)" );
Since most U++ text widgets take in String (not WString), I need to make a decision as to what to transport.
As I see it there are 2 options:
1. Use String and pay the cost of the many string manipulations this requires, since String itself is not really UTF-8 aware; in other words, many temporary conversions to WString and back.
2. Use WString and convert to a UTF-8 String (somehow, but not with ToString(), since that uses a character set) before passing it on to U++ widgets and drawing routines.
I'd like to hear what strategy other users of U++ have used.
I wonder if I have missed out on something or not dug deep enough.
Conrad
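To make the "not really UTF-8 aware" point in option 1 concrete, here is a minimal sketch in standard C++. It assumes std::string as a stand-in for a byte-oriented string such as U++ String; count_code_points and demo_counts are hypothetical helper names, not U++ API. It shows that the byte length and the character count of the same text differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count code points in UTF-8 text by skipping continuation bytes (10xxxxxx).
// Assumes the input is well-formed UTF-8; no validation is done.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char b : utf8)
        if ((b & 0xC0) != 0x80)
            ++n;
    return n;
}

// Usage sketch: the byte length and the character count are not the same.
void demo_counts() {
    // "Hànyǔ" spelled out as UTF-8 bytes so the source file encoding does not matter.
    const std::string s = "H\xC3\xA0ny\xC7\x94";
    assert(s.size() == 7);              // bytes
    assert(count_code_points(s) == 5);  // characters (code points)
}
```

Any operation that indexes or cuts by character position has to account for this difference, which is where the temporary conversions in option 1 come from.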
Re: Best Unicode strategy [message #34326 is a reply to message #34325]
Sat, 12 November 2011 12:09
mirek
Messages: 13975 Registered: November 2005
My default strategy is to use UTF-8 (or default charset, for legacy) String everywhere and only convert to WString when necessary. That means either when manipulating individual characters (in that situation, I convert String->WString, do the operation, then WString->String) or when dealing with third-party code (your case).
String itself is much more efficient than WString in many situations, such as storage, comparison, copying and maps. When dealing with a database library, you are likely to spend much more time in the database than in converting String->WString.
BTW, ToString (and ToWString) is perfectly adequate too (it converts from/to the default charset, which, by default, is UTF-8).
Mirek
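The "convert, do the operation, convert back" pattern described above can be sketched in standard C++. This is only an illustration under stated assumptions: std::string and std::u32string stand in for U++ String and WString, and utf8_to_utf32, utf32_to_utf8 and left_chars are hypothetical helpers, not U++ functions. In U++ itself the conversions would go through the library's own routines, such as the ToString/ToWString pair mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-8 into code points. Assumes well-formed input (no validation).
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (b < 0x80)      { cp = b;        len = 1; }
        else if (b < 0xE0) { cp = b & 0x1F; len = 2; }
        else if (b < 0xF0) { cp = b & 0x0F; len = 3; }
        else               { cp = b & 0x07; len = 4; }
        assert(i + len <= in.size());  // truncated sequence would violate the assumption
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Encode code points back into UTF-8.
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Character-level operation: keep the first n characters (code points).
// The text stays UTF-8 outside this function; the wide form is temporary.
std::string left_chars(const std::string& utf8, std::size_t n) {
    std::u32string wide = utf8_to_utf32(utf8);  // narrow -> wide
    if (n < wide.size())
        wide.resize(n);                         // operate per character
    return utf32_to_utf8(wide);                 // wide -> narrow
}
```

Everything outside left_chars keeps working with the compact byte string, so the conversion cost is paid only where per-character access is actually needed.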