String.Mid() and Unicode [message #28665]
Fri, 10 September 2010 10:36
jeremy_c
I know a given string is at position 10 (example) and is 3 characters long. I don't know if that's 3 Unicode characters or 3 ASCII characters.
It seems that String.Mid(pos, len) is returning 3 bytes of data, not 3 characters, is that correct? If so, how can I retrieve 3 characters from a string at position XYZ?
Jeremy
Re: String.Mid() and Unicode [message #28667 is a reply to message #28665]
Fri, 10 September 2010 11:23
Hi Jeremy,
I am far from an expert on the topic, but I believe you need WString for this. String works with 8-bit chars only, so a non-ASCII character stored as UTF-8 takes up several of them, while WString stores each character in a 16-bit unit. A little example:
String s="aβ¢d€f";
WString w(s);
// one number per byte of s; the cast keeps values above 127 from printing as negative
for(int i=0;i<s.GetCount();i++){
    Cout()<<IntStr((byte)s[i])<<" ";
}
Cout()<<"\n";
// one number per 16-bit unit of w
for(int i=0;i<w.GetCount();i++){
    Cout()<<IntStr(w[i])<<" ";
}
(Well, the example was more to assure myself that I'm not talking nonsense.)
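The same byte-versus-character difference can be checked without U++. What follows is only a minimal sketch in standard C++, not U++ API: CountCodePoints is a hypothetical helper name, and it assumes the string holds valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of U++): counts Unicode code points in a
// UTF-8 string by counting every byte that is not a continuation byte.
// Continuation bytes have the bit pattern 10xxxxxx.
// Assumption: the input is valid UTF-8.
std::size_t CountCodePoints(const std::string& utf8)
{
    std::size_t n = 0;
    for (unsigned char c : utf8)
        if ((c & 0xC0) != 0x80)   // lead byte or plain ASCII
            ++n;
    return n;
}
```

For the string from the example, encoded as UTF-8, size() reports 10 bytes (1 + 2 + 2 + 1 + 3 + 1) while CountCodePoints reports 6.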
Honza
Re: String.Mid() and Unicode [message #28671 is a reply to message #28667]
Fri, 10 September 2010 13:55
cbpporter
All subset and indexing functions, Mid() included, work on code units (the smallest binary unit used to represent part of an abstract character when stored in memory: a byte for String, a 16-bit word for WString) and not on code points (the abstract characters themselves). You can usually get the right behavior if you use WString, but it is more of a hack/convenience, since a character outside the Basic Multilingual Plane does not fit into a single 16-bit unit either.
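To make the distinction concrete for the original question (3 characters at a known position), here is a rough sketch of a Mid() that counts code points instead of bytes. It is plain standard C++, not U++ API: MidCodePoints is a made-up name, the position is taken to be a code point index, and the input is assumed to be valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of U++): returns up to `len` code points of
// a UTF-8 string, starting at code point index `pos`.
// Assumption: the input is valid UTF-8.
std::string MidCodePoints(const std::string& utf8, std::size_t pos, std::size_t len)
{
    // True for the first byte of a code point (anything but 10xxxxxx).
    auto isLead = [](unsigned char c) { return (c & 0xC0) != 0x80; };

    std::size_t i = 0;   // current byte offset
    std::size_t cp = 0;  // code points skipped so far

    // Advance to the byte where code point number `pos` begins.
    while (i < utf8.size() && cp < pos) {
        ++i;
        while (i < utf8.size() && !isLead(utf8[i])) ++i;
        ++cp;
    }
    const std::size_t begin = i;

    // Step over `len` more code points, or stop at the end of the string.
    for (std::size_t taken = 0; i < utf8.size() && taken < len; ++taken) {
        ++i;
        while (i < utf8.size() && !isLead(utf8[i])) ++i;
    }
    return utf8.substr(begin, i - begin);
}
```

With "aβ¢d€f" as input, MidCodePoints(s, 1, 3) yields "β¢d", which is 5 bytes long. Note that this still counts code points, not what a user perceives as one character: a letter followed by a combining accent is two code points.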