String.Mid() and Unicode [message #28665] Fri, 10 September 2010 10:36
jeremy_c
I know a given string is at position 10 (example) and is 3 characters long. I don't know if that's 3 Unicode characters or 3 ASCII characters.

It seems that String.Mid(pos, len) is returning 3 bytes of data, not 3 characters, is that correct? If so, how can I retrieve 3 characters from a string at position XYZ?

Jeremy
Re: String.Mid() and Unicode [message #28667 is a reply to message #28665] Fri, 10 September 2010 11:23
dolik.rce

jeremy_c wrote on Fri, 10 September 2010 10:36

I know a given string is at position 10 (example) and is 3 characters long. I don't know if that's 3 Unicode characters or 3 ASCII characters.

It seems that String.Mid(pos, len) is returning 3 bytes of data, not 3 characters, is that correct? If so, how can I retrieve 3 characters from a string at position XYZ?

Jeremy

Hi Jeremy,
I am far from an expert on the topic, but I believe you need WString for this. String works with 8-bit characters only, while WString stores each character in 16 bits. A little example:
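	String s = "aβ¢d€f";
	WString w(s);
	// Print every 8-bit element of the String.
	for(int i = 0; i < s.GetCount(); i++) {
		Cout() << IntStr(s[i]) << " ";
	}
	Cout() << "\n";
	// Print every 16-bit element of the WString.
	for(int i = 0; i < w.GetCount(); i++) {
		Cout() << IntStr(w[i]) << " ";
	}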

(Well, the example was more to assure myself that I'm not talking nonsense :) )
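
To make the byte/character difference concrete, here is a minimal sketch in standard C++ (not the U++ API). It assumes the string holds valid UTF-8; `utf8_length` is a hypothetical helper name made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 string.
// Every code point has exactly one byte that is NOT a continuation
// byte (continuation bytes look like 10xxxxxx), so we count those.
// Assumes the input is valid UTF-8.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the example string above, encoded as UTF-8, `size()` reports 10 bytes (β and ¢ take two bytes each, € takes three), while `utf8_length` reports 6.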

Honza
Re: String.Mid() and Unicode [message #28671 is a reply to message #28667] Fri, 10 September 2010 13:55
cbpporter
All subset and indexing functions work on code units (the smallest binary unit used to represent part of an abstract character when stored in memory: a byte for String, a word for WString) and not on code points (the abstract characters themselves). You can usually get the right behavior if you use WString, but it is more of a hack/convenience, since a character outside the 16-bit range still needs more than one word.
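
As an illustration of indexing by character rather than by storage unit, the sketch below extracts a substring by code point position from a UTF-8 string. It is plain standard C++, not part of U++; `utf8_offset` and `utf8_mid` are hypothetical names, and the code assumes valid UTF-8 input. It counts code points, so combining sequences that a user perceives as one character are still counted separately.

```cpp
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
static bool is_continuation(char c) {
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Hypothetical helper: byte offset of the code point with index
// cp_index, or s.size() if the string has fewer code points.
std::size_t utf8_offset(const std::string& s, std::size_t cp_index) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!is_continuation(s[i])) {
            if (cp_index == 0)
                return i;
            --cp_index;
        }
    }
    return s.size();
}

// Hypothetical helper: like Mid(pos, len), but pos and len are
// measured in code points instead of bytes.
std::string utf8_mid(const std::string& s, std::size_t pos, std::size_t len) {
    std::size_t begin = utf8_offset(s, pos);
    std::size_t end = begin;
    while (end < s.size() && len > 0) {
        ++end;  // step over the lead byte
        while (end < s.size() && is_continuation(s[end]))
            ++end;  // step over its continuation bytes
        --len;
    }
    return s.substr(begin, end - begin);
}
```

With the UTF-8 string "aβ¢d€f", `utf8_mid(s, 1, 3)` yields "β¢d" (five bytes), whereas the byte-based `s.substr(1, 3)` returns β followed by only the first byte of ¢.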