
Re: 16 bits wchar [message #11796 is a reply to message #8036]

Wed, 26 September 2007 01:56

sergei

As much as I'd like to see RTL in U++, I agree that Unicode should be fixed first, if possible. RTL is built on Unicode, so a solid base (Unicode string storage) is essential. Who knows, maybe tomorrow someone will need Linear B.

I was thinking of UTF-32 as a possible main storage format. I wrote a simple benchmark to compare the speeds of the three character sizes. Here are the results (source attached):

Size     Iterations   8-bit   16-bit   32-bit
64       10000000     2281    2125     2172
128      5000000      1625    1453     2391
256      2500000      1328    1515     1578
512      1250000      1375    1141     1141
1024     625000       1172    953      984
2048     312500       1094    875      906
4096     156250       1109    938      859
8192     78125        1110    890      922
16384    39062        1000    813      4047
32768    19531        1000    2250     3906
65536    9765         1656    2172     3812
131072   4882         1625    2125     3782
262144   2441         1593    2110     3781
524288   1220         1563    2109     3984
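
For readers without the attachment, here is a minimal sketch of how a read/write test like the one above could be structured. It is not the attached UniCode.cpp; the function name TimeReadWrite, its parameters, and the choice to report milliseconds are all assumptions made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fills a buffer of `size` cells of type Cell and reads
// it back, `iterations` times. Returns the elapsed time in milliseconds.
// `sink` receives a checksum so the compiler cannot drop the loops.
template <class Cell>
long long TimeReadWrite(std::size_t size, std::size_t iterations,
                        std::uint64_t& sink)
{
    std::vector<Cell> buf(size);
    std::uint64_t sum = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t it = 0; it < iterations; ++it) {
        for (std::size_t i = 0; i < size; ++i)   // write pass
            buf[i] = static_cast<Cell>(i + it);
        for (std::size_t i = 0; i < size; ++i)   // read pass
            sum += buf[i];
    }
    auto stop = std::chrono::steady_clock::now();
    sink = sum;
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
        .count();
}
```

Calling TimeReadWrite<std::uint8_t>, TimeReadWrite<std::uint16_t> and TimeReadWrite<std::uint32_t> with the same size and iteration count gives one row of such a table. Timings will depend heavily on compiler flags and cache sizes.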

IMHO, 32-bit values aren't much worse than 16-bit ones. For search/replace operations, non-32-bit values would have significant overhead for characters outside the main plane.

Converting UTF-32 to other formats shouldn't be a problem. But what I like most is that a character would be the same as a cell (unlike UTF-16, which might need 20 cells to store 19 characters).
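
To make the 20-cells-for-19-characters case concrete, here is a small sketch using standard C++ strings rather than U++ types. The helper names AppendUtf16, CountCodePoints and SampleText are made up for this example, and the counting assumes well-formed UTF-16.

```cpp
#include <cstddef>
#include <string>

// Append one code point to a UTF-16 string; code points above U+FFFF
// take two cells (a surrogate pair).
inline void AppendUtf16(std::u16string& out, char32_t cp)
{
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
}

// Count characters (code points) in well-formed UTF-16:
// every cell except a low surrogate starts a new character.
inline std::size_t CountCodePoints(const std::u16string& s)
{
    std::size_t n = 0;
    for (char16_t c : s)
        if (c < 0xDC00 || c > 0xDFFF)
            ++n;
    return n;
}

// 18 plain letters followed by one Linear B syllable (U+10000).
inline std::u16string SampleText()
{
    std::u16string s(18, u'a');
    AppendUtf16(s, 0x10000);
    return s;
}
```

Here SampleText().size() is 20 while CountCodePoints(SampleText()) is 19, so indexing by cell and indexing by character no longer agree. With a 32-bit cell the two counts are always equal.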

Edit: I didn't mention that I tested only basic read/write performance. UTF handling would add overhead to the 8-bit and 16-bit formats, but not to the 32-bit format. I also remembered the UTF8-EE issue. UTF-32 could solve it easily: IIRC only 21 bits are needed for full Unicode, so there's plenty of space to escape to (without taking over private-use space).
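
A rough sketch of the escape idea, assuming UTF8-EE means keeping invalid UTF-8 bytes by escaping them into otherwise unused values. The scheme below (bit 31 as a marker) and the names kEscapeFlag, EscapeByte, IsEscaped and UnescapeByte are hypothetical, not anything U++ defines.

```cpp
#include <cstdint>

// Highest Unicode code point; it fits in 21 bits.
constexpr std::uint32_t kMaxCodePoint = 0x10FFFF;
static_assert(kMaxCodePoint < (1u << 21), "Unicode fits in 21 bits");

// Hypothetical scheme: bit 31 marks a cell that holds a raw invalid byte
// instead of a code point. No real code point can have this bit set.
constexpr std::uint32_t kEscapeFlag = 0x80000000u;

constexpr std::uint32_t EscapeByte(std::uint8_t b)
{
    return kEscapeFlag | b;
}

constexpr bool IsEscaped(std::uint32_t cell)
{
    return (cell & kEscapeFlag) != 0;
}

constexpr std::uint8_t UnescapeByte(std::uint32_t cell)
{
    return static_cast<std::uint8_t>(cell & 0xFF);
}
```

Because escaped cells sit far above 0x10FFFF, private-use code points such as U+F0000 stay untouched and an invalid byte round-trips exactly.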

Attachment: UniCode.cpp
(Size: 1.31KB, Downloaded 542 times)

[Updated on: Wed, 26 September 2007 02:30]
