Re: 16 bits wchar [message #17253 is a reply to message #17248] |
Mon, 04 August 2008 22:47   |
cbpporter
luzr wrote on Mon, 04 August 2008 18:14 |
Well, this rather sound like we should kick out WString altogether and keep just UTF-8:)
|
I think we should keep both, and eventually even add LString for the sake of completeness. It could live in a separate package if you're worried about exe size.
Quote: |
Maybe we just need smarter encoding than UNICODE? 
Makes me think - realistically, there is a lot of "reserved" positions in BMP. Could we just use them for this?
|
Well, there is nothing better than Unicode AFAIK. It may sometimes seem like there is too much fuss around it, but if you were in my place and had to deal with legacy encodings such as EUC, EUC-JP, ShiftJIS, JIS and a couple of ISOs, many of which don't guarantee round-trip conversion, you would see that Unicode is a true blessing. It's great that I have iconv to ease the burden a little.
And BTW, Unicode forbids the use of reserved or unassigned code points for any purpose.
Anyway, I ran my benchmarks on my Windows machine, where console output still works. I did the tests with some experimental methods which are not complete, so the results could be a little inaccurate, but they are still interesting enough to post.
I used 3 methods to convert two UTF-8 data sets to UTF-16. The first method is the standard U++ FromUtf8. The second is my FromUtf8SR, which takes 4-byte characters into account, and the third is the highly experimental FromUtf8SR2. The first data set consists of 200 Latin characters, representing 200 code points (the letter c 200 times). The second one consists of 100 kanji, 3 bytes each, totaling 300 bytes. On second thought, I should have used data sets of the same size. All conversions are run 1000000 times.
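For readers without the U++ sources at hand, here is a minimal sketch of what "taking 4-byte characters into account" means when converting UTF-8 to UTF-16: code points above U+FFFF are written as a surrogate pair. This is not the U++ FromUtf8 and not the FromUtf8SR from this post; `Utf8ToUtf16Sketch`, `MakeLatinSet` and `MakeKanjiSet` are hypothetical names, and the kanji used to build the second data set (U+6F22) is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, NOT U++ FromUtf8 and NOT the FromUtf8SR from the post.
// Decodes UTF-8 and emits UTF-16; code points above U+FFFF become a surrogate pair.
// Malformed bytes are replaced with U+FFFD. Overlong forms and encoded surrogates
// are not rejected, to keep the sketch short.
std::u16string Utf8ToUtf16Sketch(const std::string& in)
{
    std::u16string out;
    out.reserve(in.size()); // UTF-16 never needs more units than there are UTF-8 bytes
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        std::uint32_t cp;
        std::size_t extra; // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; } // invalid lead byte

        if (extra > in.size() - i - 1) { out.push_back(0xFFFD); break; } // truncated input
        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; } // skip the bad lead byte only
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else if (cp <= 0x10FFFF) {
            cp -= 0x10000; // split the remaining 20 bits into two 10-bit halves
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(0xFFFD); // beyond the Unicode range
        }
    }
    return out;
}

// Data sets shaped like the ones described in the post.
std::string MakeLatinSet() { return std::string(200, 'c'); }

std::string MakeKanjiSet()
{
    std::string s;
    for (int i = 0; i < 100; ++i)
        s += "\xE6\xBC\xA2"; // U+6F22 encoded as 3 UTF-8 bytes (arbitrary kanji)
    return s;
}
```

With these helpers, the Latin set converts to 200 UTF-16 units and the kanji set to 100 units, while a 4-byte sequence yields two units.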
In Debug mode:
Latin: FromUtf8 3125, FromUtf8SR 3203, FromUtf8SR2 2078
Kanji: FromUtf8 3891, FromUtf8SR 3906, FromUtf8SR2 2406
Nothing too impressive here. The standard method is a little faster than mine, and the experimental one is considerably faster than both.
In Release mode:
Latin: FromUtf8 484, FromUtf8SR 485, FromUtf8SR2 390
Kanji: FromUtf8 4718, FromUtf8SR 3157, FromUtf8SR2 812
Here, for kanji, my method really is a lot faster. But in Release mode, FromUtf8 on all-kanji input is slower than in Debug mode (4718 vs. 3891). Can someone verify this? Maybe I messed something up.
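For anyone who wants to check the Release numbers independently, a timing harness could look roughly like the sketch below. It is not the code used for the results above; `TimeRunsMs` is a hypothetical helper, and it assumes the callable returns some size_t derived from the conversion result (for example, the length of the converted string).

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: calls `fn` `runs` times and returns the elapsed time in ms.
// Each result is added to `sink` so the calls are not trivially dead code in an
// optimized build. An optimizer may still hoist a call it can prove to be pure,
// so the timings should be sanity-checked against the generated code.
template <class Fn>
long long TimeRunsMs(Fn fn, int runs, std::size_t& sink)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        sink += fn();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

The idea would be to pass a lambda that wraps the conversion under test and returns the length of its result, run it 1000000 times per data set, and print `sink` at the end so the work stays observable.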
As I said, my experimental method really is experimental and not complete yet (I hope it is also thread safe). I hope I'm not on a wild goose chase (is that the expression?) and haven't missed something that would render my experimental method useless or wrong, because the numbers are great!