
Re: Choosing the best way to go full UNICODE [message #48167 is a reply to message #48166]

Tue, 30 May 2017 11:23


cbpporter
Messages: 1401
Registered: September 2007

Ultimate Contributor

I meant code point.

Get is the code point.

And what I claimed is that making code point = code unit only solves the problem halfway, and it leaves you with the least favored and least advantageous encoding: DString/Utf32.

Indexing by Utf code point can be easily solved with a string walker that works for any encoding, from Utf8 to Utf32, and there is no need to tie yourself down with DString supremacy.
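To make the walker idea concrete, here is a rough sketch of what I have in mind. The names (Utf8Walker, Utf16Walker, CountCodePoints, CodePointAt) are hypothetical and not existing U++ classes, and the walkers assume well-formed input, so validation is deliberately left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical walker over UTF-8 code units. Assumes well-formed input.
class Utf8Walker {
public:
    explicit Utf8Walker(const std::string& s) : s_(s), pos_(0) {}
    bool AtEnd() const { return pos_ >= s_.size(); }
    // Returns the next code point and advances past its 1-4 code units.
    char32_t Next() {
        unsigned char b = static_cast<unsigned char>(s_[pos_++]);
        if (b < 0x80) return b;
        int extra = b >= 0xF0 ? 3 : b >= 0xE0 ? 2 : 1;
        char32_t cp = b & (0x3F >> extra);  // payload bits of the lead byte
        while (extra-- > 0 && pos_ < s_.size())
            cp = (cp << 6) | (static_cast<unsigned char>(s_[pos_++]) & 0x3F);
        return cp;
    }
private:
    const std::string& s_;  // the string must outlive the walker
    std::size_t pos_;
};

// Hypothetical walker over UTF-16 code units. Assumes well-formed input.
class Utf16Walker {
public:
    explicit Utf16Walker(const std::u16string& s) : s_(s), pos_(0) {}
    bool AtEnd() const { return pos_ >= s_.size(); }
    // Returns the next code point, combining a surrogate pair if present.
    char32_t Next() {
        char32_t u = s_[pos_++];
        if (u >= 0xD800 && u <= 0xDBFF && pos_ < s_.size()) {
            char32_t lo = s_[pos_++];
            return 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        }
        return u;
    }
private:
    const std::u16string& s_;  // the string must outlive the walker
    std::size_t pos_;
};

// Encoding-agnostic code: works with any walker offering AtEnd() and Next().
template <class Walker>
std::size_t CountCodePoints(Walker w) {
    std::size_t n = 0;
    while (!w.AtEnd()) { w.Next(); ++n; }
    return n;
}

// Returns the code point at the given index, or U+FFFD if out of range.
template <class Walker>
char32_t CodePointAt(Walker w, std::size_t index) {
    while (!w.AtEnd()) {
        char32_t cp = w.Next();
        if (index-- == 0) return cp;
    }
    return 0xFFFD;
}
```

The point is that CountCodePoints and CodePointAt never see code units, so the same algorithm runs over String or WString data without converting anything to 32 bit first. Indexing is linear in this sketch; that is the trade-off against DString.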

Because once you reach "full" Unicode support with DString, you know that most of the non-opaque string code in U++ will use the DString implementation and everything will be biased towards 32 bit.

I prefer 1-3 string classes with an 8 bit bias and the same number of "String walker" classes. Once this is implemented, I believe that in practice the amount of switching from one encoding to another is minimized.

In my library I only have String, with plans to add WString someday, and the complexity of Utf8 has never made me wish for DString. Conversion to Utf16 only happens in WinAPI context anyway. Plus, on the web, Utf8 is the standard. You will probably find that across the globe, on average, Utf8 is still the most compact form of storage. In CJK context, Utf16 makes a lot of sense, even with occasional surrogate pairs. Historic/academic CJK is where Utf32 shines, but even in such contexts, Utf is not the most popular choice.
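The storage argument comes down to simple arithmetic on code point ranges. A small sketch, with hypothetical helper names, assuming the argument is a valid Unicode scalar value:

```cpp
#include <cstddef>

// Bytes needed to store one code point in each encoding.
// Assumes cp is a valid Unicode scalar value.
constexpr std::size_t Utf8Bytes(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

constexpr std::size_t Utf16Bytes(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;  // BMP: one unit, otherwise a surrogate pair
}

constexpr std::size_t Utf32Bytes(char32_t) {
    return 4;  // always one 32 bit unit
}
```

ASCII costs 1/2/4 bytes in Utf8/Utf16/Utf32, a BMP CJK ideograph such as U+4E2D costs 3/2/4, and anything beyond the BMP costs 4/4/4. So Utf32 never wins on size; it only ties outside the BMP.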

Anyway, I would go with the same approach I went with so many years ago.

Conversion from Utf8 to Ucs2 must be upgraded to Unicode 9.0 compliant Utf8 to Utf16 conversion. Each Utf8 sequence of 1-4 code units must be converted to a single code point. Correct escaping and error recovery must be implemented. The Unicode standard gives you exact deterministic outcomes for any ill-formed sequences, and combined with the EE you already have in U++, you can get non-destructive error handling. This will leave you with the ability to read and write Unicode. The rest of U++ won't be helped by this, but it is still an important step and will solve a few problems.
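As an illustration of that first step, a minimal sketch of a validating Utf8 to Utf16 conversion. DecodeUtf8 and Utf8ToUtf16 are hypothetical names, not U++ functions. For simplicity every rejected byte becomes U+FFFD, which is lossy; this is an assumption of the sketch and does not reproduce the EE escaping in U++, which is what would make the round trip non-destructive.

```cpp
#include <cstddef>
#include <string>

// Decodes one code point starting at s[i] and advances i.
// Ill-formed input yields U+FFFD and consumes exactly one byte.
char32_t DecodeUtf8(const std::string& s, std::size_t& i) {
    const char32_t kReplacement = 0xFFFD;
    unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    char32_t cp = 0, min_cp = 0;
    if (b0 < 0x80) { ++i; return b0; }
    else if (b0 >= 0xC2 && b0 <= 0xDF) { len = 2; cp = b0 & 0x1F; min_cp = 0x80; }
    else if (b0 >= 0xE0 && b0 <= 0xEF) { len = 3; cp = b0 & 0x0F; min_cp = 0x800; }
    else if (b0 >= 0xF0 && b0 <= 0xF4) { len = 4; cp = b0 & 0x07; min_cp = 0x10000; }
    else { ++i; return kReplacement; }  // stray continuation or invalid lead byte
    if (i + len > s.size()) { ++i; return kReplacement; }  // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
        unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) { ++i; return kReplacement; }
        cp = (cp << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogates and values above U+10FFFF.
    if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        ++i;
        return kReplacement;
    }
    i += len;
    return cp;
}

// Converts UTF-8 to UTF-16, emitting surrogate pairs beyond the BMP.
std::u16string Utf8ToUtf16(const std::string& s) {
    std::u16string out;
    for (std::size_t i = 0; i < s.size();) {
        char32_t cp = DecodeUtf8(s, i);
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Because a failed decode always advances by one byte, the converter resynchronizes on the next valid lead byte and never reads past the end of the buffer. Swapping the U+FFFD return for an escape value that encodes the offending byte is where the existing EE mechanism would plug in.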

Then, one by one, I would convert pieces of code over to a standardized traversal mechanism, be it a string walker class or something else you come up with.


