A new function to Web Package Unicode-Escape-Javascript -> Unicode [message #33247]
Wed, 20 July 2011 09:28
I propose adding a new function to the Web package that converts JavaScript Unicode escapes to Unicode.
JavaScript uses a special encoding for non-Latin characters in international text. It looks like this: \u0410\u0422\u0417 ...
I needed a function to convert this encoding to Unicode. I did not find one, so I wrote one in haste.
Maybe someone else will need it.
I also want to note that some operations inside the function can be optimized. For example, the repeated multiplication by 16 can be replaced by a bit shift.
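String Javascript2Unicode(String s) {
    String res;
    // Accept both lower-case and upper-case hex digits.
    RegExp reg("\\\\u([0-9a-fA-F]{4})", RegExp::MULTILINE);
    reg.Clear();
    int i_start, i_end, i_old = 0;
    while (reg.GlobalMatch(s)) {
        // Position of the four hex digits (the captured group).
        reg.GetMatchPos(0, i_start, i_end);
        // Copy the text before the escape; the -2 skips the leading "\u".
        res << s.Mid(i_old, i_start - i_old - 2);
        String m = (String)reg[0];
        // ctoi() returns the value of one hex digit; StrInt() parses decimal
        // only and therefore fails on the digits a-f.
        int code = ctoi(m[0])*16*16*16
                 + ctoi(m[1])*16*16
                 + ctoi(m[2])*16
                 + ctoi(m[3]);
        res << WString(code, 1).ToString();
        i_old = i_end;
    }
    // Append the text that follows the last escape.
    res << s.Mid(i_old);
    return res;
}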
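The same conversion can be sketched without regular expressions, scanning the string once and combining the hex digits with bit shifts. This is a minimal sketch in standard C++ rather than U++ types; HexValue, AppendUtf8 and DecodeJsUnicodeEscapes are hypothetical names, not part of any package, and the output is assumed to be UTF-8. Surrogate pairs are not combined, so only characters up to U+FFFF outside the surrogate range are converted correctly.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: append one code point in U+0000..U+FFFF as UTF-8.
// Surrogates are not paired here (assumption: input stays outside that range).
static void AppendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replaces every \uXXXX escape with its UTF-8 bytes; everything else,
// including '+', is copied unchanged.
std::string DecodeJsUnicodeEscapes(const std::string& s) {
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        // An escape needs six characters: '\', 'u' and four hex digits.
        if (s[i] == '\\' && i + 5 < s.size() && s[i + 1] == 'u') {
            int d1 = HexValue(s[i + 2]);
            int d2 = HexValue(s[i + 3]);
            int d3 = HexValue(s[i + 4]);
            int d4 = HexValue(s[i + 5]);
            if (d1 >= 0 && d2 >= 0 && d3 >= 0 && d4 >= 0) {
                // Shifts by 12, 8 and 4 bits replace the multiplications by 16.
                AppendUtf8(out, static_cast<std::uint32_t>(
                    (d1 << 12) | (d2 << 8) | (d3 << 4) | d4));
                i += 6;
                continue;
            }
        }
        out += s[i++];
    }
    return out;
}
```

A malformed or truncated escape such as \u00zz is left in the output as it was, which seemed the safest choice for a sketch.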
PS
Maybe something like this already exists and I wasted 4 hours? Let me know if you know of it.
SergeyNikitin<U++>( linux, wine )
{
    under( Ubuntu || Debian || Raspbian );
}
Re: A new function to Web Package Unicode-Escape-Javascript -> Unicode [message #33257 is a reply to message #33256]
Wed, 20 July 2011 21:01
I propose extending the UrlDecode function to accept the \uXXXX form in addition to the %uXXXX encoding. The \uXXXX form is kept for old browsers and is as standard as the %uXXXX form.
Interesting Online Encodings converter:
http://rishida.net/tools/conversion/
Please apply a patch to this function:
String UrlDecode(const char *b, const char *e)
{
StringBuffer out;
byte d1, d2, d3, d4;
for(const char *p = b; p < e; p++)
if(*p == '+')
out.Cat(' ');
else if(*p == '%' && (d1 = ctoi(p[1])) < 16 && (d2 = ctoi(p[2])) < 16) {
out.Cat(d1 * 16 + d2);
p += 2;
}
else if(*p == '%' && (p[1] == 'u' || p[1] == 'U')
&& (d1 = ctoi(p[2])) < 16 && (d2 = ctoi(p[3])) < 16
&& (d3 = ctoi(p[4])) < 16 && (d4 = ctoi(p[5])) < 16) {
out.Cat(WString((d1 << 12) | (d2 << 8) | (d3 << 4) | d4, 1).ToString());
p += 5;
}
else
out.Cat(*p);
return out;
}
I propose changing it like this:
String UrlDecode(const char *b, const char *e)
{
StringBuffer out;
byte d1, d2, d3, d4;
for(const char *p = b; p < e; p++)
if(*p == '+')
out.Cat(' ');
else if(*p == '%' && (d1 = ctoi(p[1])) < 16 && (d2 = ctoi(p[2])) < 16) {
out.Cat(d1 * 16 + d2);
p += 2;
}
else if((*p == '%' || *p == '\\') && (p[1] == 'u' || p[1] == 'U') // <-This line changed
&& (d1 = ctoi(p[2])) < 16 && (d2 = ctoi(p[3])) < 16
&& (d3 = ctoi(p[4])) < 16 && (d4 = ctoi(p[5])) < 16) {
out.Cat(WString((d1 << 12) | (d2 << 8) | (d3 << 4) | d4, 1).ToString());
p += 5;
}
else
out.Cat(*p);
return out;
}
SergeyNikitin<U++>( linux, wine )
{
    under( Ubuntu || Debian || Raspbian );
}
Re: A new function to Web Package Unicode-Escape-Javascript -> Unicode [message #33259 is a reply to message #33258]
Wed, 20 July 2011 22:22
Sender Ghost
sergeynikitin wrote on Wed, 20 July 2011 21:13:
Because the logic of JavaScript's unescape is a bit different: it doesn't replace "+" with a space.
Because the UrlEncode/UrlDecode functions are, as their names say, meant for URLs. I also think separate functions are needed for content, perhaps sharing a general implementation.
In UrlDecode you can see the kind of optimization you mentioned: it uses bit shifts instead of regular expressions.
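One way to picture a shared general implementation is a single percent-decoder with a flag that decides whether "+" means a space. This is only a sketch in standard C++; PercentDecode and its plusAsSpace parameter are hypothetical names and not part of the Web package.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes %XX sequences. With plusAsSpace == true it behaves like a URL
// decoder ("+" becomes a space); with false it leaves "+" alone, as the
// JavaScript unescape logic described above does.
std::string PercentDecode(const std::string& s, bool plusAsSpace) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '+' && plusAsSpace) {
            out += ' ';
        } else if (s[i] == '%' && i + 2 < s.size()
                   && HexValue(s[i + 1]) >= 0 && HexValue(s[i + 2]) >= 0) {
            // Two hex digits form one byte: high nibble shifted left by 4.
            out += static_cast<char>((HexValue(s[i + 1]) << 4) | HexValue(s[i + 2]));
            i += 2;
        } else {
            out += s[i]; // ordinary character or malformed escape
        }
    }
    return out;
}
```

With this shape, a URL decoder would call PercentDecode(text, true) and a content decoder PercentDecode(text, false), so the digit handling lives in one place.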
References:
- Some implementation of UrlEncode/UrlDecode.
- The Wikipedia article about URL encoding, including why " " (space) was converted to "+" instead of "%20" in the early stages.
[Updated on: Thu, 21 July 2011 05:39]