Re: JavaScriptCore [message #27826 is a reply to message #27585] Thu, 05 August 2010 12:55
mirek
Messages: 13975
Registered: November 2005
Ultimate Member
dolik.rce wrote on Sat, 24 July 2010 09:26


Heap leaks in WebKit's code. (Is there some way to disable them for part of the code?)



Well, we cannot disable leaks, but we have a means to disable the checking :)

MemoryIgnoreLeaksBlock __;

- any blocks allocated till the end of scope will not be considered leaks if still allocated at the program exit.
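
To illustrate the scoping rule described above, here is a minimal standalone model of such a guard. It is not the U++ implementation: IgnoreLeaksScope, TrackAlloc and CountReportedLeaks are hypothetical names invented for this sketch, and the "heap" is just a vector of records. The only point it shows is that blocks allocated while the guard object is alive get flagged so the exit-time check skips them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a leak checker; NOT the U++ heap code.
struct Block { std::size_t size; bool ignored; };

static int g_ignore_depth = 0;      // > 0 while at least one guard is alive
static std::vector<Block> g_live;   // blocks that are still allocated

// RAII guard in the spirit of MemoryIgnoreLeaksBlock (name is an assumption).
struct IgnoreLeaksScope {
    IgnoreLeaksScope()  { ++g_ignore_depth; }
    ~IgnoreLeaksScope() { --g_ignore_depth; }
    IgnoreLeaksScope(const IgnoreLeaksScope&) = delete;
    IgnoreLeaksScope& operator=(const IgnoreLeaksScope&) = delete;
};

// Record an allocation; it is marked ignored if made inside a guard's scope.
inline void TrackAlloc(std::size_t size) {
    g_live.push_back(Block{size, g_ignore_depth > 0});
}

// Count the blocks that would be reported as leaks at program exit.
inline int CountReportedLeaks() {
    int n = 0;
    for (const Block& b : g_live)
        if (!b.ignored) ++n;
    return n;
}
```

With this model, an allocation made before the guard, one made inside its scope, and one made after it would yield two reported leaks. The sketch uses an ordinary variable name for the guard because identifiers containing a double underscore are reserved in standard C++.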

Mirek
Re: JavaScriptCore [message #27828 is a reply to message #27826] Thu, 05 August 2010 14:00
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

luzr wrote on Thu, 05 August 2010 12:55

dolik.rce wrote on Sat, 24 July 2010 09:26


Heap leaks in WebKit's code. (Is there some way to disable them for part of the code?)



Well, we cannot disable leaks, but we have a means to disable the checking :)

MemoryIgnoreLeaksBlock __;

- any blocks allocated till the end of scope will not be considered leaks if still allocated at the program exit.

Mirek


Well, but wouldn't it be nice if U++ could disable leaks just by a piece of code? :D

Anyway, thank you for the hint. Placed in the constructor of the wrapper, it works perfectly. I knew something like this existed, I just couldn't find it...

Honza

Re: JavaScriptCore [message #27962 is a reply to message #27828] Thu, 12 August 2010 09:55
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
I have investigated the ToUpper problem further. After looking over the tables we have in U++, I found they have even better coverage than I thought. The unit tests that failed must have been extremely thorough to report such a failure. So I figured that the problem may not be the limited coverage of 2048 characters, but rather errors in the table itself.

And indeed, there are some errors.

E.g. the character "ƀ" (384, "Latin Small Letter B with stroke") has a ToUpper in U++ of 384, the same character. This is clearly wrong. The correct uppercase value is "Ƀ" (579, "Latin Capital Letter B with stroke").

After some superficial testing, running the current ToUpper/ToLower on the whole 65536 range I have found 568/569 errors. This is really great news, seeing as the table only covers 2048 characters. This means that most of Unicode is case agnostic and we can get away with good support without using huge tables. Running it on only 2048 characters, I have found 50/43. I hope I used the correct testing method.
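
The testing method is not spelled out in the post, so the following is only a guess at its general shape: walk a code point range, compare the table-driven mapping against a reference mapping, and collect every disagreement. FindMismatches and both toy mappings are hypothetical; the toy reference knows ASCII plus the single 384 -> 579 case cited above, and nothing else.

```cpp
#include <cassert>
#include <vector>

// Collect every code point in [0, limit) where the two mappings disagree.
template <class Candidate, class Reference>
std::vector<int> FindMismatches(Candidate candidate, Reference reference, int limit) {
    std::vector<int> bad;
    for (int c = 0; c < limit; ++c)
        if (candidate(c) != reference(c))
            bad.push_back(c);
    return bad;
}

// Toy reference for illustration only: ASCII plus the one case from the post.
inline int ToyReferenceToUpper(int c) {
    if (c >= 'a' && c <= 'z') return c - 'a' + 'A';
    if (c == 384) return 579;   // U+0180 -> U+0243, as cited above
    return c;
}

// Toy table mapping that leaves 384 unchanged, mimicking the reported bug.
inline int ToyTableToUpper(int c) {
    if (c >= 'a' && c <= 'z') return c - 'a' + 'A';
    return c;
}
```

In a real check the reference would come from the Unicode Character Database rather than a hand-written function, and the returned list gives the exact code points to fix rather than just a count.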

So if somebody can point me to the bit setup of uni__info, I could correct the values for the first 2048 characters. I can figure out most, but it would be better if I could get the real layout of that packed bitfield. If there are any free bits left, I have some data that I would like to store there, like if the character is punctuation, if it is Latin, etc.
Re: JavaScriptCore [message #27970 is a reply to message #27962] Thu, 12 August 2010 15:40
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

cbpporter wrote on Thu, 12 August 2010 09:55

After some superficial testing, running the current ToUpper/ToLower on the whole 65536 range I have found 568/569 errors. This is really great news, seeing as the table only covers 2048 characters. This means that most of Unicode is case agnostic and we can get away with good support without using huge tables. Running it on only 2048 characters, I have found 50/43. I hope I used the correct testing method.


Wow. Just wow... I would expect many more errors too.

If I remember correctly, most of the trouble was with characters around 60k. You can run the WebKit tests yourself; the jsc_test package (I think I uploaded it last time) is compatible. I will try to remember how I did it and send you the instructions if you want...

Having proper category info in uni__info would be great too, maybe even in a separate field if it doesn't fit in this one. I just hope someone remembers what the structure is so we can find out.

Honza
Re: JavaScriptCore [message #27992 is a reply to message #27962] Fri, 13 August 2010 09:09
mirek
Messages: 13975
Registered: November 2005
Ultimate Member
cbpporter wrote on Thu, 12 August 2010 03:55


After some superficial testing, running the current ToUpper/ToLower on the whole 65536 range I have found 568/569 errors. This is really great news, seeing as the table only covers 2048 characters. This means that most of Unicode is case agnostic and we can get away with good support without using huge tables. Running it on only 2048 characters, I have found 50/43. I hope I used the correct testing method.

So if somebody can point me to the bit setup of uni__info, I could correct the values for the first 2048 characters. I can figure out most, but it would be better if I could get the real layout of that packed bitfield. If there are any free bits left, I have some data that I would like to store there, like if the character is punctuation, if it is Latin, etc.


I would say this defines it pretty well:

bool IsLetter(int c)        { return (dword)c < 2048 ? uni__info[c] & 0xc0000000 : 0; }
bool IsUpper(int c)         { return (dword)c < 2048 ? uni__info[c] & 0x40000000 : 0; }
bool IsLower(int c)         { return (dword)c < 2048 ? uni__info[c] & 0x80000000 : 0; }
int  ToUpper(int c)         { return (dword)c < 2048 ? (uni__info[c] >> 11) & 2047 : c; }
int  ToLower(int c)         { return (dword)c < 2048 ? uni__info[c] & 2047 : c; }
int  ToAscii(int c)         { return (dword)c < 2048 ? (uni__info[c] >> 22) & 0x7f : 0; }


If bits are 0..31, then (reading the code, please check me...:):

31 lower letter
30 upper letter
22-28 (7 bits) toascii
11-21 (11 bits) toupper
0-10 (11 bits) tolower

Looks like bit 29 is now free, if I am counting well..
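
The layout above can be cross-checked with a small pack/unpack sketch. The masks and shifts are taken directly from the accessors quoted in this post; PackInfo and the Info* helper names are hypothetical and exist only to show that the fields do not overlap and that bit 29 stays unused.

```cpp
#include <cassert>
#include <cstdint>

// Field positions as read from the accessors quoted above.
constexpr std::uint32_t kLowerFlag  = 0x80000000u; // bit 31
constexpr std::uint32_t kUpperFlag  = 0x40000000u; // bit 30
constexpr std::uint32_t kFreeBit    = 0x20000000u; // bit 29, unused
constexpr int           kAsciiShift = 22;          // bits 22-28
constexpr int           kUpperShift = 11;          // bits 11-21
constexpr std::uint32_t kCodeMask   = 2047u;       // 11 bits
constexpr std::uint32_t kAsciiMask  = 0x7Fu;       // 7 bits

// Hypothetical helper: build one table entry from its fields.
constexpr std::uint32_t PackInfo(bool lower, bool upper, int ascii,
                                 int to_upper, int to_lower) {
    return (lower ? kLowerFlag : 0u)
         | (upper ? kUpperFlag : 0u)
         | ((static_cast<std::uint32_t>(ascii) & kAsciiMask) << kAsciiShift)
         | ((static_cast<std::uint32_t>(to_upper) & kCodeMask) << kUpperShift)
         | (static_cast<std::uint32_t>(to_lower) & kCodeMask);
}

// Unpacking mirrors IsLower/IsUpper/ToUpper/ToLower/ToAscii for one entry.
constexpr bool InfoIsLower(std::uint32_t e) { return (e & kLowerFlag) != 0; }
constexpr bool InfoIsUpper(std::uint32_t e) { return (e & kUpperFlag) != 0; }
constexpr int  InfoToUpper(std::uint32_t e) { return static_cast<int>((e >> kUpperShift) & kCodeMask); }
constexpr int  InfoToLower(std::uint32_t e) { return static_cast<int>(e & kCodeMask); }
constexpr int  InfoToAscii(std::uint32_t e) { return static_cast<int>((e >> kAsciiShift) & kAsciiMask); }

// The five fields together cover every bit except bit 29.
static_assert((kLowerFlag | kUpperFlag | (kAsciiMask << kAsciiShift) |
               (kCodeMask << kUpperShift) | kCodeMask) == ~kFreeBit,
              "fields should cover all bits except bit 29");
```

For example, an entry for 'a' would be PackInfo(true, false, 'a', 'A', 'a'), and unpacking it gives back the same five values. Note that any case mapping above 2047 is silently truncated by the 11-bit mask.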

It would be nice to know, after fixing <2048, what the code points of the remaining errors are - perhaps we could handle them too...
Re: JavaScriptCore [message #28010 is a reply to message #27992] Fri, 13 August 2010 10:46
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
luzr wrote on Fri, 13 August 2010 10:09


22-28 (7 bits) toascii


Great, now I have to check the ASCII codes too :). But I could use this info in my Linux font escaper, where I have a separate table for it.

Quote:


It would be nice to know, after fixing <2048, what the code points of the remaining errors are - perhaps we could handle them too...


That will not be a problem. Actually, Unicode 6.0 is very close, so I will be trying to get as many such unintrusive fixes as possible.

The hardest part is to figure out a very compact bit layout and get as much of the Unicode Database encoded as possible. And I'm afraid I am going to need a little more than 1 bit. Also, the ratio of compactness to performance is important. Maybe later we'll need something like (off the top of my head):
int  ToUpper(int c)         { return (dword)c < 2048 ? (uni__info[c] >> 11) & 2047 : (((dword)c > 50000 && (dword)c < 50512) ? (uni__info[c - 50000] >> 11) & 2047 : c); }

to handle those extra errors, since we want to avoid a 64Ki table. Would such a performance penalty be acceptable? Or maybe we'll have a 64Ki 1 byte table with properties for characters, and an extra 4 bytes for characters that need special case information. Even today, the last character that has meaningful lower/uppercase data is 1414. There are at least 1982 characters with case information, 938 in the first 2048.

Of course, this is just speculation for the future. Right now I'm only concerned with keeping the current layout for uni__info, but fixing the 93 erroneous codes.
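
One other way to handle the code points above 2048 without a 64Ki table is a small sorted exception list searched with binary search. This is a different technique from the fixed-window range check sketched above, offered only as an illustration of the trade-off. CasePair, kUpperExceptions, DenseToUpper and ToUpperTwoTier are all hypothetical names, the dense path is an ASCII-only placeholder for the real table, and the three entries are examples rather than a complete list.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

struct CasePair { int code; int upper; };

// Must stay sorted by code. Three illustrative entries, not a complete list.
static const CasePair kUpperExceptions[] = {
    { 0x1E01, 0x1E00 }, // LATIN SMALL LETTER A WITH RING BELOW
    { 0x2170, 0x2160 }, // SMALL ROMAN NUMERAL ONE
    { 0xFF41, 0xFF21 }, // FULLWIDTH LATIN SMALL LETTER A
};

// Placeholder for the dense table path (c < 2048); ASCII only in this sketch.
inline int DenseToUpper(int c) {
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

// Dense lookup for the first 2048 code points, sparse lookup for the rest.
inline int ToUpperTwoTier(int c) {
    if (static_cast<unsigned>(c) < 2048u)
        return DenseToUpper(c);
    const CasePair* first = std::begin(kUpperExceptions);
    const CasePair* last  = std::end(kUpperExceptions);
    const CasePair* it = std::lower_bound(first, last, c,
        [](const CasePair& p, int value) { return p.code < value; });
    return (it != last && it->code == c) ? it->upper : c;
}
```

The common path (c < 2048) costs one extra comparison, and only characters outside the dense table pay for the O(log n) search. With roughly a thousand exceptions that would be about ten comparisons per lookup.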
Re: JavaScriptCore [message #28013 is a reply to message #28010] Fri, 13 August 2010 10:48
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
PS: How did you get the table? I'm autogenerating it from Unicode Database.
Re: JavaScriptCore [message #28028 is a reply to message #28010] Fri, 13 August 2010 11:30
mirek
Messages: 13975
Registered: November 2005
Ultimate Member
cbpporter wrote on Fri, 13 August 2010 04:46

luzr wrote on Fri, 13 August 2010 10:09


22-28 (7 bits) toascii


Great, now I have to check the ASCII codes too :). But I could use this info in my Linux font escaper, where I have a separate table for it.



Just to be sure what toascii means

È -> C
Re: JavaScriptCore [message #28029 is a reply to message #28013] Fri, 13 August 2010 11:30
mirek
Messages: 13975
Registered: November 2005
Ultimate Member
cbpporter wrote on Fri, 13 August 2010 04:48

PS: How did you get the table? I'm autogenerating it from Unicode Database.


The same way.
Re: JavaScriptCore [message #28030 is a reply to message #28028] Fri, 13 August 2010 11:36
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
luzr wrote on Fri, 13 August 2010 12:30

cbpporter wrote on Fri, 13 August 2010 04:46

luzr wrote on Fri, 13 August 2010 10:09


22-28 (7 bits) toascii


Great, now I have to check the ASCII codes too :). But I could use this info in my Linux font escaper, where I have a separate table for it.



Just to be sure what toascii means

È -> C


I don't understand? It is not "E"?
Re: JavaScriptCore [message #28040 is a reply to message #28030] Fri, 13 August 2010 15:28
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
There are errors in the IsUpper field as well.

This is getting quite confusing.

I think that we need to change the meaning of these fields. Take for example the character dot ".". This is a punctuation mark, and as a code point IsLower is false and IsUpper is false. Yet, as a string, IsLower and IsUpper are both true. Unicode defines isLower(b): b == toLower(b). So the string "abc.=/123" is lowercase, and the string "ABC.=/123" is upper case.

If we create:
bool  IsSmthLower(int c)         { return (dword)c < 2048 ? (uni__info[c] & 2047)  == c: true; }

for testing the "logical"/"human interpretable" case of a character in a string, and keep the old IsLower to refer only to a code point's property of representing a particular cased letter, we can kill two birds with one stone. It will also work for the case of a "small a with a superscript capital H in the upper right corner" sequence (which is lowercase). The small a will have a code point property of lower as true, and its lowercase variant is the same. The capital H superscript has its property as false, more precisely N/A, and its lowercase variant is again the same character.
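
The string-level rule quoted above (a string is lowercase when lowering it changes nothing) can be sketched as follows. To keep the example self-contained it uses ASCII-only case mappings in place of the uni__info table, and the names AsciiToLower, AsciiToUpper, IsLowerString and IsUpperString are hypothetical.

```cpp
#include <cassert>
#include <string>

// ASCII-only stand-ins for the table-driven ToLower/ToUpper.
inline int AsciiToLower(int c) { return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c; }
inline int AsciiToUpper(int c) { return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c; }

// A string is "logically" lowercase if every character is its own lowercase form.
inline bool IsLowerString(const std::string& s) {
    for (unsigned char ch : s)
        if (AsciiToLower(ch) != ch) return false;
    return true;
}

// Likewise for uppercase.
inline bool IsUpperString(const std::string& s) {
    for (unsigned char ch : s)
        if (AsciiToUpper(ch) != ch) return false;
    return true;
}
```

Under this definition "abc.=/123" counts as lowercase and "ABC.=/123" as uppercase, while a string of only punctuation and digits such as ".=/123" is both at once, which matches the behaviour described in the post.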
Re: JavaScriptCore [message #28043 is a reply to message #28040] Fri, 13 August 2010 16:30
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
Well, here is the corrected table, but only with the IsUpper field fixed. There should be exactly 38 differences. I'll recheck the table once more, but everything should be OK. Anyway, nobody noticed that something was wrong before, and I doubt that this correction will impact code for better or worse, unless you are doing a Unicode unit test or live in a very specific part of the world and are using U++ :).

I don't have access to SVN right now and I'll be out of town during the weekend so I'm putting it here.
  • Attachment: b.txt
    (Size: 22.78KB, Downloaded 245 times)
Re: JavaScriptCore [message #28081 is a reply to message #27274] Sat, 14 August 2010 21:19
jeremy_c
Messages: 175
Registered: August 2007
Location: Ohio, USA
Experienced Member
Is JavaScriptCore looking to replace ESC? I'm writing an app now that allows the user to perform certain tasks using ESC.

JavaScript is known by a lot more people than ESC, although ESC is very easy to pick up. I would imagine, though, that JavaScript is faster and contains a much larger standard library.

Jeremy
Re: JavaScriptCore [message #28085 is a reply to message #28081] Sat, 14 August 2010 23:40
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

jeremy_c wrote on Sat, 14 August 2010 21:19

Is JavaScriptCore looking to replace ESC? I'm writing an app now that allows the user to perform certain tasks using ESC.

JavaScript is known by a lot more people than ESC, although ESC is very easy to pick up. I would imagine, though, that JavaScript is faster and contains a much larger standard library.

Jeremy

Hi Jeremy,

I wouldn't say replace. I believe they will live together side by side :) It is up to you which language you choose. I, for example, prefer ESC in most cases, especially for its small size and because you can easily interact with the functions and variables from U++ code. On the other hand, as you say, JavaScript is generally known, and even though ESC syntax is easy, it takes a few minutes to learn.

So it is up to you and your needs.

Honza
Re: JavaScriptCore [message #28112 is a reply to message #28085] Tue, 17 August 2010 09:29
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
IsLower codes fixed, 44 differences.
  • Attachment: c.txt
    (Size: 22.78KB, Downloaded 226 times)
Re: JavaScriptCore [message #28113 is a reply to message #28112] Tue, 17 August 2010 10:05
cbpporter
Messages: 1401
Registered: September 2007
Ultimate Contributor
I have fixed the 50 ToUpper codes.

Yet there are still 8 codes that are wrong, because I cannot fit the correct value in 11 bits. I need at least 14 bits for these 8 values, and maybe more for other codes.

I'll postpone the upload of fixes to ToLower until we see how we'll proceed with these 8 codes. I propose that we use two tables. One with 2048 1 byte values that encode only properties. Another with a dword for lower/upper variants. Maybe we can optimize the storage a little. Anyway, 2KiB extra RAM is not that much.
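
A rough sketch of the two-table idea, under my own assumptions: the proposal above does not fix a layout for the dword, so the 16/16 split used here (upper variant in the high half, lower variant in the low half) is a guess, and kPropLower, kPropUpper, PackCase, CaseUpper and CaseLower are hypothetical names. The post does not list the 8 problem codes. U+026B, whose uppercase form is U+2C62, is used as an example of a character below 2048 whose mapping needs 14 bits.

```cpp
#include <cassert>
#include <cstdint>

// Table 1: one property byte per character (flag values are assumptions).
constexpr std::uint8_t kPropLower = 0x01;
constexpr std::uint8_t kPropUpper = 0x02;

// Table 2: one dword per character, upper variant in the high 16 bits,
// lower variant in the low 16 bits (a guessed layout).
constexpr std::uint32_t PackCase(std::uint32_t upper, std::uint32_t lower) {
    return ((upper & 0xFFFFu) << 16) | (lower & 0xFFFFu);
}
constexpr int CaseUpper(std::uint32_t e) { return static_cast<int>(e >> 16); }
constexpr int CaseLower(std::uint32_t e) { return static_cast<int>(e & 0xFFFFu); }

// U+2C62 does not fit in the current 11-bit field but fits in 14 bits.
static_assert(0x2C62 > 2047, "too large for an 11-bit field");
static_assert(0x2C62 < (1 << 14), "fits in 14 bits");

// Memory cost: 2048 property bytes on top of the existing 2048 dwords.
static_assert(sizeof(std::uint8_t[2048]) == 2048, "2 KiB for properties");
static_assert(sizeof(std::uint32_t[2048]) == 8192, "8 KiB for case variants");
```

With 16 bits per variant the whole Basic Multilingual Plane is addressable from either field, and all eight bits of the property byte become available for flags such as punctuation or script.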
  • Attachment: d.txt
    (Size: 22.78KB, Downloaded 306 times)
Re: JavaScriptCore [message #28127 is a reply to message #28113] Tue, 17 August 2010 22:22
dolik.rce
Messages: 1789
Registered: August 2008
Location: Czech Republic
Ultimate Contributor

Hi cbpporter!

I downloaded your last fix and compiled the jsc test app. I think the results are much better now, but there are still a few failures. The attachment contains the relevant results and links to the performed tests. Maybe it will give you some idea where to look. Most errors are in the 65K+ region, but two of them are under 2048.

Honza
  • Attachment: results.html
    (Size: 10.09KB, Downloaded 348 times)