String w/high characters but not UTF? [message #27868] |
Sun, 08 August 2010 02:52 |
jeremy_c
Messages: 175 Registered: August 2007 Location: Ohio, USA
|
Experienced Member |
|
|
I am using a web service that returns data in a very old format (has endured since the DOS days!)... It's something like this:
FIELD1(char 181)FIELD2(char 182)FIELD3(char 184)FIELD4(char 185)
Here (char 181) stands for the actual byte chr(181). To parse the data, you locate the delimiter that starts the field you want, for example chr(183), then search forward until chr(184) and take everything in between.
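The delimiter scan described above can be sketched in standard C++ as follows. This is a minimal sketch, not U++ code: `std::string` stands in for the U++ `String`, and `ExtractBetween` is a hypothetical helper name. The delimiters are passed as byte values and converted to `char` before searching, so the match does not depend on whether plain `char` is signed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies the bytes between the first `open` byte and the
// next `close` byte into `out`. Returns false if either delimiter is missing.
bool ExtractBetween(const std::string& data,
                    unsigned char open,
                    unsigned char close,
                    std::string& out)
{
    // Search for the delimiter as a char, the same type the string stores,
    // so the comparison matches whether char is signed or unsigned.
    const std::size_t start = data.find(static_cast<char>(open));
    if (start == std::string::npos)
        return false;

    const std::size_t end = data.find(static_cast<char>(close), start + 1);
    if (end == std::string::npos)
        return false;

    out = data.substr(start + 1, end - start - 1);
    return true;
}
```

Calling `ExtractBetween(data, 183, 184, field)` on a record that contains both bytes would leave the text between them in `field`.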
I seem to be having problems with U++ seeing that as a UTF string and doing weird things with it.
How can I prevent this? I am using HttpClient.Execute(); to get the content.
Jeremy
|
|
|
|
Re: String w/high characters but not UTF? [message #27906 is a reply to message #27902] |
Tue, 10 August 2010 14:29 |
jeremy_c
Messages: 175 Registered: August 2007 Location: Ohio, USA
|
Experienced Member |
|
|
It happens when I parse the data. I created a small U++ app that shows what my problem is:
#include <Core/Core.h>

using namespace Upp;

CONSOLE_APP_MAIN
{
    char data[] = { 65, 65, 65, 181, 65, 65, 65, 182, 65, 65, 183 };
    String d(data);
    for (int i = 0; i < d.GetCount(); i++) {
        LOG(FormatInt(i) + "=" + FormatInt(d[i]));
    }
}
Thanks for any help with this. I'm sure it's simple but it's driving me nuts!
Jeremy
|
|
|
|
Re: String w/high characters but not UTF? [message #27911 is a reply to message #27907] |
Tue, 10 August 2010 16:01 |
|
koldo
Messages: 3355 Registered: August 2008
|
Senior Veteran |
|
|
cbpporter wrote on Tue, 10 August 2010 15:23 | There is nothing wrong with that program, related to UTF8 or otherwise. It behaves as it should. The problem is that you are inserting a large value like 182 in a signed char and the result gets interpreted as a negative number.
|
Yes.
For example compiling with MSC I got three warnings like this:
warning C4309: 'initializing' : truncation of constant value
for the 181, 182 and 183.
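In addition, String d cannot know the length of char data[] because the array is not terminated with '\0'. Constructing a String from a plain char pointer reads until it finds a zero byte, so this can easily run past the end of the array.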
Check this:
#include <Core/Core.h>

using namespace Upp;

CONSOLE_APP_MAIN
{
    {
        // Original: signed char values and no terminating '\0'
        puts("Original");
        char data[] = { 65, 65, 65, 181, 65, 65, 65, 182, 65, 65, 183 };
        String d(data);
        for (int i = 0; i < d.GetCount(); i++)
            puts(FormatInt(i) + "=" + FormatInt(d[i]));
    }
    {
        // Changed: byte (unsigned) array with an explicit length,
        // and each character cast to byte before printing
        puts("Changed");
        byte data[] = { 65, 65, 65, 181, 65, 65, 65, 182, 65, 65, 183 };
        String d(data, 11);
        for (int i = 0; i < d.GetCount(); i++)
            puts(FormatInt(i) + "=" + FormatInt(byte(d[i])));
    }
    getchar();
}
The output is this:
Original
0=65
1=65
2=65
3=-75
4=65
5=65
6=65
7=-74
8=65
9=65
10=-73
Changed
0=65
1=65
2=65
3=181
4=65
5=65
6=65
7=182
8=65
9=65
10=183
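The negative numbers in the "Original" output are the same bytes viewed through a signed 8-bit char: 181 shows up as 181 - 256 = -75, 182 as -74 and 183 as -73. A minimal sketch in standard C++ of the reverse mapping, assuming nothing about U++ (`ByteValue` is a hypothetical name used only for illustration):

```cpp
// Hypothetical helper: returns the byte value 0..255 stored in a char,
// regardless of whether plain char is signed on this compiler.
inline int ByteValue(char c)
{
    // Converting to unsigned char wraps negative values modulo 256,
    // so -75 becomes 181, -74 becomes 182 and -73 becomes 183.
    return static_cast<unsigned char>(c);
}
```

This is the same idea as the `byte(d[i])` cast in the "Changed" block: the stored data is unchanged, only its interpretation differs.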
The byte type is the natural way to handle binary data in U++.
If you need a classic C array whose length is not known at compile time, you can also use:
Buffer<byte> data;
data.Alloc(dataLen);
instead of the usual and more dangerous malloc/free/new/delete.
Best regards
Iñaki
|
|
|
Re: String w/high characters but not UTF? [message #27912 is a reply to message #27868] |
Tue, 10 August 2010 16:21 |
jeremy_c
Messages: 175 Registered: August 2007 Location: Ohio, USA
|
Experienced Member |
|
|
The problem is that I am using HttpClient to get this data. So it actually looks like:
String data = HttpClient(...);
data then contains both the positive and the negative character values.
My original question was how to receive this data correctly, or how to deal with it once it has been received incorrectly.
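One way to deal with data that has already arrived in a string is to look at every character as an unsigned byte while scanning. The following is only a sketch under stated assumptions: it uses `std::string` in place of the U++ `String`, `SplitFields` is a hypothetical name, and it assumes that every byte in the range 181 to 185 is a field delimiter, which the example record earlier in the thread suggests but does not confirm.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a record at every delimiter byte.
// Assumption: any byte in the range 181..185 ends the current field.
std::vector<std::string> SplitFields(const std::string& data)
{
    std::vector<std::string> fields;
    std::string current;
    for (char c : data) {
        // View the character as a byte so values above 127 are not negative.
        const unsigned char b = static_cast<unsigned char>(c);
        if (b >= 181 && b <= 185) {
            fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    // Keep trailing text that is not followed by a delimiter.
    if (!current.empty())
        fields.push_back(current);
    return fields;
}
```

With this approach the negative values seen in the log do not matter, because each comparison is made on the unsigned byte value.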
Jeremy
[Updated on: Tue, 10 August 2010 16:23]