Re: LoadFile problem with accented chars [message #20002 is a reply to message #20000] |
Mon, 09 February 2009 08:12   |
 |
mirek
koldo wrote on Sun, 08 February 2009 16:11 |
Hello luzr
It seems to be a matter of Notepad itself. If the file contains only 7-bit characters there is no problem, but after adding characters like á, Notepad seems to change the file's charset on its own.
Using this test program:
CONSOLE_APP_MAIN
{
    String data = LoadFile("C:\\test.txt");
    for (int i = 0; i < data.GetCount(); ++i)
        puts(Format("%d: %d", i, data[i]));
    getchar();
}
and a test.txt containing just "a-á", I initially get this output:
0: 97
1: 45
2: -31
but after saving and reopening the file several times, I get this:
0: -1
1: -2
2: 97
3: 0
4: 45
5: 0
6: -31
7: 0
and yesterday I got yet another output... The answer is that Notepad adds a "BOM" to the file if it thinks the text requires a wider encoding.
BOM (Byte Order Mark, http://unicode.org/faq/utf_bom.html#BOM) is a signature of bytes at the beginning of a file that indicates its encoding. For example:
- EF BB BF means UTF-8
- FF FE means UTF-16, little-endian
|
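As an illustration of the BOM check described in the quote, here is a minimal sketch in standard C++ rather than U++. The names Bom and DetectBom are hypothetical and are not part of U++ Core. The values -1, -2 in the second output above are the bytes 0xFF, 0xFE printed as signed chars, which is the UTF-16 little-endian signature. The sketch also recognizes FE FF (UTF-16 big-endian), which the quote does not list, and it ignores UTF-32 signatures.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration only; not part of U++ Core.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Inspect the leading bytes of a file's contents and report which BOM,
// if any, is present.
Bom DetectBom(const std::string& data)
{
    // char may be signed, so compare bytes as unsigned values.
    auto byte = [&](std::size_t i) {
        return static_cast<unsigned char>(data[i]);
    };
    if (data.size() >= 3 && byte(0) == 0xEF && byte(1) == 0xBB && byte(2) == 0xBF)
        return Bom::Utf8;
    if (data.size() >= 2 && byte(0) == 0xFF && byte(1) == 0xFE)
        return Bom::Utf16LE;
    if (data.size() >= 2 && byte(0) == 0xFE && byte(1) == 0xFF)
        return Bom::Utf16BE;
    return Bom::None;
}
```

Applied to the two outputs above, the first file (97, 45, -31) has no BOM, while the second (-1, -2, 97, 0, ...) is reported as UTF-16 little-endian.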
Why not interpret it yourself?
I suggest implementing these:
WString LoadBOMW(const Stream& s);
WString LoadFileBOMW(const char *path);
void SaveBOMUtf8(const Stream& s, const WString& data);
bool SaveFileBOMUtf8(const char *path, const WString& data);
String LoadBOM(const Stream& s); // Default encoding, usually utf-8
String LoadFileBOM(const char *path);
void SaveBOMUtf8(const Stream& s, const String& data);
bool SaveFileBOMUtf8(const char *path, const String& data);
I would be glad to add them to Core.
Mirek
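To make the idea behind the proposed loaders concrete, here is a rough sketch of what a BOM-aware load could do with the raw bytes. It is written in standard C++, not against U++ types, and BomToUtf8 and AppendUtf8 are hypothetical names, not the functions proposed above. It assumes that a file without a BOM is already in the desired encoding and returns it unchanged, and it does not validate unpaired surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to a UTF-8 string.
void AppendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert raw file bytes to UTF-8, honouring a
// leading BOM. Without a BOM the input is returned unchanged (assumption:
// it is already in the caller's default encoding).
std::string BomToUtf8(const std::string& raw)
{
    auto byte = [&](std::size_t i) {
        return static_cast<unsigned char>(raw[i]);
    };
    if (raw.size() >= 3 && byte(0) == 0xEF && byte(1) == 0xBB && byte(2) == 0xBF)
        return raw.substr(3);                       // UTF-8: just drop the BOM
    bool le = raw.size() >= 2 && byte(0) == 0xFF && byte(1) == 0xFE;
    bool be = raw.size() >= 2 && byte(0) == 0xFE && byte(1) == 0xFF;
    if (!le && !be)
        return raw;

    // Read the 16-bit code unit starting at byte offset i.
    auto unit = [&](std::size_t i) -> std::uint32_t {
        return le ? (byte(i) | (byte(i + 1) << 8))
                  : ((byte(i) << 8) | byte(i + 1));
    };
    std::string out;
    for (std::size_t i = 2; i + 1 < raw.size(); i += 2) {
        std::uint32_t u = unit(i);
        // Combine a high surrogate with the following low surrogate.
        if (u >= 0xD800 && u <= 0xDBFF && i + 3 < raw.size()) {
            std::uint32_t lo = unit(i + 2);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            }
        }
        AppendUtf8(out, u);
    }
    return out;
}
```

With the eight bytes from koldo's second output (FF FE 61 00 2D 00 E1 00), this yields the UTF-8 text "a-á", that is, the bytes 61 2D C3 A1.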