Re: LoadFile problem with accented chars [message #20002 is a reply to message #20000] Mon, 09 February 2009 08:12
mirek
koldo wrote on Sun, 08 February 2009 16:11

Hello luzr

It seems to be a matter of Notepad itself. If the file contains only 7-bit chars there is no problem, but after adding chars like á, Notepad itself seems to change the file's charset.

Using this test program:
CONSOLE_APP_MAIN
{
	String data = LoadFile("C:\\test.txt");
	for (int i = 0; i < data.GetCount(); ++i) 
		puts(Format("%d: %d", i, data[i]));	
	getchar();
}

with a test.txt containing just "a-á", I initially get this output:

0: 97
1: 45
2: -31

but after saving and reopening the file a few times, I get this:

0: -1
1: -2
2: 97
3: 0
4: 45
5: 0
6: -31
7: 0

and yesterday I got yet another output... The answer is that Notepad adds a "BOM" to the file if it thinks the file requires a bigger encoding.

A BOM (Byte Order Mark, http://unicode.org/faq/utf_bom.html#BOM) is a signature of bytes at the beginning of a file that indicates its encoding. For example:

- EF BB BF means UTF-8
- FF FE means UTF-16, little-endian
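
To make the signatures above concrete, here is a minimal sketch of BOM detection in plain standard C++ (deliberately not using U++ types). The names BomKind and DetectBom are hypothetical and exist only for this illustration. FE FF for big-endian UTF-16 is not in the list above but is the standard counterpart of FF FE. Note that the -1 and -2 in the second output are simply FF and FE printed as signed chars.

```cpp
#include <cstddef>
#include <string>

// Encodings this sketch distinguishes; the names are illustrative only.
enum class BomKind { None, Utf8, Utf16LE, Utf16BE };

// Inspect the first bytes of `data` and report which BOM, if any, is present.
// `bom_len` receives the number of signature bytes a reader should skip.
BomKind DetectBom(const std::string& data, std::size_t& bom_len)
{
    // Compare as unsigned bytes: with a signed char, 0xFF would read as -1.
    auto at = [&](std::size_t i) { return static_cast<unsigned char>(data[i]); };

    bom_len = 0;
    if (data.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF) {
        bom_len = 3;
        return BomKind::Utf8;
    }
    if (data.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE) {
        bom_len = 2;
        return BomKind::Utf16LE;
    }
    if (data.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF) {
        bom_len = 2;
        return BomKind::Utf16BE;
    }
    return BomKind::None;
}
```

This sketch does not try to tell UTF-16 little-endian apart from UTF-32 little-endian (whose signature FF FE 00 00 starts with the same two bytes), and a file without any BOM still needs some default encoding chosen by the caller.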



Why not interpret it yourself?

I suggest implementing these:

WString LoadBOMW(const Stream& s);
WString LoadFileBOMW(const char *path);
void    SaveBOMUtf8(const Stream& s, const WString& data);
bool    SaveFileBOMUtf8(const char *path, const WString& data);

String  LoadBOM(const Stream& s); // Default encoding, usually utf-8
String  LoadFileBOM(const char *path);
void    SaveBOMUtf8(const Stream& s, const String& data);
bool    SaveFileBOMUtf8(const char *path, const String& data);
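
As a rough idea of what the loading side could do, here is a sketch in plain standard C++ rather than U++ (so it uses std::string instead of String/WString and does not touch Stream). DecodeWithBom and AppendUtf8 are hypothetical names invented for this example, not the proposed Core functions. The sketch assumes the result should be UTF-8, handles only the two signatures listed above, and returns BOM-less input unchanged.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8.
static void AppendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: return the text in `raw` as UTF-8, honoring a leading BOM.
std::string DecodeWithBom(const std::string& raw)
{
    auto at = [&](std::size_t i) { return static_cast<unsigned char>(raw[i]); };
    const std::size_t n = raw.size();

    // EF BB BF: the payload is already UTF-8, so only drop the signature.
    if (n >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return raw.substr(3);

    // FF FE: UTF-16 little-endian; convert each code unit or surrogate pair.
    if (n >= 2 && at(0) == 0xFF && at(1) == 0xFE) {
        std::string out;
        for (std::size_t i = 2; i + 1 < n; i += 2) {
            std::uint32_t cp = at(i) | (at(i + 1) << 8);
            if (cp >= 0xD800 && cp <= 0xDBFF && i + 3 < n) {
                std::uint32_t lo = at(i + 2) | (at(i + 3) << 8);
                if (lo >= 0xDC00 && lo <= 0xDFFF) {
                    cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                    i += 2; // the low surrogate has been consumed as well
                }
            }
            AppendUtf8(out, cp);
        }
        return out;
    }

    // No BOM: this sketch returns the bytes as they are.
    return raw;
}
```

With the 8 bytes from koldo's second output (FF FE 61 00 2D 00 E1 00) this yields the 4 bytes 61 2D C3 A1, that is "a-á" in UTF-8. Unpaired surrogates and a trailing odd byte are not validated here; a real implementation would need to decide how to treat them.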


I would be glad to add them to Core.

Mirek
 