Home » U++ Library support » U++ Core » Problem using Split on carriage return
Problem using Split on carriage return [message #10467] |
Tue, 10 July 2007 23:10 |
$mike{is_here}
Messages: 12 Registered: July 2007
|
Promising Member |
|
|
Hi,
I want to split a String s loaded from a text file at the carriage return, like this:
Vector<String> getLines(String s) { // new line at \n
Vector<String> vs;
vs = Split(s.Begin(), '\n', true); // split by lines
return vs;
}
From Perl I'm used to do something like:
@list = split /\n/, $many_lines_of_text;
which finds all '\n' in the string $many_lines_of_text and puts it into a list.
It looks like U++ treats the carriage return as two separate characters, i.e. '\r\n'. When I split in U++ at '\n' or when I split at '\r' the other character is left over, leading to trouble.
What I would have to do is:
vs = Split(s.Begin(), '\r\n', true); // split by lines
which obviously leads to trouble.
What's the correct way to do it in U++ ?
Thanks and kind regards,
Micha
|
|
|
Re: Problem using Split on carriage return [message #10469 is a reply to message #10467] |
Tue, 10 July 2007 23:26 |
|
mirek
Messages: 13975 Registered: November 2005
|
Ultimate Member |
|
|
Well, it is not U++, it is M$... Back in MSDOS days, they have decided that line are to be ended with "\r\n" and this stupid idea is still alive.
Standard C library solves this problem by intruducing "text" and "binary" streams (text streams remove/insert \r on read/write), but IMO that just adds to confusion.
U++ general take on the problem is to ignore all \r characters on input and insert \r before each \n on output - but both is to be handled by the code.
In your case you have many options. One simple is to filter out all \r out before Split:
int NoCr(int c) { return c == '\r' ? 0 : c; }
Split(Filter(s, NoCr), '\n')
As you seem to ignore empty lines, another simple solution (with a bit better performance) is to treat \r like \n (because it will just result in empty line, which is ignored anyway):
int CrOrLf(int c) { return c == '\r' || c == '\n' ? c : 0; }
Split(s, CrOrLf);
Mirek
[/code]
|
|
|
|
Goto Forum:
Current Time: Sat Apr 27 12:18:48 CEST 2024
Total time taken to generate the page: 0.02319 seconds
|