I have a multi-line ASCII string that comes from some other system (Windows, UNIX, ...). I know that the newline sequence differs between Windows and UNIX (CR+LF vs. LF), and I want to scan the string for both characters (CR and LF) to detect which newline sequence it uses. To do that, I need to know what "\n" means in VS6 C++.
My question: if I write a piece of code in Visual Studio 6 for Windows:
bool FindNewline (string & inputString) {
    size_t found;
    found = inputString.find ("\n");
    return (found != string::npos ? true : false);
}
does this search for CR+LF or only for LF? Should I write "\r\n", or does the compiler interpret "\n" as CR+LF?
inputString.find ("\n");
will search for the LF character (alone).
Library routines may 'translate' between CR/LF and '\n' when I/O is performed on a text stream, but inside the realm of your program code, '\n' is just a line-feed.
"\n" means "\n". Nothing else. So you search for LF only. However, the Microsoft CRT does some conversions for you when you read a file in text mode, so you can sometimes write simpler code.
All translation between "\n" and "\r\n" happens during I/O. At all other times, "\n" is just that and nothing more.
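To make that concrete, here is a minimal sketch. The function names are made up for illustration and are not from the question. It assumes an ASCII-based platform such as Windows, where LF has the character code 10.

```cpp
#include <string>

// True when the "\n" literal is exactly one character with code 10 (LF).
// Assumption: an ASCII-based execution character set.
bool newlineIsSingleLineFeed() {
    const std::string newline("\n");
    return newline.size() == 1 && newline[0] == 10;
}

// True when the text contains a CR immediately followed by an LF.
// The two-character sequence has to be spelled out as "\r\n".
bool containsCrLf(const std::string& text) {
    return text.find("\r\n") != std::string::npos;
}
```

Searching "a\r\nb" for "\n" alone still succeeds, because the LF is there; the match simply does not include the CR in front of it.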
Somehow: return (found != string::npos ? true : false);
reminds me of another answer I wrote a while back.
Apart from the VS6 part (you really, really want to upgrade this, the compiler is way out of date and Microsoft doesn't really support it anymore), the answer to the question depends on how you are getting the string.
For example, if you read it from a file in text mode, the runtime library will translate \r\n into \n. So if all your text strings are read in text mode via the usual file-based APIs, your search for \n (i.e., newline only) would be sufficient.
If the strings originate in files that are read in binary mode on Windows and are known to contain the DOS/Windows line separator \r\n, then you're better off searching for that character sequence.
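As a rough sketch of the binary-mode case, the helper below reads a whole file without any newline translation, so a \r\n in the file arrives in the string as two characters. The name readBinary is hypothetical, and the code uses standard iostreams rather than anything specific to VS6.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Reads the whole file in binary mode, so no newline translation happens:
// a "\r\n" stored in the file shows up in the result as two characters.
// Returns an empty string if the file cannot be opened.
std::string readBinary(const char* path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Opening the same file without std::ios::binary on Windows would be expected to hand back \n in place of each \r\n, which is the text-mode translation described above.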
EDIT: If you do get it in binary form, yes, ideally you'd have to check for both \r\n and \n. However, I would expect that they aren't mixed within one string and still carry the same meaning, unless it's a really messed-up data format. I would probably check for \r\n first and then \n second if the strings are short enough that scanning them twice doesn't make much of a difference. If it does, I'd write some code that checks for both \r\n and single \n in a single pass.
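A single-pass check could look something like the sketch below. NewlineCounts and countNewlines are hypothetical names, and counting lone \r characters is an addition beyond the two cases discussed above, included only because the question mentions scanning for CR as well.

```cpp
#include <cstddef>
#include <string>

// Tallies of each newline style found in a string.
struct NewlineCounts {
    std::size_t crlf;  // "\r\n" pairs
    std::size_t lf;    // "\n" not preceded by "\r"
    std::size_t cr;    // "\r" not followed by "\n"
};

// Walks the string once and classifies every line ending it meets.
NewlineCounts countNewlines(const std::string& text) {
    NewlineCounts counts = {0, 0, 0};
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            if (i + 1 < text.size() && text[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the LF that belongs to this pair
            } else {
                ++counts.cr;
            }
        } else if (text[i] == '\n') {
            ++counts.lf;
        }
    }
    return counts;
}
```

If crlf is non-zero and lf is zero, the string uses Windows line endings throughout; the reverse suggests UNIX line endings, and non-zero values in both point to mixed input.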