Fgets writes different strings from the same file in Linux and Windows

I've just come across an issue where I was jumping between valgrind in Linux and other testing in Windows cmd.

I'm reading a certain line from a file like this:

fgets(buf, MAX_LINE_LEN, f_input);

Of course, buf is the size MAX_LINE_LEN + 1, but I digress.

This is the output of

printf("String length: %zu; Contents: ", strlen(buf));
for (size_t i = 0; i < strlen(buf); i++)
    printf("%x ", (unsigned char)buf[i]);
puts(";");

in Windows:

String length: 14; Contents: 41 6e 64 72 65 6a 20 50 6c 61 76 6b 61 a ;
String length: 22; Contents: 41 6e 6e 61 20 4d 61 72 69 61 20 43 69 63 6d 61 6e 63 6f 76 61 a ;
String length: 25; Contents: 4d 61 72 69 61 20 52 61 7a 75 73 6f 76 61 20 4d 61 72 74 61 6b 6f 76 61 a ;
String length: 24; Contents: 4d 69 6c 61 6e 20 52 61 73 74 69 73 6c 61 76 20 50 6f 6b 6f 6a 6e 79 a ;
String length: 21; Contents: 4d 69 6c 65 6e 61 20 53 65 64 6d 69 6b 72 61 73 6b 6f 76 61 a ;
String length: 15; Contents: 56 69 6e 63 65 6e 74 20 53 69 6b 75 6c 61 a ;
String length: 17; Contents: 56 69 6e 63 65 6e 74 20 76 61 6e 20 47 6f 67 68 a ;

and in Linux:

String length: 15; Contents: 41 6e 64 72 65 6a 20 50 6c 61 76 6b 61 d a ;
String length: 23; Contents: 41 6e 6e 61 20 4d 61 72 69 61 20 43 69 63 6d 61 6e 63 6f 76 61 d a ;
String length: 26; Contents: 4d 61 72 69 61 20 52 61 7a 75 73 6f 76 61 20 4d 61 72 74 61 6b 6f 76 61 d a ;
String length: 25; Contents: 4d 69 6c 61 6e 20 52 61 73 74 69 73 6c 61 76 20 50 6f 6b 6f 6a 6e 79 d a ;
String length: 22; Contents: 4d 69 6c 65 6e 61 20 53 65 64 6d 69 6b 72 61 73 6b 6f 76 61 d a ;
String length: 16; Contents: 56 69 6e 63 65 6e 74 20 53 69 6b 75 6c 61 d a ;
String length: 18; Contents: 56 69 6e 63 65 6e 74 20 76 61 6e 20 47 6f 67 68 d a ;

As you can see in Linux, there is another character before the NL, a Carriage Return. If anyone can explain this and save me the pain of adding #ifdef statements for separate Linux and Windows code, I'd appreciate it. I understand that Linux appends a Carriage Return after each line, but is this really the intended behaviour when the line is then read by fgets?

edited Nov 10 at 22:52

asked Nov 10 at 22:30

areuz

322211


CRLF vs NL line endings. Windows uses two characters, '\r' and '\n' at the end of a line; Unix uses just '\n'. And on Windows, the I/O system maps the CRLF to '\n' only on input, but Linux doesn't because '\r' is just another control character to Unix. ('\r' typically maps to control-M or 0x0D; '\n' typically maps to control-J or 0x0A.)
– Jonathan Leffler
Nov 10 at 22:35

"Of course, buf is the size MAX_LINE_LEN + 1" Not needed: the maximum number of characters read into the buffer is one less than the size you specify, and the line is NUL-terminated. man7.org/linux/man-pages/man3/fgets.3p.html
– Tim
Nov 10 at 22:36

My guess is not that Linux is adding a CR, but that the CR is in the file data: to Linux it looks like two separate characters, while to Windows it's one line-ending sequence. Not sure why fgets represents it the way it does, though. Can you check the actual file contents?
– Rodney
Nov 10 at 22:38

@Tim Oh yeah, fgets reserves one byte for null, I guess that was a mistype on my part, buf is actually the size of MAX_LINE_LEN.
– areuz
Nov 10 at 22:39

c linux fgets

3 Answers

As you can see in Linux, there is another character before the NL, a Carriage Return.

That is because your files use CR+LF newlines, i.e. each newline is actually two characters: "\r\n".

If you open files without the "b" flag in Windows, its C library will convert each \n you write to \r\n, and each \r\n you read to \n.

Use the "b" fopen() flag in Windows to see the actual file contents.

When you read a line using fgets(buf, sizeof buf, handle), you can use buf[strcspn(buf, "\r\n")] = '\0'; to remove the newline.
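For illustration, here is a minimal sketch of that strcspn() idiom wrapped into two small helpers. The names strip_eol and read_stripped_line are made up for this example and are not part of any library; the only assumption is that cutting the string at the first '\r' or '\n' is what you want.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n'.
 * Returns the length of what remains. */
size_t strip_eol(char *line)
{
    assert(line != NULL);

    /* strcspn() returns the index of the first CR or LF,
     * or strlen(line) if neither is present. */
    size_t len = strcspn(line, "\r\n");
    line[len] = '\0';
    return len;
}

/* Hypothetical helper: read one line with fgets() and strip its ending.
 * Returns 1 on success, 0 on end-of-file or read error. */
int read_stripped_line(char *buf, int size, FILE *in)
{
    assert(buf != NULL && size > 0 && in != NULL);

    if (fgets(buf, size, in) == NULL)
        return 0;
    strip_eol(buf);
    return 1;
}
```

With this, "Andrej Plavka\r\n" and "Andrej Plavka\n" both end up as the same 13-character string, whichever platform read them.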

answered Nov 10 at 22:39

Nominal Animal

27.9k33259

I like the use of the b flag the most, as it removes the difference between the two platforms. I later remove the newline while copying the string anyways, so this allows me to remove both characters and now works in both windows and linux. Thanks.
– areuz
Nov 10 at 22:47


@areuz: You can also use len = strcspn(buf, "\r\n"); to obtain the length of the line excluding the newline, instead of len = strlen(buf);, when copying. If you use a temporary char pointer char *p = fgets(buf, sizeof buf, f_input);, you can skip leading whitespace using p += strspn(p, "\t\n\v\f\r "); and find the length of the rest of the line excluding newline using len = strcspn(p, "\r\n");. There is no strrcspn(), so to remove trailing whitespace, you need e.g. while (len > 1 && isspace((unsigned char)p[len-1])) len--;. Then, copy just len chars starting at p.
– Nominal Animal
Nov 10 at 22:53
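The steps from that comment can be sketched as one function. This is only an illustration: the name trimmed_span and its signature are hypothetical, and it assumes you want a pointer and a length rather than a modified buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the part of line that has no leading or
 * trailing whitespace and no line ending. Stores its length in *len_out
 * and returns a pointer to its first character. line is not modified. */
const char *trimmed_span(const char *line, size_t *len_out)
{
    assert(line != NULL && len_out != NULL);

    /* Skip leading whitespace. */
    const char *p = line + strspn(line, "\t\n\v\f\r ");

    /* Length up to, but not including, the first CR or LF. */
    size_t len = strcspn(p, "\r\n");

    /* Back up over trailing whitespace; the cast keeps isspace()
     * well-defined for bytes above 0x7F. */
    while (len > 0 && isspace((unsigned char)p[len - 1]))
        len--;

    *len_out = len;
    return p;
}
```

A caller would then copy exactly len characters starting at the returned pointer, for example with memcpy(), and add the terminating '\0' itself.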


Windows and Linux have different expectations of a text file's line ending: "\r\n" vs "\n".

To cope, after fgets() use strcspn() to lop off the potential end-of-line sequence, be it "\n", "\r\n" or missing.

fgets(buf, MAX_LINE_LEN, f_input);
buf[strcspn(buf, "\n\r")] = '\0';

Some compilers on Windows will use "\n" as the end-of-line sequence and others use "\r\n". So I attribute the variation to compilers and their manufacturers more so than the OS. Also, some old Mac text files end lines with '\r' and will foul fgets() on Linux.

Further: when a file that uses "\r\n" is read as a text file on a system that expects "\n" as the end-of-line sequence, a full buffer can end as "......\r", with the line remainder "\n" arriving on the next fgets(). Additional processing is needed to cope, as is the case whenever the buffer is insufficient for a line of input.
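One possible shape for that additional processing is sketched below. The function name read_line_checked and the truncated flag are assumptions made for this example: when fgets() stores no '\n', the rest of the physical line is consumed, and the line only counts as truncated if real characters (anything other than '\r') had to be thrown away.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line with fgets(), strip "\n" or "\r\n",
 * and discard whatever did not fit in buf.
 * *truncated is set to 1 only if non-CR characters were discarded.
 * Returns 1 if a line was read, 0 on end-of-file or read error. */
int read_line_checked(char *buf, int size, FILE *in, int *truncated)
{
    assert(buf != NULL && size > 1 && in != NULL && truncated != NULL);

    *truncated = 0;
    if (fgets(buf, size, in) == NULL)
        return 0;

    size_t len = strcspn(buf, "\n");
    if (buf[len] != '\n') {
        /* No '\n' was stored: either the buffer filled up (possibly right
         * after the '\r') or the file ended without a newline. */
        int c;
        while ((c = fgetc(in)) != EOF && c != '\n') {
            if (c != '\r')
                *truncated = 1;
        }
    }

    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}
```

In the "......\r" case the only thing left on the line is the '\n', so the flag stays 0 and the caller sees a complete, cleanly stripped line.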

Text files of one variation are often copied to the other platforms, so this is a not-so-rare occurrence.

Due to editing, some text files will have a mixture of line-ending-sequences.

Pedantic code will read the file as binary and process variant line endings itself without fgets(). Good luck.
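As a rough sketch of what such code could look like, here is a reader built on fgetc() that accepts "\n", "\r\n" and a lone "\r". The name read_line_any_eol is hypothetical, and the choice to silently drop characters that do not fit in the buffer is an assumption of this example; a real program might prefer to report that.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one line from a stream opened in binary mode.
 * "\n", "\r\n" and a lone "\r" all end a line; the ending is not stored.
 * Characters that do not fit in buf are discarded.
 * Returns the stored length, or -1 if end-of-file was hit before any
 * character could be read. */
long read_line_any_eol(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    size_t len = 0;
    int c = fgetc(in);
    if (c == EOF) {
        buf[0] = '\0';
        return -1;
    }

    while (c != EOF && c != '\n' && c != '\r') {
        if (len + 1 < size)          /* keep room for the terminator */
            buf[len++] = (char)c;
        c = fgetc(in);
    }

    if (c == '\r') {
        /* Swallow the LF of a CR+LF pair; push anything else back. */
        int next = fgetc(in);
        if (next != '\n' && next != EOF)
            ungetc(next, in);
    }

    buf[len] = '\0';
    return (long)len;
}
```

Because it looks at one character at a time, a file with mixed line endings is handled the same way as a consistent one.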

edited Nov 10 at 23:32

answered Nov 10 at 22:38

chux

78.6k869144


In C you can open a file stream in text or binary mode. In binary mode, no translation takes place, and the input and output are the bytes in the file. In text mode, the C "newline" character is translated into what is common on the platform in question. On UNIX-like systems, this is a 0A byte, and on DOS-like systems this is a 0D byte followed by a 0A byte. There are other cases on other operating systems, listed here:

https://en.wikipedia.org/wiki/Newline

So that you don't have to cope with every different text format in every program, these all get translated into an \n character as far as the C program sees in the default case (text mode). The input/output layer does the necessary translations for you.

When you use fopen() to open a file stream in C for reading or writing, you provide a "file mode" parameter - you're probably using it here as "r" to read a file, or "w" to write one. If you want no newline translation done, you can specify that the stream is opened in binary mode, with "rb" for reading or "wb" for writing.
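To check what is really stored in a file, something along these lines may help. It is only a sketch, and count_cr_bytes is a name invented for this example; the point is that "rb" makes the program see the same bytes on Windows and on Linux.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the '\r' bytes stored in a file.
 * The file is opened with "rb", so no newline translation happens and
 * the result does not depend on the platform.
 * Returns -1 if the file cannot be opened. */
long count_cr_bytes(const char *path)
{
    assert(path != NULL);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    long count = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (c == '\r')
            count++;
    }

    fclose(in);
    return count;
}
```

A result greater than zero for the input file from the question would confirm that the carriage returns are in the file itself and are not added by Linux.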

edited Nov 10 at 22:52

answered Nov 10 at 22:41

Tim

7,7912344

"you don't have to cope with every different text format in every program, these all get translated into an \n character" is true when reading a text file native to that C program. The trick is when reading in text mode of a file that originated as some other system's text file.
– chux
Nov 10 at 23:40
