Very strange behavior: printf and strcmp seem to ignore my input string, but only on one line

this is the code:

printf(" DEBUG:%s\n", array[7]);
printf("address of %s is %p (again %d)\n", array[7], array[7], strcmp("N\\A", array[7]));
printf("5DEBUG collection:%s\n", array[7]);

this is the result:

 DEBUG:N\A

 is 0x7c0600 (again -13)

5DEBUG collection:N\A

As you can see, in the second printf, array[7] (which should point to "N\A") has disappeared from the output.

I have no idea what's going on here ...

1 answer

You are reading a Windows file on Unix. Windows and Unix use different line terminators: Unix uses 0x0A, while Windows uses 0x0D followed by 0x0A.

If a line in the file ends with 0x0D 0x0A, Unix treats the 0x0A as the line terminator but keeps the 0x0D as part of the line.

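You can see this in your strcmp, which returns -13. Note that it does not return zero, which means the strings are not equal. In fact, the size of the difference is 13, the decimal value of 0x0D, which confirms that array[7] has 0x0D at the end. (The C standard only guarantees the sign of the result, but many implementations return the difference between the first two bytes that differ, as this one evidently does.)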

Another indication is the odd printing behavior you see. On Unix, printing 0x0D returns the cursor to column 0 of the same line. Therefore, the second printf begins by printing

 address of N\A 

and then it encounters the 0x0D, which moves the cursor back to column 0. The rest of the line then overwrites what was already printed, resulting in

  is 0x7c0600 (again -13) 

If you ran the program in a debugger with a breakpoint at this code, you would see that array[7] has 0x0D at the end.

Added

This was not psychic debugging. It was pretty straightforward. Here it is step by step:

  1. When you notice odd behavior, you should use a debugger to look at the string in array[7]. Had you done this, you would have seen the trailing 0x0D and the problem would have been solved in 5 seconds.
  2. The next big clue was that the result of strcmp was not zero. This means that the string in array[7] is not equal to "N\\A", which is another strong hint that you should use a debugger to look at the string in array[7] and see what it really is.
  3. Without using a debugger, I noted that the difference between array[7] and "N\\A" had to be something invisible, since the first line printed fine. That points to control characters or whitespace.
  4. The fact that strcmp reported a difference of 13 suggests that the string in array[7] has 0x0D at the end: "N\\A" ends with \0 (numeric value zero), and a difference of 13 means the corresponding byte in array[7] is 0x0D, since 0x0D hex = 13 decimal.
  5. If you did not use the logic from step 4, you could stop and think: "What characters would mess up the printing?" You run down the mental list: space, tab (prints several spaces), carriage return (returns the cursor to column 0), newline (moves to the next line), form feed (clears the screen) and escape (introduces console control sequences). The one that fits the data is carriage return, whose ASCII code (surprise) is 13.
  6. If you did not use the logic from steps 4 or 5, you could consider the claim that the problem does not occur on Windows. Windows uses 0x0D 0x0A as its line terminator, while Unix uses 0x0A. The extra 0x0D is a carriage return, which returns the cursor to column 0, and that again matches the evidence.

Thus, there were four independent ways to arrive at the same diagnosis. (Five, if you count “look at the string in the debugger.”) Since they all agreed, the conclusion was pretty certain. My actual analysis started from step 5 and then used the other steps to confirm the diagnosis.


Source: https://habr.com/ru/post/1481286/

