scanf %d segfault with large input

Question

So, I ran a static code analyzer over some C code, and one thing that surprised me was a warning about:

int val; scanf("%d", &val);

which said that for large enough input, this could lead to a segfault. And sure enough, this can actually happen. Now the fix is simple enough (specify a width, since we know how many digits a valid integer can have, depending on the architecture), but I wonder why this happens in the first place and why it isn't considered a bug in libc (and a simple one at that)?
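A minimal sketch of the width fix mentioned above. It uses sscanf on a string so it can be exercised without stdin; the helper name parse_int_bounded is hypothetical, and the width of 9 assumes int is at least 32 bits (so any 9-character field fits). The same "%9d" conversion works with scanf.

```c
#include <stdio.h>

/* Hypothetical helper: parse at most 9 characters of a decimal integer.
 * Assumption: int is at least 32 bits, so any 9-character field fits. */
static int parse_int_bounded(const char *text, int *out)
{
    /* The field width caps how many characters %d may consume. */
    return sscanf(text, "%9d", out) == 1;
}
```

Note that the width only bounds how much input is consumed; the digits beyond the width stay in the input and would have to be handled by the caller.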

Now I assume there is some reason for this behavior that I'm missing?

Edit: Well, since the question doesn't seem to be clear enough, here is a bit more explanation: no, the code analyzer doesn't warn about scanf in general, but specifically about scanf reading a number without a width specified.

So here is a minimal working example:

 #include <stdlib.h>
 #include <stdio.h>
 int main()
 {
     int val;
     scanf("%d", &val);
     printf("Number not large enough.\n");
     return 0;
 }

We can get segfault by sending a giant number (using, for example, Python):

 import subprocess
 cmd = "./test"
 p = subprocess.Popen(cmd, stdin=subprocess.PIPE, shell=True)
 p.communicate("9"*50000000000000) # program will segfault, if not make number larger

+6

c scanf

Voo Jul 02 '11 at 2:25

3 answers

Edited, since I missed the fact that you are running this through a static code analyzer.

If the %d format matches the size of int, then overflow should not affect what is written to val through the pointer, since it should always be an int. Try passing a pointer to a long int and see if the analyzer still gives the warning. Then try changing %d to %ld, keeping the long int pointer, and see if the warning appears again.

I believe the standard should say something about %d and the type it requires. Maybe the analyzer is worried that on some system int may be smaller than what %d expects? That sounds weird to me.

Running your gcc-compiled example (I have Python 2.6.6), I get

 Traceback (most recent call last):
   File "./feed.py", line 4, in <module>
     p.communicate("9"*50000000000000)
 OverflowError: cannot fit 'long' into an index-sized integer
 Number not large enough.

Then I tried to run this instead:

 perl -e 'print "1"x6000000000000000;' |./test

and modified the C part to print

 printf("%d Number not large enough.\n", val);

I get as output

 5513204 Number not large enough.

where the number changes on each run. It never segfaults, so the GNU scanf implementation is safe, although the resulting number is wrong.
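Since %d gives no way to tell that the stored number is wrong, one common alternative is to parse with strtol, which reports out-of-range input through errno. The following is only a sketch under that assumption; the helper name parse_int_checked is hypothetical and not part of the code in the question.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value on success,
 * 0 if no digits were found or the value does not fit in an int. */
static int parse_int_checked(const char *text, int *out)
{
    char *end = NULL;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text)                          /* no digits consumed */
        return 0;
    if (errno == ERANGE)                      /* does not fit in a long */
        return 0;
    if (value < INT_MIN || value > INT_MAX)   /* fits a long, not an int */
        return 0;

    *out = (int)value;
    return 1;
}
```

With this approach, a string of a few dozen nines is rejected instead of silently producing an arbitrary value.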

+2

Shintakezou Jul 02 '11 at 8:40

The first step in processing an integer is to isolate the sequence of digits. If that sequence is longer than expected, it may overflow a fixed-length buffer, leading to a segmentation fault.

You can achieve a similar effect with doubles. Pushed to extremes, you can write 1 followed by a thousand zeros and an exponent of -1000 (net value 1). In fact, when I tested this a few years ago, Solaris handled 1000 digits with aplomb; it was at a little under 1024 that it ran into difficulties.

So, there is an element of QoI (quality of implementation). There is also an element of "to follow the C standard, scanf() cannot stop reading before it encounters a non-digit". These are conflicting goals.
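One way to reconcile the two goals can be sketched as follows: keep consuming digits until a non-digit appears, but stop storing them once the fixed-length buffer is full. This is a hypothetical illustration, not the code of any actual libc; the function name collect_digits and its interface are assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical illustration, not libc code: copy a run of digits from src
 * into buf (capacity cap, including the terminator). Digits that do not
 * fit are still consumed but not stored. Returns the digits consumed. */
static size_t collect_digits(const char *src, char *buf, size_t cap)
{
    size_t consumed = 0;
    size_t stored = 0;

    while (isdigit((unsigned char)src[consumed])) {
        if (cap > 0 && stored + 1 < cap)   /* leave room for '\0' */
            buf[stored++] = src[consumed];
        consumed++;
    }
    if (cap > 0)
        buf[stored] = '\0';
    return consumed;
}
```

A caller can compare the returned count with the buffer capacity to detect that the digit sequence was truncated and report an overflow instead of writing past the buffer.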

+1

Jonathan Leffler Jul 02 '11 at 4:48

Philip Craig · Accepted Answer · 2011-09-20T00:54:25+0000

If the static analyzer is cppcheck, then it warns about this because of an error in glibc, which has since been fixed: http://sources.redhat.com/bugzilla/show_bug.cgi?id=13138
