The /dev/urandom pseudo device along with dd can do this for you:
    dd if=/dev/urandom of=newfile bs=1M count=10
This will create a file named newfile, 10 MiB in size, filled with random bytes.
The /dev/random device often blocks if not enough randomness has built up; /dev/urandom will not block. If you're using the randomness for crypto-grade stuff, you can steer clear of urandom. For anything else, it should be sufficient and is most likely faster.
If you want to corrupt just parts of your file (not the whole file), you can simply use the C-style random functions. Use rand() to work out an offset and a length n, then call it n more times to get the random bytes to overwrite your file with.
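For comparison, the same partial overwrite can be sketched with dd alone. This is only a rough sketch under a few assumptions: the function names corrupt_range and corrupt_random are made up for illustration, dd is assumed to support seek=, count= and conv=notrunc, and the shell is assumed to do arithmetic wide enough to hold a 32-bit unsigned value.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; names and arguments are assumptions.

# corrupt_range FILE OFFSET LENGTH
# Overwrite LENGTH bytes of FILE, starting at 0-based byte OFFSET,
# with bytes read from /dev/urandom.
corrupt_range() {
    # bs=1 makes seek= and count= work in bytes; conv=notrunc stops dd
    # from truncating FILE, so everything outside the range is preserved.
    dd if=/dev/urandom of="$1" bs=1 seek="$2" count="$3" conv=notrunc 2>/dev/null
}

# corrupt_random FILE LENGTH
# Same, but at a random offset chosen so the range fits inside FILE.
corrupt_random() {
    size=$(( $(wc -c < "$1") ))
    [ "$2" -le "$size" ] || return 1
    # Read 4 random bytes as one unsigned decimal integer.
    rnd=$(od -An -N4 -tu4 /dev/urandom | tr -dc '0-9')
    corrupt_range "$1" $(( rnd % (size - $2 + 1) )) "$2"
}
```

Something like `corrupt_random testfile 14` would then overwrite 14 bytes somewhere in testfile. Unlike the character-set approach, this writes arbitrary byte values, so the result may no longer be printable text.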
The following Perl script shows how this can be done (without worrying about compiling C code):
    use strict;
    use warnings;

    sub corrupt ($$$$) {
        # Get parameters, names should be self-explanatory.
        my $filespec = shift;
        my $mincount = shift;
        my $maxcount = shift;
        my $charset  = shift;

        # Work out position and size of corruption.
        my @fstat = stat ($filespec);
        my $size  = $fstat[7];
        my $count = $mincount + int (rand ($maxcount + 1 - $mincount));
        my $pos   = 0;
        if ($count >= $size) {
            $count = $size;
        } else {
            # The + 1 lets the corrupted range end on the last byte.
            $pos = int (rand ($size - $count + 1));
        }

        # Report what is about to be corrupted.
        my $last = $pos + $count - 1;
        print "'$filespec', $size bytes, corrupting $pos through $last\n";

        # Open file, seek to position, corrupt and close.
        open (my $fh, '+<', $filespec) || die "Can't open $filespec: $!";
        seek ($fh, $pos, 0);
        while ($count-- > 0) {
            # rand (length) yields 0 .. length - 1, always a valid index.
            my $newval = substr ($charset, int (rand (length ($charset))), 1);
            print $fh $newval;
        }
        close ($fh);
    }

    # Test harness.
    system ("echo ==========");           #DEBUG
    system ("cp base-testfile testfile"); #DEBUG
    system ("cat testfile");              #DEBUG
    system ("echo ==========");           #DEBUG

    corrupt ("testfile", 8, 16, "ABCDEFGHIJKLMNOPQRSTUVWXYZ ");

    system ("echo ==========");           #DEBUG
    system ("cat testfile");              #DEBUG
    system ("echo ==========");           #DEBUG
It consists of the corrupt function, which you call with the file name, the minimum and maximum corruption size, and the character set to draw the corrupting characters from. The bit at the bottom is just test-harness code. Below is some sample output, where you can see that a section of the file has been corrupted:
    ==========
    this is a file with nothing in it except for lowercase
    letters (and spaces and punctuation and newlines).
    that will make it easy to detect corruptions from the
    test program since the character range there is from
    uppercase a through z.
    i have to make it big enough so that the random stuff
    will work nicely, which is why i am waffling on a bit.
    ==========
    'testfile', 344 bytes, corrupting 122 through 135
    ==========
    this is a file with nothing in it except for lowercase
    letters (and spaces and punctuation and newlines).
    that will make iFHCGZF VJ GZDYct corruptions from the
    test program since the character range there is from
    uppercase a through z.
    i have to make it big enough so that the random stuff
    will work nicely, which is why i am waffling on a bit.
    ==========
It's tested at a basic level, but you may find there are edge cases that need to be taken care of. Do with it what you will.
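To confirm which bytes a run actually changed, the corrupted copy can be compared against the original. A minimal sketch, assuming a POSIX cmp; the function name changed_bytes is made up for illustration. The count can be lower than the size of the corrupted range, because a random replacement may happen to match the original byte: in the sample output above, two of the 14 overwritten positions are spaces that were replaced by spaces.

```shell
# changed_bytes ORIGINAL COPY
# Print how many byte positions differ between two files of equal length.
# Hypothetical helper for illustration.
changed_bytes() {
    # cmp -l prints one line per differing byte (1-based offset, then both
    # values in octal), so counting lines counts the changed bytes.
    # The arithmetic expansion strips any padding wc adds to its output.
    echo $(( $(cmp -l "$1" "$2" | wc -l) ))
}
```

For the test harness above, that would be `changed_bytes base-testfile testfile`; running `cmp -l base-testfile testfile` directly lists the individual offsets instead.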