Why is this random string generator so bad?

I found this bit of PHP code to generate random strings (alphabetic, alphanumeric, numeric and hexadecimal).

<?php
function random($length = 8, $seeds = 'alpha') {
  // Possible seeds
  $seedings['alpha'] = 'abcdefghijklmnopqrstuvwxyz';
  $seedings['numeric'] = '0123456789';
  $seedings['alphanum'] = 'abcdefghijklmnopqrstuvwxyz0123456789';
  $seedings['hexidec'] = '0123456789abcdef';

  // Choose seed
  if (isset($seedings[$seeds])) {
    $seeds = $seedings[$seeds];
  }

  // Seed generator
  list($usec, $sec) = explode(' ', microtime());
  $seed = (float) $sec + ((float) $usec * 100000);
  mt_srand($seed);

  // Generate
  $str = '';
  $seeds_count = strlen($seeds);

  for ($i = 0; $length > $i; $i++) {
    // Square brackets: the curly-brace string offset syntax was removed in PHP 8.0
    $str .= $seeds[mt_rand(0, $seeds_count - 1)];
  }

  return $str;
}
?>

If I run this function with the default arguments (so it generates strings of 8 characters, alphabetic only) and generate 1,000,000 strings, I would expect my collision rate to be low:

26^8 = 208,827,064,576
1,000,000 / 208,827,064,576 ~= 0.0005%
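
The ratio above is only a rough guide. A minimal sketch of the same arithmetic, plus the usual birthday approximation for the expected number of colliding pairs, assuming the strings are independent and uniformly distributed (the variable names are illustrative, not from the original code):

```php
<?php
// Size of the space: 8 characters drawn from 26 letters.
$space = 26 ** 8;                 // 208827064576
$n     = 1000000;                 // strings generated

// Naive ratio from the estimate above, as a percentage.
$ratio_percent = $n / $space * 100;

// Birthday approximation: expected number of colliding pairs, n(n-1)/(2N).
$expected_pairs = $n * ($n - 1) / (2 * $space);

printf("%.5f%%\n", $ratio_percent);   // 0.00048%
printf("%.2f\n", $expected_pairs);    // 2.39
```

Either way, a healthy generator should produce only a handful of duplicates in 1,000,000 strings, nowhere near 90%.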

In fact, when I run this on my machine, I get a 90% collision rate! Only 10% of my generated strings are unique.

Actually, this is suspiciously close to 10%. Generating several sets of 1,000,000 random strings, I find that each set contains ...

  • 100,032 unique strings
  • 100,035 unique strings
  • 100,032 unique strings
  • 100,028 unique strings
  • 100,030 unique strings
  • ...

Why is this? Does it have something to do with mt_srand or PHP's mt_rand?


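Note: there is no need to seed the random number generator with srand() or mt_srand(), as this is done automatically.

The same function without the seeding: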
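
To see why the seeding in the question caps the number of unique strings, here is a minimal sketch. `random_with_seed` is a hypothetical helper, not part of the original code: it runs the same generation loop but takes the seed as a parameter. The run time of 32 seconds is also an assumption made for illustration; the question does not state how long a run took.

```php
<?php
// Hypothetical helper: the generation loop from the question, with the seed
// passed in explicitly instead of being derived from microtime().
function random_with_seed(int $seed, int $length = 8): string {
    $chars = 'abcdefghijklmnopqrstuvwxyz';
    mt_srand($seed);                       // reseeding restarts the sequence
    $str = '';
    for ($i = 0; $i < $length; $i++) {
        $str .= $chars[mt_rand(0, strlen($chars) - 1)];
    }
    return $str;
}

// Same seed in, same string out.
var_dump(random_with_seed(12345) === random_with_seed(12345)); // bool(true)

// The question's seed is $sec + $usec * 100000, truncated to an integer.
// $usec lies in [0, 1), so within one second only 100,000 integer seeds are
// possible, and each further second of run time shifts that window up by one.
$seconds = 32;                             // assumed run time, for illustration
$possible_seeds = 100000 + ($seconds - 1);
echo $possible_seeds, "\n";                // 100031
```

Since each seed always yields the same string, the number of unique strings cannot exceed the number of distinct seeds. Under the assumed run time of about half a minute, that gives counts close to the 100,030 or so reported in the question.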

<?php
  function random($length = 8, $charset = 'alpha'){
    $list = [
      'alpha' => 'abcdefghijklmnopqrstuvwxyz',
      'numeric' => '0123456789',
      'alphanum' => 'abcdefghijklmnopqrstuvwxyz0123456789',
      'hexidec' => '0123456789abcdef'
    ];

    if(!isset($list[$charset])){
      trigger_error("Invalid charset '$charset', allowed sets: '".implode(', ', array_keys($list))."'", E_USER_NOTICE);
      $charset = 'alpha';
    }

    $str   = '';
    $max   = strlen($list[$charset]) - 1;

    for ($i = 0; $length > $i; $i++) {
      $str .= $list[$charset][mt_rand(0, $max)];
    }

    return $str;
  }

  $loop = 1000000;
  $arr  = []; // generated strings are stored as keys, so duplicates collapse

  for($i=0;$i<$loop;$i++){
    $arr[random()] = true;
  }

  echo $loop - count($arr), " dupes found in list.";
?>

Source: https://habr.com/ru/post/1617333/

