Is ScalaCheck's Gen.pick really random?

I noticed the following unexpected behavior when using ScalaCheck's Gen.pick, which (to me) suggests that its picking is not quite random, even though its documentation says it is:

/** A generator that picks a given number of elements from a list, randomly */

I ran the following three small programs in order (over the span of 2 days, at different times, in case that matters) after setting

implicit override val generatorDrivenConfig = PropertyCheckConfig(
  maxSize = 1000, 
  minSize = 1000, 
  minSuccessful = 1000)

to get a decent sample size.

Program number 1

val set = Set(1,2,3,4,5,6,7,8,9,10,
      11,12,13,14,15,16,17,18,19,20,
      21,22,23,24,25,26,27,28,29,30,
      31,32,33,34,35,36,37,38,39,40,
      41,42,43,44,45,46,47,48,49,50)

// Thanks to @Jubobs for the solution
// See: http://stackoverflow.com/a/43825913/4169924
val g = Gen.pick(3, set).map { _.toList }
forAll (g) { s => println(s) }

Out of the 3,000 numbers generated in each of 2 different runs, I got a surprisingly similar and rather non-random distribution (the numbers are rounded and only the top 5 are listed, as in all the listings below):

  • Number: frequency in run #1, frequency in run #2
  • 15: 33%, 33%
  • 47: 22%, 22%
  • 4: 15%, 16%
  • 19: 10%, 10%
  • 30: 6%, 6%


Program number 2

val list: List[Int] = List.range(1, 50)
val g = Gen.pick(3, list)
forAll (g) { s => println(s) }

With a List, the numbers seem to get "stuck" at the end of the range (3 x 1000 numbers in each of the two runs):

  • 49: 33%, 33%
  • 48: 22%, 22%
  • 47: 14%, 14%
  • 46: 10%, 10%
  • 45: 6%, 6%

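Interestingly, the frequencies are nearly the same as in Program 1.

Note: I repeated the List runs up to 10 times and saw the same distribution to within +/- 1%; I just did not want to list all the numbers here in this makeshift "table" format.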

Program number 3

To mix things up, I ran a third snippet that builds the Set (Program 1) from a List (Program 2):

val set: Set[Int] = List.range(1, 50).toSet
val g = Gen.pick(3, set).map { _.toList }
forAll (g) { s => println(s) }

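Now the numbers are the same as in Program 2 (List wins!), although the frequencies (again, for 3 * 1000 numbers in each of 2 runs) differ slightly: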

  • 49: 33%, 33%
  • 48: 23%, 22%
  • 47: 16%, 15%
  • 46: 9%, 10%
  • 45: 7%, 6%

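Question: although the sample size is not large enough (it never is) to judge true randomness, I cannot help questioning the claimed randomness of Gen.pick (at least out of the box; perhaps I need to set some seed to make it work "more" randomly), since the numbers get "stuck" and the frequencies are almost identical.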

Looking at the source of Gen.pick, line #672 uses a certain seed0:

def pick[T](n: Int, l: Iterable[T]): Gen[Seq[T]] = {
    if (n > l.size || n < 0) throw new IllegalArgumentException(s"invalid choice: $n")
    else if (n == 0) Gen.const(Nil)
    else gen { (p, seed0) =>
    // ...

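which I cannot find defined anywhere else (in the Gen.scala source or in the scala.util.Random documentation), but I suspect it has something to do with the observed behavior. Is this the expected behavior of Gen.pick? If so, how can I get "more" random picking?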

+6
2

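Although @ashawley's answer has already been accepted, I do not think it is correct. I think this is actually a bug, introduced by a 2016 commit from erik-stripe, and that the bug is in the line

      val i = (x & 0x7fffffff).toInt % n

which should have been

      val i = (x & 0x7fffffff).toInt % count

although even that is still not quite correct.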

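I also suspect that your 33% for the last value is really 100%: you did not account for the fact that you pick 3 elements, so all of your statistics should be multiplied by 3. For a 3-element pick the last element is therefore selected 100% of the time, the one before it 66.6% of the time, and so on, which is even worse than you thought.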

Here is the excerpt from the code again:

else gen { (p, seed0) =>
  val buf = ArrayBuffer.empty[T]
  val it = l.iterator
  var seed = seed0
  var count = 0
  while (it.hasNext) {
    val t = it.next
    count += 1
    if (count <= n) {
      buf += t
    } else {
      val (x, s) = seed.long
      val i = (x & 0x7fffffff).toInt % n
      if (i < n) buf(i) = t
      seed = s
    }
  }
  r(Some(buf), seed)
}

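So what is this code supposed to do, and what does it actually do? The if (count <= n) branch fills the output buf with the first n elements, and after that only the else branch runs. To make this clearer, I rewrote the while loop, moving the if outside: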

  for (i <- 0 until  n) {
    val t = it.next
    buf += t
  }
  while (it.hasNext) {
    val t = it.next
    val (x, s) = seed.long
    val i = (x & 0x7fffffff).toInt % n
    if (i < n) buf(i) = t
    seed = s
  }

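Now it becomes obvious that the else branch has to decide both whether the current element goes into buf and which element it replaces there. The current code clearly always selects every element, because if (i < n) is always true given that i is computed as something % n. That is why you see such a large skew toward the last elements.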
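The skew described above can be reproduced without ScalaCheck. The following is only a sketch: it copies the loop from the excerpt but draws its numbers from scala.util.Random instead of ScalaCheck's seed, and the names BiasedPickSketch, biasedPick and inclusionRates are made up for illustration.

```scala
import scala.collection.mutable
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object BiasedPickSketch {
  // Same loop shape as the excerpt, but driven by scala.util.Random
  // rather than ScalaCheck's Seed (an assumption made for illustration).
  def biasedPick[T](n: Int, l: Iterable[T], rnd: Random): Seq[T] = {
    val buf = ArrayBuffer.empty[T]
    val it = l.iterator
    var count = 0
    while (it.hasNext) {
      val t = it.next()
      count += 1
      if (count <= n) {
        buf += t
      } else {
        val i = (rnd.nextLong() & 0x7fffffff).toInt % n // always in [0, n)
        if (i < n) buf(i) = t                           // so this always fires
      }
    }
    buf.toSeq
  }

  // Fraction of trials in which each element ends up in the pick.
  def inclusionRates(n: Int, l: Seq[Int], trials: Int, rnd: Random): Map[Int, Double] = {
    val hits = mutable.Map.empty[Int, Int].withDefaultValue(0)
    for (_ <- 1 to trials; x <- biasedPick(n, l, rnd)) hits(x) += 1
    hits.toMap.map { case (k, v) => k -> v.toDouble / trials }
  }
}
```

Calling BiasedPickSketch.inclusionRates(3, 1 until 50, 10000, new Random()) should report a rate of 1.0 for 49, since the last element always overwrites a slot and nothing comes after it, and roughly two thirds for 48.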

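Evidently the plan was to use a modified Fisher-Yates shuffle that keeps only the first n elements of the shuffle. To do that properly you need random numbers in the range [0, count), which is probably why the code is written the way it is, i.e. keeping the counter inside the while loop.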
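For comparison, here is a minimal sketch of the same single-pass loop with the index drawn uniformly from [0, count). It uses scala.util.Random.nextInt(count), which is documented to be uniform over that range, in place of ScalaCheck's seed; ReservoirSketch and reservoirPick are hypothetical names, and this is not the library's actual code.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object ReservoirSketch {
  // One pass over the input: keep the first n elements, then let the
  // count-th element replace a random slot with probability n / count.
  def reservoirPick[T](n: Int, l: Iterable[T], rnd: Random): Seq[T] = {
    require(n >= 0, s"invalid choice: $n")
    val buf = ArrayBuffer.empty[T]
    val it = l.iterator
    var count = 0
    while (it.hasNext) {
      val t = it.next()
      count += 1
      if (count <= n) {
        buf += t
      } else {
        val i = rnd.nextInt(count) // uniform in [0, count)
        if (i < n) buf(i) = t      // true only n times out of count
      }
    }
    buf.toSeq
  }
}
```

With this version every element of a 50-element input should appear in about 3/50 = 6% of the 3-element picks, rather than piling up at the end of the collection.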

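Using % count is still not quite right, because such a simple approach does not give a uniform distribution when count is not a power of 2. To be fairer, something like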

    val c0 = choose(0, count-1)
    val rt: R[Int] = c0.doApply(p, seed)        
    seed = rt.seed      
    val i = rt.retrieve.get // index to swap current element with. Should be fair random number in range [0, count-1], see Fisher–Yates shuffle
    if (i < n) buf(i) = t

or some other way of producing i as a fairly uniformly distributed number in that range should be used.

Update (why plain % count is wrong)

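You can look at the implementation of java.util.Random.nextInt(int) or org.scalacheck.Choose.chLng for examples of how this should be done. It is more involved than a plain % count, and for good reason. To illustrate, consider the following example. Suppose your source random generator produces uniformly random 3-bit values, i.e. in the range [0, 7], and you want a random number in the range [0, 2], which you obtain simply as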

srcGenerator.nextInt() % 3

Now consider how the values in [0, 7] map onto your range [0, 2]:

  • 0, 3, 6 map to 0 (i.e. 3 values)
  • 1, 4, 7 map to 1 (i.e. 3 values)
  • 2, 5 map to 2 (i.e. only 2 values)

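So with a plain % 3 your distribution is 0 - 3/8, 1 - 3/8, 2 - 2/8, which is clearly not uniform. This is why the implementations I mentioned above use some kind of loop and discard some of the values produced by the source generator. That is what it takes to produce a uniform distribution.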
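The 3-bit example can be checked by enumeration, and the "loop and discard" idea can be sketched in a few lines. This is an illustration only, not the code of java.util.Random or ScalaCheck; ModuloBiasSketch, residueCounts and uniformBelow are hypothetical names, and the source generator is simulated with scala.util.Random.nextInt.

```scala
import scala.util.Random

object ModuloBiasSketch {
  // How many of the 2^bits source values land on each residue mod bound.
  def residueCounts(bits: Int, bound: Int): Map[Int, Int] =
    (0 until (1 << bits)).groupBy(_ % bound).map { case (r, vs) => r -> vs.size }

  // Rejection sampling: redraw any value that falls into the incomplete
  // final block, so every residue is backed by the same number of values.
  def uniformBelow(bound: Int, bits: Int, rnd: Random): Int = {
    val range = 1 << bits
    require(bits > 0 && bits < 31 && bound > 0 && bound <= range)
    val limit = range - (range % bound) // largest multiple of bound <= range
    var x = rnd.nextInt(range)          // stands in for the raw source generator
    while (x >= limit) x = rnd.nextInt(range)
    x % bound
  }
}
```

ModuloBiasSketch.residueCounts(3, 3) gives Map(0 -> 3, 1 -> 3, 2 -> 2), matching the list above. In uniformBelow(3, 3, rnd) the source values 6 and 7 are discarded, which leaves exactly two source values per result.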

+5

I do not think it has anything to do with the seed. It has everything to do with ScalaCheck's heuristic.

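There is a subtle bug. Consider what the code is doing. It starts by taking the values at the beginning and then overwrites them: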

while (it.hasNext) {
  val t = it.next
  count += 1
  if (count <= n) {
    buf += t
  } else {
    val (x, s) = seed.long
    val i = (x & 0x7fffffff).toInt % n
    if (i < n) buf(i) = t
    seed = s
  }
  ...

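The else branch runs for every element after the first n, and each time it overwrites an element already in the buffer.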

So pick is weighted unfairly toward the elements at the end of the collection.


One way around it would be a reverse shuffle followed by pick.


Update

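On closer inspection, the algorithm is evidently meant to do the pick in a single O(n) pass. The idea is sound; the problem is the modulus. The index should be drawn relative to the kth element seen so far (count), not n. The line

val i = (x & 0x7fffffff).toInt % n

should be:

val i = (x & 0x7fffffff).toInt % count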

PR for ScalaCheck:

https://github.com/rickynils/scalacheck/pull/333

+1

Source: https://habr.com/ru/post/1017071/

