Scala Stream tail laziness and synchronization

In one of his videos (on Scala's lazy evaluation, namely the lazy keyword), Martin Odersky shows the following implementation of the cons operation used to build a Stream:

    def cons[T](hd: T, tl: => Stream[T]) = new Stream[T] {
      def head = hd
      lazy val tail = tl
      ...
    }

The tail operation is written concisely using the language's lazy evaluation feature.

But actually (in Scala 2.11.7), the tail implementation is a little less elegant:

    @volatile private[this] var tlVal: Stream[A] = _
    @volatile private[this] var tlGen = tl _
    def tailDefined: Boolean = tlGen eq null
    override def tail: Stream[A] = {
      if (!tailDefined)
        synchronized {
          if (!tailDefined) {
            tlVal = tlGen()
            tlGen = null
          }
        }
      tlVal
    }

Double-checked locking and two mutable fields: this is roughly how you would implement a thread-safe lazy computation in Java.

So the questions are:

  • Does Scala's lazy keyword provide an "evaluated at most once" guarantee in a multi-threaded setting?
  • Is the pattern used in the actual tail implementation an idiomatic way to perform thread-safe lazy evaluation in Scala?
+5
3 answers

Does Scala's lazy keyword provide an "evaluated at most once" guarantee in a multi-threaded setting?

Yes, it does, as others have stated.

Is the pattern used in the actual tail implementation an idiomatic way to perform thread-safe lazy evaluation in Scala?

Edit:

I think I have the actual answer to why lazy val is not used. Stream has public API methods such as hasDefiniteSize, inherited from TraversableOnce. To find out whether a Stream has a finite size or not, we need a way to check without materializing the underlying Stream tail. Since lazy val does not expose the underlying initialization bit, we cannot do this.

This is supported by SI-1220.

To reinforce this point, @Jasper-M points out that the new LazyList API in strawman (the Scala 2.13 collections redesign) no longer has this problem, since the entire collection hierarchy has been redesigned.


Performance Issues

I would say "it depends" on the angle from which you look at this problem. From a LOB (line-of-business) perspective, I would definitely say go with lazy val for brevity and clarity of implementation. But if you look at it from the point of view of a Scala collections library author, things start to look different. Think of it this way: you are creating a library that will potentially be used by many people and run on many machines around the world. This means you should think about the memory overhead of each structure, especially if you are creating such an essential data structure yourself.

I say this because when you use lazy val, by design, an extra Boolean field is generated that indicates whether the value has been initialized, and I assume this is what the library authors sought to avoid. The size of a Boolean on the JVM of course depends on the VM, but even a byte is something to consider, especially when people generate large Streams of data. Again, this is definitely not something I would usually consider, and it is certainly a micro-optimization of memory usage.
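As a rough way to see the extra field mentioned above, the following sketch lists the JVM fields of a class that contains a single lazy val. LazyHolder and ShowLazyFields are hypothetical names made up for this illustration, and the field names that get printed are compiler internals, so they may differ between Scala versions.

```scala
// Hypothetical class used only for this illustration.
class LazyHolder {
  lazy val x: Int = 5
}

object ShowLazyFields {
  def main(args: Array[String]): Unit = {
    // Print every field the compiler generated for LazyHolder.
    // With the Scala 2 encoding shown elsewhere in this thread, one would
    // expect the value field plus an initialization flag (bitmap) field.
    classOf[LazyHolder].getDeclaredFields.foreach { f =>
      println(s"${f.getName}: ${f.getType.getSimpleName}")
    }
  }
}
```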

The reason I think performance is one of the key points here is SI-7266, which fixes a memory leak in Stream. Note how important it is to keep track of the bytecode to make sure that no extra values are retained inside the generated class.

The difference in the implementation is that whether tail has been initialized is determined by a method that checks the generator:

 def tailDefined: Boolean = tlGen eq null 

rather than by a separate field in the class.
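To make the idea concrete, here is a minimal sketch of the same pattern outside of Stream. LazyCell, isDefined and get are hypothetical names invented for this example (they are not part of the standard library), and the code assumes Scala 2 syntax, as in the Stream source quoted in the question. The point is that isDefined can be queried without forcing the computation, which a plain lazy val does not allow.

```scala
// Hypothetical helper for illustration; not part of the Scala standard library.
final class LazyCell[A](gen: => A) {
  @volatile private[this] var value: A = _
  // Same trick as tlGen in Stream: keep the generator until it has been run.
  @volatile private[this] var thunk: () => A = gen _

  // True once the value has been computed; never triggers the computation.
  def isDefined: Boolean = thunk eq null

  // Double-checked locking: the generator runs at most once.
  def get: A = {
    if (!isDefined) synchronized {
      if (!isDefined) {
        value = thunk()
        thunk = null // mirrors `tlGen = null` in the Stream implementation
      }
    }
    value
  }
}

object LazyCellDemo {
  def main(args: Array[String]): Unit = {
    val cell = new LazyCell({ println("computing"); 42 })
    println(cell.isDefined) // false, and nothing has been computed yet
    println(cell.get)       // prints "computing", then 42
    println(cell.isDefined) // true
  }
}
```

The definedness check costs no extra field here because it reuses the generator reference, which is the trade-off the answer describes.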

+4

Scala lazy vals are evaluated only once in multi-threaded cases. This is because the evaluation of a lazy member is actually wrapped in a synchronized block in the generated code.

Let's look at a simple class:

 class LazyTest { lazy val x = 5 } 

Now let's compile this with scalac:

 scalac -Xprint:all LazyTest.scala 

This will print something like:

    package <empty> {
      class LazyTest extends Object {
        final <synthetic> lazy private[this] var x: Int = _;
        @volatile private[this] var bitmap$0: Boolean = _;
        private def x$lzycompute(): Int = {
          LazyTest.this.synchronized(if (LazyTest.this.bitmap$0.unary_!()) {
            LazyTest.this.x = (5: Int);
            LazyTest.this.bitmap$0 = true
          });
          LazyTest.this.x
        };
        <stable> <accessor> lazy def x(): Int =
          if (LazyTest.this.bitmap$0.unary_!()) LazyTest.this.x$lzycompute() else LazyTest.this.x;
        def <init>(): LazyTest = {
          LazyTest.super.<init>();
          ()
        }
      }
    }

You should be able to see ... that lazy evaluation is thread safe. You will also see some similarities with the "less elegant" implementation in Scala 2.11.7.

You can also experiment with tests similar to the following:

    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    case class A(i: Int) {
      // The println shows how many times the initializer actually runs.
      lazy val j = {
        println("calculating j")
        i + 1
      }
    }

    def checkLazyInMultiThread(): Unit = {
      val a = A(6)
      val futuresList = Range(1, 20).toList.map(i => Future {
        println(s"Future $i :: ${a.j}")
      })
      Future.sequence(futuresList).onComplete(_ => println("completed"))
    }

    checkLazyInMultiThread()
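A variant of this experiment that counts evaluations instead of reading console output can be sketched as follows. LazyOnceCheck, Holder and run are hypothetical names made up for this sketch. It blocks with Await so that the result is available before the program exits, and the 10-second timeout is an arbitrary choice.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object LazyOnceCheck {
  // Returns (number of times the lazy initializer ran, values seen by the futures).
  def run(): (Int, List[Int]) = {
    val evaluations = new AtomicInteger(0)

    class Holder(i: Int) {
      lazy val j: Int = {
        evaluations.incrementAndGet() // count each run of the initializer
        i + 1
      }
    }

    val h = new Holder(6)
    // Twenty futures race to read the same lazy val.
    val futures = (1 to 20).toList.map(_ => Future(h.j))
    val results = Await.result(Future.sequence(futures), 10.seconds)
    (evaluations.get, results)
  }

  def main(args: Array[String]): Unit = {
    val (count, results) = run()
    println(s"evaluations = $count")           // evaluations = 1
    println(s"distinct = ${results.distinct}") // distinct = List(7)
  }
}
```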

The implementation in the standard library avoids lazy because its authors can provide a more efficient solution than this generic lazy translation.

+2
  • You are right: lazy vals use locking precisely to guard against double evaluation when two threads access them at the same time. Future developments, moreover, will provide the same guarantees without locking.
  • What is idiomatic is, in my humble opinion, a highly debatable question when it comes to a language that, by design, allows a wide range of different idioms to be adopted. In general, however, application code tends to be considered idiomatic when it leans toward pure functional programming, since that brings a number of interesting advantages in terms of ease of testing and reasoning, which it would make sense to give up only in the face of serious concerns. One such concern may be performance, which is why the current implementation of the Scala Collection API, while exposing a functional interface in most cases, makes heavy use (internally and in restricted scopes) of vars, while loops and established patterns from imperative programming (like the one you highlighted in your question).
+1

Source: https://habr.com/ru/post/1274523/