Why is this particular block closure optimization valid and beneficial?

In a very interesting post from 2001, Allen Wirfs-Brock explains how to implement block closures without materializing their own stack.

Of the many ideas he presents, there is one that I do not quite understand, and I thought it would be nice to ask about it here. He says:

Any variable that can never be assigned during the entire lifetime of a block (for example, arguments of the enclosing methods and blocks) need not be placed in the environment if, instead, a copy of the variable is placed in the closure when it is created.

There are two things that I'm not sure I understand well:

  • Why is keeping two copies of a read-only variable faster than moving the variable to the environment? Is it because the enclosing context can access the (original) variable on the stack faster?
  • How can we guarantee that the two copies remain synchronized?

Question 1 must have some other rationale; otherwise, I do not see a gain (compared to the cost of implementing the optimization).

For Question 2, note that the argument could be assigned in the method, not in the block. Why, then, does the value stored on the stack remain unchanged during the life of the block?

I think I know the answer to Q2: the execution of the block cannot be interleaved with the execution of the method, that is, while the block runs, its enclosing activation does not, so the copy cannot go stale. But is there no other way to change the stack temporary while the block is alive?

1 answer

Thanks to a comment by @aka.nice, I found the answers to both questions in a post by Clément Béra, which is a pleasant and clear read.

For Q1, let's first say that Allen's remark means that a copy of the read-only variable can be pushed onto the block's stack frame, as if it were a local temporary of the block. The advantage of this only materializes if every variable defined outside the block and used inside it is never written inside the block. Under those conditions, there is no need to create an environment array or to emit any prologue or epilogue code to take care of it.

The machine code that accesses a stack variable is equivalent to the code required to access the environment, because the former accesses the location with [ebp + offset], while the latter uses [edi + offset] after edi has been set to point to the environment array (tempVector in Clément's notation). Thus, there is no benefit if some, but not all, of the captured variables are read-only.

The second question was also answered in Clément's wonderful blog. Yes, there is another way to break the synchronization between the original variable and its copy on the block's stack: the debugger (as aka.nice would say!). If the programmer changes the variable in the enclosing context, the debugger needs to detect the write and update the copy. The same applies if the programmer modifies the copy stored on the block's stack.

I am glad that I decided to post this question here. The help I received from aka.nice and Clément Béra, as well as the comments of some people I emailed, greatly improved my understanding.

One last remark. Wirfs-Brock argues that avoiding the reification of method contexts is a must. I tend to agree. However, many important operations on these data structures can be implemented better if reification is only emulated. More precisely, when debugging, one can model these contexts with "viewers" that point into the native stack and use two indexes to delimit the portion corresponding to the activation under inspection. This is both efficient and clean, and combining both techniques gives the best of both worlds, because you get speed and expressiveness at once. Smalltalk is amazing.


Source: https://habr.com/ru/post/981369/

