Disclaimer: I am not a PHP internals expert (yet?), so all of this is from my understanding and is not guaranteed to be 100% correct or complete. :)
So, firstly, the PHP 7 behaviour, which, I note, is also followed by HHVM, appears to be correct, and PHP 5 has a bug here. There should be no extra assign-by-reference behaviour here, because regardless of execution order, the result of the two calls to ++$i should never be the same.
The opcodes look fine; crucially, we have two temporary variables, $2 and $3, to hold the two increment results. But somehow, PHP 5 is acting as though we had written this:
$i = 2;
$i++; $temp1 =& $i;
$i++; $temp2 =& $i;
echo $temp1 + $temp2;
Instead of this:
$i = 2;
$i++; $temp1 = $i;
$i++; $temp2 = $i;
echo $temp1 + $temp2;
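The difference between the two snippets above is aliasing versus snapshotting. As a rough analogy only (plain C pointers standing in for PHP references; the function names are made up for illustration), the same contrast can be sketched like this:

```c
#include <assert.h>

/* Analogue of "$temp1 =& $i": both temporaries alias the variable. */
long sum_with_aliases(void) {
    long i = 2;
    i++;
    long *temp1 = &i;
    i++;
    long *temp2 = &i;
    assert(temp1 == temp2);  /* one storage location, read twice */
    return *temp1 + *temp2;  /* 4 + 4 */
}

/* Analogue of "$temp1 = $i": each temporary is a snapshot of the value. */
long sum_with_copies(void) {
    long i = 2;
    i++;
    long temp1 = i;
    i++;
    long temp2 = i;
    return temp1 + temp2;    /* 3 + 4 */
}
```

Called from any main, sum_with_aliases() returns 8 and sum_with_copies() returns 7, which are the same values the two PHP snippets print.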
Edit: It was pointed out on the PHP Internals mailing list that using multiple operations that modify a variable within a single statement is generally considered "undefined behaviour", and ++ is used as an example of this in C/C++.
As such, it is reasonable for PHP 5 to return the value that it does for implementation/optimisation reasons, even if it is logically inconsistent with a sane serialization into multiple statements.
The (relatively new) PHP language specification contains similar language and examples:
Unless stated explicitly in this specification, the order in which the operands in an expression are evaluated relative to each other is unspecified. [...] (For example, [...] in the full expression $j = $i + $i++, whether the value of $i is the old or new $i, is unspecified.)
Arguably, this is a weaker claim than "undefined behaviour", since it implies that the operands are evaluated in some particular order, but we are into nit-picking now.
phpdbg investigation (PHP 5)
I was curious and wanted to learn more about the internals, so I did some playing around with phpdbg.
No references
Running the code with $j = $i in place of $j =& $i, we start with two variables sharing an address, with a refcount of 2 (but no is_ref flag):
Address         Refs  Type       Variable
0x7f3272a83be8  2     (integer)  $i
0x7f3272a83be8  2     (integer)  $j
But as soon as you pre-increment, the zvals are separated, and only one temp var is sharing with $i, giving a refcount of 2:
Address         Refs  Type       Variable
0x7f189f9ecfc8  2     (integer)  $i
0x7f189f859be8  1     (integer)  $j
With a reference assignment
When the variables have been bound together, they share an address, with a refcount of 2 and a by-ref marker:
Address         Refs  Type       Variable
0x7f9e04ee7fd0  2     (integer)  &$i
0x7f9e04ee7fd0  2     (integer)  &$j
After the pre-increments (but before the addition), the same address has a refcount of 4, showing the 2 temp vars erroneously bound by reference:
Address         Refs  Type       Variable
0x7f9e04ee7fd0  4     (integer)  &$i
0x7f9e04ee7fd0  4     (integer)  &$j
The source of the issue
Digging into the source at http://lxr.php.net, we can find the implementation of the ZEND_PRE_INC opcode:
PHP 5
The crucial line is this:
SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
So, we create a new zval for the result value only if it is not currently a reference. Further down, we have this:
if (RETURN_VALUE_USED(opline)) {
    PZVAL_LOCK(*var_ptr);
    EX_T(opline->result.var).var.ptr = *var_ptr;
}
So, if the return value of the increment is actually used, we need to "lock" the zval, which, after following a whole series of macros, basically means "increment its refcount", before assigning it as the result.
If we created a new zval earlier, that is fine: our refcount is now 2, 1 for the actual variable plus 1 for the operation result. But if we decided not to, because we needed to keep a reference, we are just incrementing the existing reference count and pointing at a zval that may be about to be changed again.
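To make that mechanism concrete, here is a toy model in C. It is only a sketch under loud assumptions: toy_zval, toy_separate_if_not_ref and toy_pre_inc_php5 are invented names that imitate my reading of the macros above, not the real Zend structures or handler. It models ++$i + ++$i after either $j = $i or $j =& $i.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a PHP 5 zval; NOT the real Zend structure. */
typedef struct {
    long value;
    int refcount;
    int is_ref;
} toy_zval;

static toy_zval *toy_new(long value, int refcount, int is_ref) {
    toy_zval *z = malloc(sizeof *z);
    assert(z != NULL);
    z->value = value;
    z->refcount = refcount;
    z->is_ref = is_ref;
    return z;
}

/* Rough analogue of SEPARATE_ZVAL_IF_NOT_REF: copy on write, unless is_ref. */
static void toy_separate_if_not_ref(toy_zval **slot) {
    toy_zval *old = *slot;
    if (!old->is_ref && old->refcount > 1) {
        old->refcount--;
        *slot = toy_new(old->value, 1, 0);
    }
}

/* Rough analogue of the PHP 5 handler: the result temp is just a
 * pointer to the variable's own zval, with the refcount bumped. */
static toy_zval *toy_pre_inc_php5(toy_zval **var) {
    toy_separate_if_not_ref(var);
    (*var)->value++;
    (*var)->refcount++;  /* the "lock" taken on behalf of the result temp */
    return *var;
}

/* Models ++$i + ++$i after $j = $i (by_ref = 0) or $j =& $i (by_ref = 1).
 * Writes the final refcount of $i's zval to *i_refcount. */
long toy_php5_sum(int by_ref, int *i_refcount) {
    toy_zval *i = toy_new(2, 2, by_ref);  /* one zval shared by $i and $j */
    toy_zval *j = i;
    toy_zval *t1 = toy_pre_inc_php5(&i);
    toy_zval *t2 = toy_pre_inc_php5(&i);
    long sum = t1->value + t2->value;
    *i_refcount = i->refcount;

    /* Free each distinct zval exactly once (t2 always aliases i). */
    if (j != t1 && j != t2) free(j);
    if (t1 != t2) free(t1);
    free(t2);
    return sum;
}
```

In this model, toy_php5_sum(0, &rc) returns 7 with rc equal to 2, and toy_php5_sum(1, &rc) returns 8 with rc equal to 4, which lines up with the refcounts phpdbg reported in the two cases. In the by-reference case both temporaries point at the one zval, so the first result is overwritten by the second increment before the addition reads it.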
PHP 7
So what has changed in PHP 7? A few things!
Firstly, the phpdbg output is rather boring, because integers are no longer reference counted in PHP 7; instead, a reference assignment creates an extra pointer, which itself has a refcount of 1, to the same address in memory, which is the actual integer. The phpdbg output looks like this:
Address         Refs  Type     Variable
0x7f175ca660e8  1     integer  &$i
int (2)
0x7f175ca660e8  1     integer  &$j
int (2)
Secondly, there is a special code path in the source for integers:
if (EXPECTED(Z_TYPE_P(var_ptr) == IS_LONG)) {
    fast_long_increment_function(var_ptr);
    if (UNEXPECTED(RETURN_VALUE_USED(opline))) {
        ZVAL_COPY_VALUE(EX_VAR(opline->result.var), var_ptr);
    }
    ZEND_VM_NEXT_OPCODE();
}
So, if the variable is an integer (IS_LONG), and not a reference to an integer (IS_REFERENCE), we can simply increment it in place. If we then need the return value, we can copy its value into the result (ZVAL_COPY_VALUE).
If it is a reference, we will not hit that code, but rather than keeping references bound together, we have these two lines:
ZVAL_DEREF(var_ptr);
SEPARATE_ZVAL_NOREF(var_ptr);
The first line says: "if it is a reference, follow it to its target"; this takes us from our "reference to an integer" to the integer itself. The second, I think, says: "if it is something refcounted and has more than one reference, create a copy of it"; in our case, this will do nothing, because the integer does not care about refcounts.
So now we have an integer that we can increment, which will affect all by-reference associations, but not by-value ones for refcounted types. Finally, if we want the return value of the increment, we again copy it rather than just assigning it; and this time with a slightly different macro, which will increase the refcount of our new zval if necessary:
ZVAL_COPY(EX_VAR(opline->result.var), var_ptr);
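The PHP 7 flow described above can be sketched with another toy model in C. Again, this is an illustration under assumptions, not the engine's code: toy_slot, toy_reference and toy_pre_inc_php7 are hypothetical names, and only integers are modelled, so the refcounting done by ZVAL_COPY for other types is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for PHP 7 values; NOT the real Zend structures. */
typedef enum { TOY_IS_LONG, TOY_IS_REFERENCE } toy_type;

typedef struct {
    long lval;               /* the integer shared by all bound variables */
} toy_reference;

typedef struct {
    toy_type type;
    long lval;               /* used when type == TOY_IS_LONG */
    toy_reference *ref;      /* used when type == TOY_IS_REFERENCE */
} toy_slot;

/* Rough analogue of the PHP 7 handler: dereference, increment in place,
 * then copy the VALUE into the result slot (no pointer is shared). */
static void toy_pre_inc_php7(toy_slot *var, toy_slot *result) {
    long *target = &var->lval;
    if (var->type == TOY_IS_REFERENCE) {  /* ZVAL_DEREF-like step */
        assert(var->ref != NULL);
        target = &var->ref->lval;
    }
    (*target)++;
    result->type = TOY_IS_LONG;
    result->lval = *target;
    result->ref = NULL;
}

/* Models ++$i + ++$i after $j = $i (by_ref = 0) or $j =& $i (by_ref = 1).
 * Writes the value visible through $j afterwards to *seen_through_j. */
long toy_php7_sum(int by_ref, long *seen_through_j) {
    toy_reference shared = { 2 };
    toy_slot i = { TOY_IS_LONG, 2, NULL };
    toy_slot j = { TOY_IS_LONG, 2, NULL };
    toy_slot t1, t2;
    if (by_ref) {
        i.type = TOY_IS_REFERENCE; i.ref = &shared;
        j.type = TOY_IS_REFERENCE; j.ref = &shared;
    }
    toy_pre_inc_php7(&i, &t1);
    toy_pre_inc_php7(&i, &t2);
    *seen_through_j = by_ref ? j.ref->lval : j.lval;
    return t1.lval + t2.lval;
}
```

In this model the sum is 7 in both cases, because each result slot holds its own copy of the value at the time of the increment. The only difference the reference makes is that $j sees 4 afterwards instead of 2.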