In Perl, when assigning the return value of a subroutine to a variable, is the data duplicated in memory?

    sub foo {
        my @return_value = (1, 2);
    }

    my @receiver = foo();

Is this assignment like any other assignment in Perl? Is the array duplicated in memory? I doubt it, because the array inside the subroutine is disposable, so duplicating it is completely redundant. It would make sense to just "link" the array to @receiver as an optimization.

By the way, I noticed a similar question, "Perl: does the function return a reference or a copy?", but it didn't give me the answer I wanted.

I'm talking about Perl 5.

P.S. Are there any books or materials on such topics about Perl?

+5
2 answers

Scalars returned by :lvalue subs are not copied.

Scalars returned by XS subs are not copied.

Scalars returned by functions (named operators) are not copied.

Scalars returned by other subs are copied.

But that is before any assignment comes into play. If you assign the returned values to a variable, you will copy them (again, in the case of a normal Perl sub).

This means that my $y = sub { $x }->(); copies $x twice!

But this does not really matter due to optimization.


Let's start with an example where they are not copied.

    $ perl -le'
       sub f :lvalue { my $x = 123; print \$x; $x }
       my $r = \f();
       print $r;
    '
    SCALAR(0x465eb48)   # $x
    SCALAR(0x465eb48)   # The scalar on the stack

But if you remove :lvalue ...

    $ perl -le'
       sub f { my $x = 123; print \$x; $x }
       my $r = \f();
       print $r;
    '
    SCALAR(0x17d0918)   # $x
    SCALAR(0x17b1ec0)   # The scalar on the stack

Even worse, you usually follow up by assigning the scalar to a variable, so a second copy occurs.

    $ perl -le'
       sub f { my $x = 123; print \$x; $x }
       my $r = \f();     # \
       print $r;         #  > my $y = f();
       my $y = $$r;      # /
       print \$y;
    '
    SCALAR(0x1802958)   # $x
    SCALAR(0x17e3eb0)   # The scalar on the stack
    SCALAR(0x18028f8)   # $y

On the plus side, the assignment is optimized to minimize the cost of copying strings.

XS subs and functions (named operators) typically return mortal ("TEMP") scalars. These are scalars "on death row": they will be destroyed automatically unless something steps in to claim a reference to them.

In older versions of Perl (<5.20), assigning a mortal string to another scalar transfers ownership of the string buffer, which avoids copying the buffer. For example, my $y = lc($x); does not copy the string created by lc; only the string pointer is copied.

    $ perl -MDevel::Peek -e'my $s = "abc"; Dump($s); $s = lc($s); Dump($s);'
    SV = PV(0x1705840) at 0x1723768
      REFCNT = 1
      FLAGS = (PADMY,POK,IsCOW,pPOK)
      PV = 0x172d4c0 "abc"\0
      CUR = 3
      LEN = 10
      COW_REFCNT = 1
    SV = PV(0x1705840) at 0x1723768
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK)
      PV = 0x1730070 "abc"\0      <-- Note the change of address from stealing
      CUR = 3                         the buffer from the scalar returned by lc.
      LEN = 10

In newer versions of Perl (≥5.20), the assignment operator never [1] copies the string buffer. Instead, newer versions of Perl use a copy-on-write ("COW") mechanism.

    $ perl -MDevel::Peek -e'my $x = "abc"; my $y = $x; Dump($x); Dump($y);'
    SV = PV(0x26b0530) at 0x26ce230
      REFCNT = 1
      FLAGS = (POK,IsCOW,pPOK)
      PV = 0x26d68a0 "abc"\0 <----+
      CUR = 3                     |
      LEN = 10                    |
      COW_REFCNT = 2              +-- Same buffer (0x26d68a0)
    SV = PV(0x26b05c0) at 0x26ce248 |
      REFCNT = 1                  |
      FLAGS = (POK,IsCOW,pPOK)    |
      PV = 0x26d68a0 "abc"\0 <----+
      CUR = 3
      LEN = 10
      COW_REFCNT = 2

OK, so far I have only talked about scalars. That is because subs and functions can only return scalars [2].

In your example, the scalars assigned to @return_value would be returned [3], copied, and then copied into @receiver a second time.
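The effect can be observed with a minimal sketch like the one below. It assumes an explicit return of the array and uses a helper variable, @inner_addrs, which is purely hypothetical and exists only to remember the addresses of the sub's own elements. It shows only that the scalars ending up in @receiver are distinct from the ones inside the sub, not how many intermediate copies were made.

```perl
use strict;
use warnings;

my @inner_addrs;   # hypothetical helper: references to the sub's own elements

sub foo {
    my @return_value = (1, 2);
    # Keep a reference to each element so its address stays valid for comparison.
    @inner_addrs = map { \$_ } @return_value;
    return @return_value;
}

my @receiver    = foo();
my @outer_addrs = map { \$_ } @receiver;

for my $i (0 .. $#receiver) {
    # References compare numerically by address.
    my $verdict = $inner_addrs[$i] == $outer_addrs[$i] ? "same" : "different";
    print "element $i: $verdict scalar\n";
}
```

Because the saved references keep the original elements alive, their addresses cannot be reused, so each comparison reports a different scalar.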

You can avoid all this by returning an array reference.

    sub f { my @fizbobs = ...; \@fizbobs }
    my $fizbobs = f();

The only thing copied there is a reference, the simplest non-undefined scalar.
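As a rough check of that claim, the sketch below compares the reference seen inside the sub with the one the caller receives. The placeholder data (1 .. 5) and the $inner_ref variable are assumptions added for illustration; they are not part of the original example.

```perl
use strict;
use warnings;

my $inner_ref;   # hypothetical: remembers the reference taken inside the sub

sub f {
    my @fizbobs = (1 .. 5);   # placeholder data, assumed for this sketch
    $inner_ref = \@fizbobs;
    return \@fizbobs;
}

my $fizbobs = f();

# Both references point at the very same array; no element was copied.
print "same array\n" if $fizbobs == $inner_ref;
print "elements: @$fizbobs\n";
```

The array survives the end of the sub because the returned reference keeps its reference count above zero.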


  1. Well, maybe not never. I think there needs to be a free byte in the string buffer to hold the COW count.

  2. In list context, they can return 0, 1, or many scalars, but they can only return scalars.

  3. The last operator of your sub is a list assignment operator. In list context, the list assignment operator returns the scalars to which its left-hand side (LHS) evaluates. For more information, see "Scalar vs List Assignment Operator".
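The behaviour described in the last note can be seen in a small sketch. The sub bar and its values are hypothetical, added only because a one-element LHS makes the difference between the two contexts visible.

```perl
use strict;
use warnings;

# The sub from the question: its last operator is a list assignment.
sub foo { my @return_value = (1, 2); }

# Hypothetical sub whose LHS holds fewer scalars than the RHS provides.
sub bar { my ($first) = (10, 20, 30); }

my @foo_list  = foo();   # list context: the scalars of the LHS
my @bar_list  = bar();   # list context: only $first
my $bar_count = bar();   # scalar context: number of RHS elements

print "foo in list context:   @foo_list\n";
print "bar in list context:   @bar_list\n";
print "bar in scalar context: $bar_count\n";
```

In scalar context a list assignment evaluates to the number of elements on its right-hand side, which is why the last line reports 3 rather than 10.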

+6

A subroutine returns the result of its last operation if you do not specify an explicit return.

@return_value is created separately from @receiver, the values are copied, and the memory used by @return_value is freed when it goes out of scope as the subroutine exits.

So, yes - the used memory is duplicated.

If you desperately want to avoid this, you can create an anonymous array once and "pass" a reference to it around:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;

    sub foo {
        my $anon_array_ref = [ 1, 2 ];
        return $anon_array_ref;
    }

    my $results_from_foo = foo();
    print Dumper $results_from_foo;

This will usually be a premature optimization unless you know you are dealing with really large data structures.

Note: you should probably add an explicit return to your sub after the assignment, as it is good practice to make clear what you are doing.
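A minimal sketch of what that could look like, assuming the intent is to hand back the contents of the array:

```perl
use strict;
use warnings;

sub foo {
    my @return_value = (1, 2);
    return @return_value;   # explicit: the caller receives the array's elements
}

my @receiver = foo();
print "@receiver\n";
```

Be aware that a bare return; with no expression returns an empty list in list context, so the value has to be named if the caller is meant to receive it.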

+3

Source: https://habr.com/ru/post/1273940/

