Perl split: interesting behavior

Can someone explain this weird behavior?

I have a path in a string and I want to split it on every backslash:

    my $path = "D:\Folder\AnotherFolder\file.txt";
    my @folders = split('\', $path);

This does not work, and it still fails when I escape the backslash:

 my @folders = split('\\', $path); 

But with a regex, it works:

 my @folders = split( /\\/, $path); 

Why is that?

4 answers

I think amon gave the best literal answer to your question in his comment:

More explicitly: strings and regular expressions have different escaping rules. If a string is used in place of a regular expression, the string literal is subject to double escaping.

That is, split '\\' uses a string, and split /\\/ uses a regular expression.
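The two layers of escaping can be made visible by checking how many characters each literal actually holds. This is a minimal sketch; the variable names are only illustrative:

```perl
use strict;
use warnings;

# In a single-quoted string, \\ collapses to ONE backslash.
my $one = '\\';      # one character:  \
my $two = '\\\\';    # two characters: \\

printf "%s has length %d\n", $one, length $one;
printf "%s has length %d\n", $two, length $two;

# A regex literal has no string layer in front of it:
# /\\/ is the two-character pattern \\ and matches one literal backslash.
my $re      = qr/\\/;
my $matches = ( 'a\\b' =~ $re ) ? 1 : 0;    # 'a\\b' is the 3 characters a\b
print "regex matched: $matches\n";
```

So a string passed to split must still contain two backslashes after the string literal has been processed, because the regex engine applies its own escaping on top.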

As a practical answer, I would like to add this:

Perhaps you should consider using a module designed for splitting paths. File::Spec is a core module in Perl 5. In addition, backslashes inside a double-quoted string need to be escaped, which you haven't done yet. You can also use single quotes, which look a little better in my opinion.

    use strict;
    use warnings;
    use Data::Dumper;
    use File::Spec;

    my $path = 'D:\Folder\AnotherFolder\file.txt';   # note the single quotes
    my @elements = File::Spec->splitdir($path);
    print Dumper \@elements;

Output:

    $VAR1 = [
              'D:',
              'Folder',
              'AnotherFolder',
              'file.txt'
            ];
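One caveat: File::Spec dispatches to a platform-specific module, so the output above is what you get on Windows. On a Unix-like system, File::Spec->splitdir treats only the forward slash as a separator. If the script may run elsewhere but the paths are always Windows paths, a possible sketch is to call File::Spec::Win32 directly (it ships alongside File::Spec):

```perl
use strict;
use warnings;
use File::Spec::Win32;

my $path = 'D:\Folder\AnotherFolder\file.txt';

# Apply Windows path rules regardless of the OS the script runs on.
my @elements = File::Spec::Win32->splitdir($path);

print join( '|', @elements ), "\n";
```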

If you look at the documentation by running:

 perldoc -f split 

you will see the three forms of arguments that split accepts:

    split /PATTERN/,EXPR,LIMIT
    split /PATTERN/,EXPR
    split /PATTERN/

This means that even when you pass split a string as its first argument, Perl treats it as a regular expression.

If we look at the error we get when we try something like this in re.pl:

    $ my $string_with_backslashes = "Hello\\there\\friend";
    Hello\there\friend
    $ my @arry = split('\\', $string_with_backslashes);
    Compile error: Trailing \ in regex m/\/ at (eval 287) line 6.

we see that '\\' is first processed as a string literal: a backslash escape followed by an actual backslash, which evaluates to a single backslash.

split then takes that single backslash and coerces it to a regex, as if we had written:

 $ my @arry = split(/\/, $string_with_backslashes); 

which does not work: the single backslash is read as escaping the forward slash after it, so no closing / is left to mark the end of the regular expression.
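The same failure can be reproduced outside re.pl. In this sketch the separator is held in a variable (the name $separator is just for illustration), so the pattern is compiled at run time and the error can be trapped with eval instead of aborting compilation:

```perl
use strict;
use warnings;

my $string_with_backslashes = 'Hello\\there\\friend';   # Hello\there\friend
my $separator               = '\\';                      # a single backslash

# split compiles $separator as a regex; a lone backslash is not a valid pattern,
# so the call dies and eval returns an empty list.
my @parts = eval { split $separator, $string_with_backslashes };
my $error = $@;

print "split failed: $error" if $error;
```

The message should mention the same "Trailing \ in regex" problem shown in the session above.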


One of the simpler ways to extract path elements is to extract all sequences of characters other than the path separator.

    use strict;
    use warnings;

    my $path = 'D:\Folder\AnotherFolder\file.txt';
    my @path = $path =~ m([^/\\]+)g;
    print "$_\n" for @path;

Output:

    D:
    Folder
    AnotherFolder
    file.txt

When split is used in the form split STRING rather than split REGEX, the string is converted to a regular expression. In your case, split '\\' is converted to split /\/, because the first backslash in the string literal is treated as an escape character.

The correct way to do this is split '\\\\' , which will be translated to split /\\/ .
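A short sketch comparing the forms that do work. The quotemeta variant is an addition not mentioned above; it lets Perl do the regex-level escaping for you:

```perl
use strict;
use warnings;

my $path = 'D:\Folder\AnotherFolder\file.txt';

# String form: '\\\\' holds two backslashes, which become the regex \\
my @from_string = split( '\\\\', $path );

# Regex form: only the regex layer of escaping applies
my @from_regex = split( /\\/, $path );

# quotemeta turns the one-character string \ into the two-character pattern \\
my @from_quotemeta = split( quotemeta('\\'), $path );

print join( '|', @$_ ), "\n" for \@from_string, \@from_regex, \@from_quotemeta;
```

All three calls split on a single literal backslash and should give the same list.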


Source: https://habr.com/ru/post/954940/

