The subpattern (?<name> .+?)\s+ in your regex means "match and remember one or more characters other than newline, but stop as soon as you find whitespace," so $name contains TEST because the pattern stops matching when it sees the space in front of Box.
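To see the early stop in isolation, here is a minimal sketch. The sample line and the leading (?<id> \d+) part are assumptions modeled on the VM name in your question, not your actual input or regex.

```perl
use strict;
use warnings;

# Hypothetical sample line; the real getallvms output may differ.
my $line = "416 TEST Box [Store] TEST Box/TEST Box.vmx slesGuest vmx-04";

my $captured;
if ($line =~ /^(?<id> \d+) \s+ (?<name> .+?) \s+/x) {
    # .+? grows one character at a time and stops at the first whitespace
    $captured = $+{name};
    print "name = '$captured'\n";   # prints: name = 'TEST'
}
```

Nothing after the trailing \s+ forces the match to continue, so the non-greedy capture is satisfied as soon as it reaches the first space.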
The VI Toolkit wiki gives example output from the getallvms subcommand:
# vmware-vim-cmd -H 10.10.10.10 -U root -P password /vmsvc/getallvms
Vmid  Name     File                         Guest OS         Version  Annotation
64    bartPE   [store] BartPE/BartPE.vmx    winXPProGuest    vmx-04
96    trustix  [store] Trustix/Trustix.vmx  otherLinuxGuest  vmx-04
The letter case differs slightly from the example in your question, but it looks like we can use [store] as a bumper for the match:
/^(?<id> \d+) \s+ (?<name> .+?) \s+ \[store]/mix
The non-greedy quantifier +? means match one or more, but hand control to the rest of the pattern as soon as possible. Remember that [ has special meaning in regular expressions, but the pattern \[ matches a literal left bracket rather than introducing a character class.
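As a quick check, the sketch below runs that pattern over the wiki's sample lines plus one line with a space in the name. The last line is an assumption modeled on your question, and @found is a name introduced only for this illustration.

```perl
use strict;
use warnings;

my @lines = (
    "Vmid Name File Guest OS Version Annotation",
    "64 bartPE [store] BartPE/BartPE.vmx winXPProGuest vmx-04",
    "96 trustix [store] Trustix/Trustix.vmx otherLinuxGuest vmx-04",
    # Hypothetical line with a space in the VM name
    "416 TEST Box [Store] TEST Box/TEST Box.vmx slesGuest vmx-04",
);

my @found;
for (@lines) {
    # /i lets [store] match [Store]; /x lets us space out the pattern
    if (/^(?<id> \d+) \s+ (?<name> .+?) \s+ \[store]/mix) {
        push @found, join(":", $+{id}, $+{name});
    }
}

print "$_\n" for @found;
# prints:
# 64:bartPE
# 96:trustix
# 416:TEST Box
```

The header line is skipped because it does not start with digits, and the name with a space survives because .+? has to keep stretching until \[store] can match.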
I think of this technique as bracketing or bookending. If you want to extract a chunk of text that is difficult to characterize, look for surrounding features that are easy to match, often as simple as ^ or $. Then use a stretchy pattern to capture everything in between, usually (.+) or (.+?). Read the "Quantifiers" section of the perlre documentation for an explanation of the many options.
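The same idea works with $ as the far bumper. The sketch below pulls out the file path by bracketing it between [store] and the end of the line. It assumes the line ends with exactly two whitespace-free fields (guest OS and version) and has no annotation; the sample line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical line; assumes no Annotation column is present.
my $line = "416 TEST Box [Store] TEST Box/TEST Box.vmx slesGuest vmx-04";

my $file;
if ($line =~ / \[store] \s+ (?<file> .+?) \s+ \S+ \s+ \S+ $/ix) {
    # Everything between the bumper and the last two fields
    $file = $+{file};
    print "file = '$file'\n";   # prints: file = 'TEST Box/TEST Box.vmx'
}
```

The path itself is hard to describe because it may contain spaces, but the two trailing fields and the end of line are easy, so they do the work.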
That fixes the immediate problem, and you can also add polish in several areas.
Do not use $1, $2, and friends unconditionally! Always check that the match succeeded before using capture variables. For instance:
if (/(foo|bar|baz)/) {
  print "got $1\n";
}
else {
  print "no match\n";
}
An unguarded print $1 can produce surprising results that are difficult to debug.
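One way this bites: a failed match leaves the capture variables untouched, so $1 may still hold the value from an earlier successful match. A minimal sketch, with made-up inputs:

```perl
use strict;
use warnings;

my $re = qr/(foo|bar|baz)/;

my $good = "foo";
my $bad  = "quux";

$good =~ $re;        # succeeds and sets $1 to "foo"
$bad  =~ $re;        # fails, so $1 is left untouched
my $stale = $1;      # still "foo", although "quux" did not match

# Guarded version: only look at $1 when the match succeeded
my $report = ($bad =~ $re) ? "got $1" : "no match";

print "unguarded: $stale\n";    # prints: unguarded: foo
print "guarded: $report\n";     # prints: guarded: no match
```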
Judicious use of Perl's defaults can help highlight the computation and let the mechanism fade into the background. Dropping $vm in favor of $_ as both the implicit loop variable and the implicit match target makes for a nicer result.
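For comparison, here are both styles side by side in a small sketch. The input lines are invented for the illustration; only the variable name $vm comes from your code.

```perl
use strict;
use warnings;

my @lines = ("64 bartPE", "no id here", "96 trustix");

# Explicit variable: every match has to name it.
my @explicit;
for my $vm (@lines) {
    push @explicit, $1 if $vm =~ /^(\d+)/;
}

# Implicit $_: the loop and the match both default to it.
my @implicit;
for (@lines) {
    push @implicit, $1 if /^(\d+)/;
}

print "@implicit\n";   # prints: 64 96
```

Both loops collect the same IDs; the second simply has less plumbing in view.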
Your comments merely translate the Perl into English. The most useful comments explain why, not what. Also keep in mind Rob Pike's advice on comments:
If your code needs a comment to be understood, it would be better to rewrite it so it's easier to understand.
In the assignments from %+, the quotation marks do nothing useful. The values are already strings, so drop the quotes:
my $id = $+{id};
my $name = $+{name};
Below is a modified version of your code that captures everything after the number but before [store] in $name. The utf8 pragma declares that your source code (not, as is commonly mistaken, your input) contains UTF-8. The test below uses a canned echo to simulate the output of vim-cmd for the Swedish VM.
As Tom suggested, I use the Encode module to decode the output that comes through the SSH connection and to encode it for the local host before printing it.
The perlunifaq documentation recommends decoding external data into Perl's internal format on the way in, and encoding any output immediately before writing it. I assume that the value returned from $ssh->capture(...) is UTF-8 encoded, i.e. that the remote host sends UTF-8. We see the expected result because I am running a modern Linux distribution and ssh-ing back into it, but in the wild you may have to deal with some other encoding.
You can get away with skipping the calls to decode and encode here, because Perl's internal format happens to match the encodings you are using. In general, however, cutting corners can cause you problems.
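The decode-on-input, encode-on-output round trip can be sketched without an SSH connection. Here the hard-coded octets stand in for what $ssh->capture(...) might return, which is an assumption. Note the LEAVE_SRC flag: with a CHECK argument such as FB_CROAK alone, Encode may modify the source string in place.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 octets for "åäö", standing in for bytes read from the remote host
my $octets   = "\xC3\xA5\xC3\xA4\xC3\xB6";
my $n_octets = length $octets;              # 6 octets

# Decode on the way in; croak on malformed input, keep $octets intact
my $chars = decode("UTF-8", $octets,
                   Encode::FB_CROAK() | Encode::LEAVE_SRC());
my $n_chars = length $chars;                # 3 characters

# Encode again just before output
my $out = encode("UTF-8", $chars);

print "$n_octets octets, $n_chars characters\n";   # prints: 6 octets, 3 characters
print $out, "\n";
```

Between decode and encode, regexes and length see characters rather than bytes, which is what you want when a name contains å, ä, or ö.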
Finally, the code!
#! /usr/bin/env perl

use strict;
use utf8;
use warnings;

use Encode;
use Net::OpenSSH;

my %ssh_options = ();
my $ssh = Net::OpenSSH->new('localhost', %ssh_options);

# Create an array and capture the ESX\ESXi output from the current server
#my @getallvms = $ssh->capture('vim-cmd vmsvc/getallvms');
my @getallvms = $ssh->capture(<<EOEcho);
echo -e 'JUNK\n416 TEST Box åäö!"'\\'\\''*# [Store] TEST Box +w6XDpMO2IQ-_''_+Iw/TEST Box +w6XDpMO2IQ _''_+Iw.vmx slesGuest vmx-04'
EOEcho

shift @getallvms;

for (@getallvms) {
  $_ = decode "utf8", $_, Encode::FB_CROAK;

  if (/^(?<id> \d+) \s+ (?<name> .+?) \s+ \[store]/mix) {
    my $id   = $+{id};
    my $name = $+{name};
    print encode("utf8", $id),   "\n",
          encode("utf8", $name), "\n",
          "\n";
  }
  else {
    print "no match\n";
  }
}
Output:
416
TEST Box åäö!"''*#