RegEx function for command line parsing without using a library

I would like to split a string using a space as my separator, but if several words are enclosed in double or single quotes, I would like them to be returned as a single element.

For example, if the input line is:

CALL "C:\My File Name With Space" /P1 P1Value /P1 P2Value

The output array will be:

 Array[0]=CALL
 Array[1]=C:\My File Name With Space
 Array[2]=/P1
 Array[3]=P1Value
 Array[4]=/P1
 Array[5]=P2Value

How can I do this with regular expressions? I understand that command-line parsers exist. I glanced at a popular one, but it could not handle the situation where several parameters share the same name. In any case, rather than learning to use a command-line parsing library (I'll leave that for another day), I'm interested in learning more about RegEx.

How would you use the RegEx function to parse it?

+4
3 answers

The link in Jim Michelle's comment points out that the Win32 API provides a function for this, CommandLineToArgvW. I would recommend using it for consistency with how Windows itself splits command lines. Here's a sample (from PInvoke).

    // Requires: using System; using System.ComponentModel; using System.Runtime.InteropServices;

    static string[] SplitArgs(string unsplitArgumentLine)
    {
        int numberOfArgs;
        IntPtr ptrToSplitArgs;
        string[] splitArgs;

        ptrToSplitArgs = CommandLineToArgvW(unsplitArgumentLine, out numberOfArgs);
        if (ptrToSplitArgs == IntPtr.Zero)
            throw new ArgumentException("Unable to split argument.", new Win32Exception());

        try
        {
            splitArgs = new string[numberOfArgs];
            // The result is an array of pointers to UTF-16 strings; copy each one out.
            for (int i = 0; i < numberOfArgs; i++)
                splitArgs[i] = Marshal.PtrToStringUni(
                    Marshal.ReadIntPtr(ptrToSplitArgs, i * IntPtr.Size));
            return splitArgs;
        }
        finally
        {
            // The array is a single allocation that the caller must release with LocalFree.
            LocalFree(ptrToSplitArgs);
        }
    }

    [DllImport("shell32.dll", SetLastError = true)]
    static extern IntPtr CommandLineToArgvW(
        [MarshalAs(UnmanagedType.LPWStr)] string lpCmdLine,
        out int pNumArgs);

    [DllImport("kernel32.dll")]
    static extern IntPtr LocalFree(IntPtr hMem);

If you want a quick-and-dirty, inflexible, fragile regex solution, you can do something like this:

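    // Requires: using System.Linq; using System.Text.RegularExpressions;
    var rex = new Regex(@"("".*?""|[^ ""]+)+");
    string test = "CALL \"C:\\My File Name With Space\" /P1 P1Value /P1 P2Value";
    // Each match is one argument; a quoted argument keeps its surrounding double quotes.
    var array = rex.Matches(test).OfType<Match>().Select(m => m.Groups[0].Value).ToArray();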
+10

I would not do this with Regex for the various reasons shown above.

If I had to, this would meet your simple requirements:

 (".*?")|([^ ]+) 

However, this does not handle:

  • Escaped quotes
  • Single quotes
  • Non-ASCII quotes (don't you think people will paste smart quotes from Word into your file?)
  • Combinations of the above

And that's just off the top of my head.
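
To illustrate one of the gaps listed above, here is a minimal sketch of a pattern that accepts both double and single quotes and returns the text without the surrounding quotes. The class name ArgSplitter and the group names dq, sq and bare are made up for this example. It still does not handle escaped quotes or smart quotes, and a stray unmatched quote is silently dropped, so treat it as a learning aid rather than a robust parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
static class ArgSplitter
{
    // Three alternatives, tried in order at each position:
    //   "..."  -> captured without the quotes in group "dq"
    //   '...'  -> captured without the quotes in group "sq"
    //   a run of characters that are neither whitespace nor quotes -> group "bare"
    private static readonly Regex Token = new Regex(
        @"""(?<dq>[^""]*)""|'(?<sq>[^']*)'|(?<bare>[^\s""']+)");

    public static List<string> Split(string line)
    {
        var result = new List<string>();
        foreach (Match m in Token.Matches(line))
        {
            // Exactly one of the three groups takes part in each match.
            if (m.Groups["dq"].Success)
                result.Add(m.Groups["dq"].Value);
            else if (m.Groups["sq"].Success)
                result.Add(m.Groups["sq"].Value);
            else
                result.Add(m.Groups["bare"].Value);
        }
        return result;
    }
}
```

Calling ArgSplitter.Split on the line CALL "C:\My File Name With Space" /P1 P1Value /P1 'P2 Value' should give six elements: CALL, C:\My File Name With Space, /P1, P1Value, /P1 and P2 Value. Note that a token with a quote in the middle, such as /P1="a b", is split into /P1= and a b by this sketch.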

+1

@chad Henderson, you forgot to include single quotes, and this also has the problem of capturing everything that comes before a set of quotes.

Here is a fix that includes single quotes, but it still shows the problem with the extra capture before the quote: http://regexhero.net/tester/?id=81cebbb2-5548-4973-be19-b508f14c3348
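
One plausible reading of the "capture before the quotes" problem is what happens when a quote starts in the middle of a token. The sketch below runs a pattern over an input and returns the raw matches so the two patterns from this thread can be compared. The class name QuoteCaptureDemo and the sample input are invented for illustration, and the pattern saved behind the regexhero link is not reproduced here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
static class QuoteCaptureDemo
{
    // Returns the full text of every match of `pattern` in `input`, in order.
    public static string[] Tokens(string pattern, string input) =>
        Regex.Matches(input, pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();
}

// With the input  /name="John Smith"
//
// Tokens(@"("".*?"")|([^ ]+)", input)
//   -> two tokens:  /name="John   and   Smith"
//   because [^ ]+ swallows the text before the quote together with the quote itself.
//
// Tokens(@"("".*?""|[^ ""]+)+", input)
//   -> one token:   /name="John Smith"   (quotes retained)
//   because [^ ""]+ stops at the quote and the quoted alternative then takes over.
```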

+1

Source: https://habr.com/ru/post/1485674/