Extract numbers from String

I need to parse a String to create a PathSegmentCollection. A line consists of numbers separated by commas and/or any whitespace (for example, a new line, a tab, etc.), and the numbers can also be written in scientific notation.

This is an example: "9.63074,9.63074 -5.55708e-006 0 ,0 1477.78"

And the points: P1 (9.63074, 9.63074), P2 (-0.00000555708, 0), P3 (0, 1477.78)

To extract numbers, I use a regex:

    Dim RgxDouble As New Regex("[+-]?\b[0-9]+(\.[0-9]+)?(e[+-]?[0-9]+)?\b")
    Dim Matches As MatchCollection = RgxDouble.Matches(.Value)
    Dim PSegmentColl As New PathSegmentCollection
    Dim PFigure As New PathFigure
    With Matches
        If .Count < 2 OrElse .Count Mod 2 <> 0 Then Exit Sub
        PFigure.StartPoint = New Point(.Item(0).Value, .Item(1).Value)
        For i As UInteger = 2 To .Count - 1 Step 2
            Dim x As Double = .Item(i).Value, y As Double = .Item(i + 1).Value
            PSegmentColl.Add(New LineSegment With {.Point = New Point(x, y)})
        Next
    End With

This works, but I have to parse a hundred thousand lines or more, and this approach is too slow. I am looking for a more efficient solution. Two things to keep in mind: in most cases the numbers are not written in scientific notation, and, if you think it is the best way, I have no problem using an assembly written in C++/CLI that calls unmanaged C/C++ code, or unsafe C# code.

1 answer

Why are you trying to parse the path markup syntax yourself? It is a complex thing and may well be changed (or at least extended) in the future. WPF can do this for you: http://msdn.microsoft.com/en-us/library/system.windows.media.geometry.parse.aspx , so it is better to let the framework handle it.


Edit:
If parsing is your bottleneck, you can try parsing the string yourself. I would recommend trying the following and checking whether it is fast enough:

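    // Requires: using System; using System.Globalization;
    // Comma plus the usual whitespace characters; should be created only once.
    char[] separators = new char[] { ' ', ',', '\t', '\r', '\n' };
    var parts = pattern.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    double firstInPair = 0.0;
    for (int i = 0; i < parts.Length; i++)
    {
        // Invariant culture, so '.' is always read as the decimal separator.
        double number = double.Parse(parts[i], CultureInfo.InvariantCulture);
        if (i % 2 == 0)
        {
            firstInPair = number;
            continue;
        }
        double secondInPair = number;
        // do whatever you want with the pair (firstInPair, secondInPair)
        ...
    }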
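
If splitting the string is still too slow, one further option is to scan each line once and parse every token in place, which avoids allocating an array of substrings. The following is only a minimal sketch under stated assumptions: it needs .NET Core 2.1 or later (for the `double.Parse` overload that accepts a `ReadOnlySpan<char>`) and C# 7 tuples, and `PointListParser` and `ParsePairs` are hypothetical names, not part of WPF. Whether it is actually faster has to be measured on real data.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class PointListParser
{
    // Hypothetical helper: returns the numbers of one line as (x, y) pairs.
    public static List<(double X, double Y)> ParsePairs(string line)
    {
        var pairs = new List<(double X, double Y)>();
        ReadOnlySpan<char> text = line.AsSpan();
        double first = 0.0;
        bool haveFirst = false;
        int pos = 0;

        while (pos < text.Length)
        {
            // Skip separators: commas and any whitespace (space, tab, new line).
            while (pos < text.Length && IsSeparator(text[pos])) pos++;
            if (pos == text.Length) break;

            // Find the end of the current token.
            int start = pos;
            while (pos < text.Length && !IsSeparator(text[pos])) pos++;

            // Parse the token in place; NumberStyles.Float also accepts exponents.
            double number = double.Parse(
                text.Slice(start, pos - start),
                NumberStyles.Float,
                CultureInfo.InvariantCulture);

            if (!haveFirst)
            {
                first = number;
                haveFirst = true;
            }
            else
            {
                pairs.Add((first, number));
                haveFirst = false;
            }
        }

        // An unpaired trailing number means the line is malformed.
        if (haveFirst) throw new FormatException("Odd number of coordinates.");
        return pairs;
    }

    private static bool IsSeparator(char c) => c == ',' || char.IsWhiteSpace(c);
}
```

Each returned pair can then be turned into a `Point` for `PathFigure.StartPoint` (the first pair) or for a `LineSegment` (the remaining pairs), as in the question's code.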

Source: https://habr.com/ru/post/913238/

