I need to parse log files containing FIX protocol messages.
Each row contains header information (timestamp, logging level, endpoint), and then the FIX payload.
I used regex to parse header information in named groups. For instance:.
<?P<datetime>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}.\d{6}) (?<process_id>\d{4}/\d{1,2})\s*(?P<logging_level>\w*)\s*(?P<endpoint>\w*)\s*
Then I come to the FIX payload (^ A is the separator between each tag), for example:
8=FIX.4.2^A9=61^A35=A...^A11=blahblah...
I need to extract certain tags from it (for example, “A” from 35 = or “blah-blah” from 11 =) and ignore all other things - basically I need to ignore anything until “35 = A” and anything after "11 = blah blah", then ignore anything after that, etc.
I know there are libraries that could parse each tag (http://source.kentyde.com/fixlib/overview), however I was hoping for a simple approach using regex here if possible, since I really only need a few tags .
Is there a good way in regex to retrieve the tags I need?
Cheers, Victor
source share