The best way to parse a text file with a nested information structure

A text file contains hundreds of records like this one (format: MT940 bank statement):

{1:F01AHHBCH110XXX0000000000}{2:I940X N2}{3:{108:XBS/091502}}{4: :20:XBS/091202/0001 :25:5887/507004-50 :28C:140/1 :60F:C0914CHF7789, :61:0912021202D36,80NTRFNONREF//0887-1202-29-941 04392579-0 LUTHY + xxx, ZUR :86:6034?60LUTHY + xxxx, ZUR vom 01.12.09 um 16:28 Karten-Nr. 2232 2579-0 :62F:C091202CHF52,2 :64:C091302CHF52,2 -} 

Each record should end up as a hash in an array, e.g.

    [{"1"=>"F01AHHBCH110XXX0000000000", "2"=>"I940X N2", "3"=>{"108"=>"XBS/091502"}, ...}]

I tried it with Treetop, but that did not seem to be the right tool: it appears to be aimed at input you want to evaluate or act on, whereas I just need to extract the information.

    grammar Mt940
      rule document
        part1:string spaces [:|/] spaces part2:document
        {
          def eval(env={})
            return part1.eval, part2.eval
          end
        }
        /
        string
        /
        '{' spaces document spaces '}' spaces
        {
          def eval(env={})
            return [document.eval]
          end
        }
      end
    end

I also tried with regex

 matches = str.scan(/\A[{]?([0-9]+)[:]?([^}]*)[}]?\Z/i) 

but it is difficult with recursion ...

How can I solve this problem?

1 answer

There are several open source MT940 parsers available in Java and PHP. You can read their source code and port one to Ruby. If you are on JRuby, you can use the Java parser directly from your Ruby code.
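If porting a full parser is more than you need, the outer `{n:...}` block structure can probably be handled by a small hand-written recursive-descent parser built on Ruby's standard `StringScanner`. The following is only a minimal sketch, not a complete MT940 implementation. The names `parse_blocks`, `parse_fields` and `parse_mt940_records` are made up for this example, and it assumes that block text never contains braces, that every record starts with `{1:`, and that field tags in block 4 look like `:20:` or `:28C:` and are preceded by whitespace.

```ruby
require 'strscan'

# Parse consecutive "{key:value}" blocks into a Hash (recursive descent).
def parse_blocks(scanner)
  blocks = {}
  while scanner.scan(/\s*\{/)
    key = scanner.scan(/[^:{}]+/) or raise ArgumentError, "missing block key at offset #{scanner.pos}"
    scanner.scan(/:/) or raise ArgumentError, "expected ':' at offset #{scanner.pos}"
    blocks[key] =
      if scanner.check(/\s*\{/)
        parse_blocks(scanner)          # nested blocks, e.g. {3:{108:...}}
      else
        scanner.scan(/[^{}]*/).strip   # plain text up to the closing brace
      end
    scanner.scan(/\s*\}/) or raise ArgumentError, "expected '}' at offset #{scanner.pos}"
  end
  blocks
end

# Split block 4 text into [tag, value] pairs. Tags such as "61" and "86"
# may repeat, so an Array of pairs is used instead of a Hash.
def parse_fields(text)
  body = text.strip.sub(/\s*-\z/, '')   # drop the trailing "-" terminator
  parts = body.split(/(?:\A|\s+):(\d{2}[A-Z]?):/)
  parts.drop(1).each_slice(2).map { |tag, value| [tag, value.to_s.strip] }
end

# Parse all records in a text; assumes each record starts with "{1:".
def parse_mt940_records(text)
  text.split(/(?=\{1:)/).map(&:strip).reject(&:empty?).map do |record|
    blocks = parse_blocks(StringScanner.new(record))
    blocks['4'] = parse_fields(blocks['4']) if blocks['4'].is_a?(String)
    blocks
  end
end

# Usage with a shortened version of the record from the question:
record = '{1:F01AHHBCH110XXX0000000000}{2:I940X N2}{3:{108:XBS/091502}}' \
         '{4: :20:XBS/091202/0001 :25:5887/507004-50 -}'
parsed = parse_mt940_records(record).first
parsed['3']  # => {"108"=>"XBS/091502"}
parsed['4']  # => [["20", "XBS/091202/0001"], ["25", "5887/507004-50"]]
```

Keys are kept as strings here. The sketch does not interpret the field contents (amounts, dates, and so on) and silently ignores anything after the last well-formed block, so a real parser would need stricter validation.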

Another option is the OFX gem, which parses OFX files. Since your file is in MT940 format, you would first need to convert it to OFX using one of the freely available converters. This approach is practical if you import the files in a batch job, etc.

Links

MT940 Java Parser

MT940 to OFX Converter 1

MT940 to OFX Converter 2


Source: https://habr.com/ru/post/910757/

