Parsing structured text data in PHP

Question

I am looking for different (better) ways to parse structured text data in PHP and turn it into a PHP object graph. I have seen many PHP parsers for various text file formats, but nearly all of them are a fragile chain of regular expressions. There must be a better way!

In this particular case, I want to parse MT940 files (bank account transactions), but I have run into the same problem with other file formats. Invariably, I end up with a large chain of regular expressions that becomes difficult to maintain, especially when it has to support several variants. MT940 has this problem too: it is not a strictly defined format, and almost every bank uses a slightly different dialect.

So, how do you design parsers that are more robust and extensible enough to deal with different dialects?

Here is an example MT940 statement, taken from this question:

{1:F01AHHBCH110XXX0000000000}{2:I940X N2}{3:{108:XBS/091502}}{4: :20:XBS/091202/0001 :25:5887/507004-50 :28C:140/1 :60F:C0914CHF7789, :61:0912021202D36,80NTRFNONREF//0887-1202-29-941 04392579-0 LUTHY + xxx, ZUR :86:6034?60LUTHY + xxxx, ZUR vom 01.12.09 um 16:28 Karten-Nr. 2232 2579-0 :62F:C091202CHF52,2 :64:C091302CHF52,2 -}

php parsing mt940

Sander marechal Mar 14 '12 at 10:37

1 answer

webbiedave · Accepted Answer · 2012-03-14T22:40:37+0000

You can use this free parser (GPL 2.0):

http://www.kingsquare.nl/php-mt940

Here is another:

http://www.butcher.art.pl/en/2010/09/tutoriale/parser-php-mt940-format-wyciagow-bankowych/

Hopefully that will save you from reinventing the wheel on this one.

So, how do you design parsers that are more robust and extensible enough to deal with different dialects?

Unfortunately, there is no easy answer to that. You will have to buckle down and familiarize yourself with all the variants you want to support. From Kingsquare's page:

The parser tries to detect which bank the file originates from by looking at the first few lines, and then loads the engine for that bank.

This will require a lot of experience and knowledge. Fortunately, their code can help you to a great extent.
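
To make the "detect the bank, then load its engine" idea concrete, here is a minimal sketch. It is not Kingsquare's actual code: the `Mt940Engine` interface, the `GenericEngine` and `ExampleBankEngine` classes, the `selectEngine()` function, the `AHHBCH` header check and the comma-padding quirk are all hypothetical names and rules invented for illustration. The shared tokenizer assumes that every field tag (`:20:`, `:60F:`, ...) sits at the start of the text or directly after whitespace.

```php
<?php
declare(strict_types=1);

/** Hypothetical contract: one engine per bank dialect. */
interface Mt940Engine
{
    public function accepts(string $raw): bool;

    /** @return array<int, array{tag: string, value: string}> fields in file order */
    public function parse(string $raw): array;
}

/** Dialect-neutral tokenizer; subclasses override only what differs. */
class GenericEngine implements Mt940Engine
{
    public function accepts(string $raw): bool
    {
        return true; // fallback engine: always willing to try
    }

    public function parse(string $raw): array
    {
        // Drop the "-}" block terminator, then split on ":NN:" / ":NNX:" tags
        // that appear at the start of the text or directly after whitespace.
        $body  = preg_replace('/\s*-\}\s*$/', '', $raw) ?? $raw;
        $parts = preg_split('/(?:^|\s):(\d{2}[A-Z]?):/', $body, -1, PREG_SPLIT_DELIM_CAPTURE) ?: [];

        $fields = [];
        // $parts[0] is the header before the first tag; after it, tags and values alternate.
        for ($i = 1; $i + 1 < count($parts); $i += 2) {
            $tag      = $parts[$i];
            $fields[] = ['tag' => $tag, 'value' => $this->normalise($tag, trim($parts[$i + 1]))];
        }
        return $fields;
    }

    /** Hook for dialect-specific clean-up; the generic engine changes nothing. */
    protected function normalise(string $tag, string $value): string
    {
        return $value;
    }
}

/** Hypothetical dialect, recognised by "AHHBCH" in the header (an invented rule). */
class ExampleBankEngine extends GenericEngine
{
    public function accepts(string $raw): bool
    {
        // Only inspect the start of the file, where the header lives.
        return strpos(substr($raw, 0, 200), 'AHHBCH') !== false;
    }

    protected function normalise(string $tag, string $value): string
    {
        // Illustrative quirk: pad a trailing decimal comma in balance fields.
        if (in_array($tag, ['60F', '62F', '64'], true) && substr($value, -1) === ',') {
            return $value . '00';
        }
        return $value;
    }
}

/** Returns the first engine that claims the file; order engines from specific to generic. */
function selectEngine(string $raw, array $engines): Mt940Engine
{
    foreach ($engines as $engine) {
        if ($engine->accepts($raw)) {
            return $engine;
        }
    }
    throw new RuntimeException('No engine accepts this statement');
}

// Usage, with an abbreviated version of the statement from the question.
$raw = '{1:F01AHHBCH110XXX0000000000}{2:I940X N2}{3:{108:XBS/091502}}'
     . '{4: :20:XBS/091202/0001 :25:5887/507004-50 :60F:C0914CHF7789, :62F:C091202CHF52,2 -}';

$engine = selectEngine($raw, [new ExampleBankEngine(), new GenericEngine()]);
foreach ($engine->parse($raw) as $field) {
    echo $field['tag'], ' => ', $field['value'], PHP_EOL;
}
```

The point of this layout is that the regular expression for splitting fields lives in one place, while each bank's quirks are confined to a small subclass. Supporting a new dialect then means adding a class rather than extending a chain of expressions.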
