TemplateMaker does what you need, at least according to its documentation. Instead of receiving the template as input, it infers ("learns") it from multiple documents. It then has a method, extract, for extracting data from other documents that were created using this template.
This example shows extract in use (t is a template that has already been learned):
>>> t.extract('<b>red and green</b>')
('red', 'green')
>>> t.extract('<b>django and stephane</b>')
('django', 'stephane')
>>> t.extract('<b> spacy  and <u>underlined</u></b>')
(' spacy ', '<u>underlined</u>')
>>> t.extract('this and that')
Traceback (most recent call last):
...
See also Perl's Template::Extract.
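To make the learn-then-extract idea concrete, here is a minimal sketch of how such a tool could work. It is not TemplateMaker's actual implementation: the class name SketchTemplate, the HOLE marker, and the use of difflib plus a regular expression are all assumptions made for illustration, and a ValueError stands in for whatever exception the real library raises. Text shared by all samples is kept as literal template text, and everything that differs becomes a hole.

```python
import re
from difflib import SequenceMatcher

HOLE = "\0"  # placeholder for a variable region; assumed absent from real input


class SketchTemplate:
    """Toy learn/extract illustration; not TemplateMaker's implementation."""

    def __init__(self):
        self.pattern = None  # literal text with HOLE marking each variable part

    def learn(self, text):
        """Generalize the template so that it also covers `text`."""
        if self.pattern is None:
            self.pattern = text  # the first sample is taken verbatim
            return
        matcher = SequenceMatcher(None, self.pattern, text, autojunk=False)
        pieces, a_end, b_end = [], 0, 0
        for a, b, size in matcher.get_matching_blocks():
            if a > a_end or b > b_end:
                pieces.append(HOLE)  # the template and the sample disagree here
            pieces.append(self.pattern[a:a + size])  # shared literal text
            a_end, b_end = a + size, b + size
        self.pattern = "".join(pieces)

    def as_text(self, marker):
        """Render the template with `marker` shown in place of each hole."""
        return self.pattern.replace(HOLE, marker)

    def extract(self, text):
        """Return the hole contents of `text`, or raise ValueError."""
        if self.pattern is None:
            raise ValueError("nothing has been learned yet")
        literals = [re.escape(part) for part in self.pattern.split(HOLE)]
        match = re.fullmatch("(.*?)".join(literals), text, re.DOTALL)
        if match is None:
            raise ValueError("text does not fit the learned template")
        return match.groups()


t = SketchTemplate()
t.learn("<b>this and that</b>")
t.learn("<b>alex and sue</b>")
print(t.as_text("!"))                      # <b>! and !</b>
print(t.extract("<b>red and green</b>"))   # ('red', 'green')
```

This character-level diff is deliberately naive: samples that happen to share a stray letter inside a variable part (say "that" and "tree") would leave that letter in the template as literal text. A real tool needs a more careful alignment, so treat this only as a picture of the idea.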