Is it possible / practical to build one regular expression that matches hierarchical data?
For instance:
<h1>Action</h1>
<h2>Title1</h2><div>data1</div>
<h2>Title2</h2><div>data2</div>
<h1>Adventure</h1>
<h2>Title3</h2><div>data3</div>
I would like to get the following matches:
"Action", "Title1", "data1"
"Action", "Title2", "data2"
"Adventure", "Title3", "data3"
As I see it, this requires the pattern to know that the data is hierarchical. If I include H1 in the pattern, it will match only the first record under each heading. If I leave H1 out of the pattern, I cannot capture it at all. Are there any special tricks I could use to solve this problem?
This is a .NET project.
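To make the limitation concrete, here is a minimal sketch of the two patterns I have in mind. It is written in Python only to keep it short; the variable names are made up for illustration, and I am assuming that .NET's `Regex.Matches` returns the same matches for these two patterns.

```python
import re

# The sample data from above, one element group per line.
html = (
    "<h1>Action</h1>\n"
    "<h2>Title1</h2><div>data1</div>\n"
    "<h2>Title2</h2><div>data2</div>\n"
    "<h1>Adventure</h1>\n"
    "<h2>Title3</h2><div>data3</div>\n"
)

# Pattern that includes the H1: each match consumes the heading,
# so only the first H2/DIV pair under each H1 is returned.
with_h1 = re.compile(r"<h1>(.*?)</h1>\s*<h2>(.*?)</h2><div>(.*?)</div>")
rows_with_h1 = with_h1.findall(html)

# Pattern without the H1: every H2/DIV pair is found,
# but the heading it belongs to is lost.
without_h1 = re.compile(r"<h2>(.*?)</h2><div>(.*?)</div>")
rows_without_h1 = without_h1.findall(html)

print(rows_with_h1)
print(rows_without_h1)
```

The first pattern returns only the "Title1" and "Title3" records and skips "Title2"; the second returns all three titles but without "Action" or "Adventure". Neither gives the three full rows I am after.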