XPath is the easiest way to get values from XML and HTML documents (provided they are well-formed).
You want this expression:
//div[text() = 'Home telephone']/@id
Which reads: "Find all divs whose text value is 'Home telephone' and return the id attribute of each one that matches."
Depending on your language, there are usually several built-in or third-party (and free) XPath interpreters available.
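As a minimal sketch in Python, assuming the markup is well-formed XML: the standard library's xml.etree.ElementTree implements only a subset of XPath (no text() function and no trailing /@id step), so the predicate is written as [.='...'] (Python 3.7 or later) and the id is read from each matched element. The sample markup, its ids, and the function name are made up for illustration. A full XPath engine such as lxml should accept the expression above as written.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the ids and labels are invented for this sketch.
SAMPLE = """
<html>
  <body>
    <div id="f1">Work telephone</div>
    <div id="f2">Home telephone</div>
    <div id="f3">Mobile</div>
  </body>
</html>
"""


def ids_of_home_telephone_divs(markup):
    """Return the id attribute of every div whose text is 'Home telephone'."""
    root = ET.fromstring(markup)
    # ElementTree's XPath subset: [.='...'] compares the element's full text content.
    matches = root.findall(".//div[.='Home telephone']")
    # There is no /@id step in this subset, so read the attribute in Python.
    return [div.get("id") for div in matches]


ids = ids_of_home_telephone_divs(SAMPLE)
print(ids)
```

Note that [.='...'] compares the complete text content of an element, including its descendants, so it is slightly broader than text() = '...' in full XPath.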
It's a bad idea to parse HTML with regular expressions, because HTML is not a regular language. Regular expressions break down on even simple HTML markup, because they cannot handle nesting correctly, and HTML is an inherently nested structure.
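To illustrate the nesting problem, here is a small Python sketch with made-up input strings. Neither the lazy nor the greedy pattern can pair an opening div with its own closing tag:

```python
import re

# Hypothetical inputs chosen to show the two failure modes.
nested = "<div>outer <div>inner</div> tail</div>"
siblings = "<div>a</div><div>b</div>"

# Lazy match stops at the first </div>, which belongs to the inner div.
lazy = re.search(r"<div>(.*?)</div>", nested).group(1)

# Greedy match runs to the last </div>, swallowing both sibling divs.
greedy = re.search(r"<div>(.*)</div>", siblings).group(1)

print(repr(lazy))    # 'outer <div>inner'
print(repr(greedy))  # 'a</div><div>b'
```

A parser tracks which closing tag belongs to which opening tag, which is why the XPath approach does not have this problem.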