Search for html element id based on display text

Question

Given the following html:

<div id="f52_lblQuestionWording" title="" style="width:auto;height:auto; display: inline; overflow: hidden;" >Home telephone</div>

I want to automatically get the id of the containing div element using the string "Home telephone". Does anyone know how I can do this with a regular expression?

The string used to look up the id is not always the same, and the HTML is dynamically generated, so it may vary slightly from time to time. I am working on UI test automation for a company project using Selenium.

Thanks.

regex

user228178 Dec 9 '09 at 18:50

3 answers

I'm not sure what you mean by using the string "Home telephone", but here are a few ways to do this:

 /id=(.*?)\s+.*(?=Home telephone)/

where the (?=...) construct is a positive lookahead, if your programming language supports it.
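As a rough illustration, here is that pattern applied to the div from the question using Python's `re` module. This is only a sketch: the function name `find_id_by_text` is made up for this example, and because the pattern captures everything up to the first whitespace, the surrounding quotes are captured too and are stripped afterwards.

```python
import re

# The div from the question, on a single line.
HTML = (
    '<div id="f52_lblQuestionWording" title="" '
    'style="width:auto;height:auto; display: inline; overflow: hidden;" >'
    'Home telephone</div>'
)


def find_id_by_text(markup, text):
    """Hypothetical helper: return the id that precedes `text`, or None."""
    # re.escape keeps characters in the display text from being read as regex syntax.
    pattern = re.compile(r'id=(.*?)\s+.*(?=' + re.escape(text) + r')')
    match = pattern.search(markup)
    if match is None:
        return None
    # Group 1 still includes the attribute's quotes, so strip them.
    return match.group(1).strip('"')


print(find_id_by_text(HTML, "Home telephone"))  # f52_lblQuestionWording
```

Note that if several elements share one line, the match starts at the first `id=` on that line, which is not necessarily the element that contains the text.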

Another way is to just grep for "Home telephone" and then grab the id value using awk or sed.

ennuikiller Dec 9 '09 at 18:55

In C#, you would create a regular expression that looks like this:

 string elementText = "Home\\stelephone"; // you can change this as needed
 Regex regex = new Regex(
     "id=\"(.*?)\"\\s+.*(?=" + elementText + ")",
     RegexOptions.IgnoreCase
     | RegexOptions.CultureInvariant
     | RegexOptions.IgnorePatternWhitespace
     | RegexOptions.Compiled
 );
 // Capture all Matches in the InputText
 MatchCollection ms = regex.Matches(InputText);

InputText is the content of your HTML file, read into a string.

ddc0660 Dec 9 '09 at 19:36

Welbog · Accepted Answer · 2009-12-09T19:23:14+0000

XPath is the easiest way to get values from XML and HTML documents (provided they are well-formed).

You want this expression:

 //div[text() = 'Home telephone']/@id

Which reads: "Find all the divs whose text value is 'Home telephone' and return the id attribute of everything that matches."

Depending on your language, there are usually several built-in or third-party (and free) XPath interpreters available.
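For instance, a minimal sketch with Python's standard-library `xml.etree.ElementTree` might look like the following. Some assumptions to be aware of: ElementTree implements only a subset of XPath, so the predicate `[.='Home telephone']` (available in Python 3.7 and later) stands in for `text() = 'Home telephone'`, and the id is read with `.get("id")` instead of `/@id`. The `<body>` wrapper, the second div, and the helper name `ids_for_text` are invented for the example, and the markup must be well-formed for this parser to accept it.

```python
import xml.etree.ElementTree as ET

# The div from the question, plus a second made-up div to show the filter working.
HTML = (
    '<body>'
    '<div id="f52_lblQuestionWording" title="" '
    'style="width:auto;height:auto; display: inline; overflow: hidden;" >'
    'Home telephone</div>'
    '<div id="other_label">Work telephone</div>'
    '</body>'
)


def ids_for_text(markup, text):
    """Hypothetical helper: ids of all divs whose text equals `text`.

    Assumes `text` contains no single quote, since it is placed
    directly inside the quoted XPath predicate.
    """
    root = ET.fromstring(markup)
    return [div.get("id") for div in root.findall(f".//div[.='{text}']")]


print(ids_for_text(HTML, "Home telephone"))  # ['f52_lblQuestionWording']
```

A full XPath engine (a third-party library, or the XPath locators built into a browser automation tool) should accept the original expression as written.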

It's a bad idea to parse HTML with regular expressions, because HTML is not a regular language. Regular expressions can't handle even the simplest cases of HTML, because they can't handle nesting properly, and HTML is an inherently nested structure.
