Search for html element id based on display text

Given the following html:

<div id="f52_lblQuestionWording" title="" style="width:auto;height:auto; display: inline; overflow: hidden;" >Home telephone</div> 

I want to automatically get the id of the containing div element using the string "Home telephone". Does anyone know how I can do this with a regular expression?

The string used to look up the id is not always the same, and the HTML is dynamically generated, so it may vary slightly from time to time. I am working on automating user interface tests for a company project using Selenium.

Thanks.

3 answers

XPath is the easiest way to get values from XML and HTML documents (provided they are well-formed).

You want this expression:

 //div[text() = 'Home telephone']/@id 

This reads: "Find all divs whose text value is 'Home telephone' and return the id attribute of everything that matches."

Depending on your language, there are usually several built-in or third-party (and free) XPath interpreters available.
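As a rough illustration of the XPath approach, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It assumes the markup is well-formed enough to parse as XML. ElementTree only supports a subset of XPath, so the predicate is written as `[.='...']` instead of `[text() = '...']`, and the id is read from the matched element rather than selected with `/@id`. The wrapper document, the second div, and the helper name `ids_for_text` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample markup: the div from the question plus a hypothetical sibling.
MARKUP = """<html><body>
<div id="f52_lblQuestionWording" title="" style="width:auto;height:auto; display: inline; overflow: hidden;">Home telephone</div>
<div id="f53_lblQuestionWording">Work telephone</div>
</body></html>"""


def ids_for_text(markup, text):
    """Return the id of every div whose full text content equals `text`.

    Assumes `text` contains no single quote, since it is pasted
    directly into the path expression.
    """
    root = ET.fromstring(markup)
    # ".//div" searches all descendants; "[.='...']" compares text content.
    matches = root.findall(f".//div[.='{text}']")
    return [div.get("id") for div in matches]


print(ids_for_text(MARKUP, "Home telephone"))
```

In a Selenium test, the original expression without the trailing `/@id` could presumably be passed to an XPath locator, with the id then read from the element that comes back.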

It is not a good idea to parse HTML with regular expressions, because HTML is not a regular language. Regular expressions cannot reliably handle even simple HTML, because they cannot match nested constructs correctly, and HTML is an inherently nested structure.
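The nesting problem can be seen in a small Python sketch. The nested markup below is invented for the demonstration: a naive regex pairs the outer opening tag with the first closing tag it finds, while a parser that tracks open divs (here the standard library's `html.parser`) attributes the text to the right element. The class name `DivTextFinder` is hypothetical, and the sketch only tracks div elements, so text inside other tags is credited to the nearest enclosing div.

```python
import re
from html.parser import HTMLParser

# Hypothetical nested markup, not taken from the question.
NESTED = '<div id="outer"><div id="inner">Home telephone</div> extra</div>'

# The lazy group stops at the first </div>, which belongs to the inner div,
# so the "content" of the outer div is cut short.
naive = re.search(r'<div id="outer">(.*?)</div>', NESTED).group(1)


class DivTextFinder(HTMLParser):
    """Collect the id of the innermost open div around a given text."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.open_divs = []  # ids of the divs currently open, outermost first
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.open_divs.append(dict(attrs).get("id"))

    def handle_endtag(self, tag):
        if tag == "div" and self.open_divs:
            self.open_divs.pop()

    def handle_data(self, data):
        if self.open_divs and data.strip() == self.wanted:
            self.matches.append(self.open_divs[-1])


finder = DivTextFinder("Home telephone")
finder.feed(NESTED)
print(naive)
print(finder.matches)
```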


I'm not sure what you mean by using the "Home telephone" string, but here are a few ways to do this:

 /id=(.*?)\s+.*(?=Home telephone)/ 

where the (?=...) construct is a positive lookahead, if your programming language supports it.

Another way is to just grep for "Home telephone" and then extract the id value using awk or sed.
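For reference, a minimal Python sketch of this lookahead pattern, with the quotes around the id value added so that only the value is captured. The function name `id_before_text` and the two-div sample line are assumptions for the example. The second call shows a limitation: when several divs share one line, the pattern returns the first id on the line, not necessarily the one belonging to the matching text.

```python
import re


def id_before_text(line, text):
    """Return the first id attribute value on `line` that is followed,
    somewhere later on the same line, by `text`; None if there is none."""
    pattern = r'id="(.*?)"\s+.*(?=' + re.escape(text) + ")"
    match = re.search(pattern, line)
    return match.group(1) if match else None


# The line from the question.
single = ('<div id="f52_lblQuestionWording" title="" '
          'style="width:auto;height:auto; display: inline; overflow: hidden;" >'
          'Home telephone</div>')

# Hypothetical line with two divs, to show the pitfall.
double = ('<div id="a" >Work telephone</div>'
          '<div id="b" >Home telephone</div>')

print(id_before_text(single, "Home telephone"))
print(id_before_text(double, "Home telephone"))  # yields "a", not "b"
```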

0
source

In C#, you would create a regular expression that looks like this:

 using System.Text.RegularExpressions;

 // \s is used for the space because IgnorePatternWhitespace
 // strips literal whitespace from the pattern.
 string elementText = "Home\\stelephone"; // you can change this as needed
 Regex regex = new Regex(
     "id=\"(.*?)\"\\s+.*(?=" + elementText + ")",
     RegexOptions.IgnoreCase | RegexOptions.CultureInvariant |
     RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);

 // Capture all matches in the input text
 MatchCollection ms = regex.Matches(InputText);

InputText is the content of your HTML file, read into a string.


Source: https://habr.com/ru/post/1387515/

