Beautiful Soup - identify a tag by its position next to a comment

I am using Beautiful Soup.

Is there a way to get a tag based on its position next to a comment (something I assumed is not included in the parse tree)?

For example, let's say I have:

    <html>
    <body>
    <p>paragraph 1</p>
    <p>paragraph 2</p>
    <!--text-->
    <p>paragraph 3</p>
    </body>
    </html>

In this example, how can I identify <p>paragraph 2</p>, given that I'm looking for the comment "<!--text-->"?

Thanks for any help.

1 answer

Comments appear in the BeautifulSoup parse tree, like any other node. For example, to find a comment with the text "some comment text" and then print the preceding <p> element, you could do:

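    # BeautifulSoup 3 / Python 2 syntax
    from BeautifulSoup import BeautifulSoup, Comment

    soup = BeautifulSoup('''<html>
    <body>
    <p>paragraph 1</p>
    <p>paragraph 2</p>
    <!--some comment text-->
    <p>paragraph 3</p>
    </body>
    </html>''')

    def right_comment(e):
        # Match only Comment nodes whose text is exactly the one we want
        return isinstance(e, Comment) and e == 'some comment text'

    e = soup.find(text=right_comment)
    # Walk backwards through the comment's siblings to the nearest <p>
    print e.findPreviousSibling('p')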

... which prints:

 <p>paragraph 2</p> 
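
The snippet above uses the old BeautifulSoup 3 import and the Python 2 print statement. If you are on Python 3 with the bs4 package (assuming it was installed with pip install beautifulsoup4), a rough equivalent could look like the sketch below. The helper name is_target_comment and the variable names are only illustrative, and the built-in "html.parser" is just one possible parser choice.

```python
from bs4 import BeautifulSoup, Comment

html = """<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>"""

# "html.parser" ships with Python, so no extra parser package is needed
soup = BeautifulSoup(html, "html.parser")


def is_target_comment(node):
    # Hypothetical helper: accept only Comment nodes with the wanted text
    return isinstance(node, Comment) and node.strip() == "some comment text"


# Comments are string nodes, so they can be matched via the string= filter
comment = soup.find(string=is_target_comment)

# Nearest <p> siblings before and after the comment
previous_p = comment.find_previous_sibling("p")
next_p = comment.find_next_sibling("p")

print(previous_p)
print(next_p)
```

The same comment node also gives you find_next_sibling, so you can reach <p>paragraph 3</p> in the other direction if that is the tag you need.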

Source: https://habr.com/ru/post/1342878/

