Beautiful Soup - identify a tag based on the position next to the comment

Question

I am using Beautiful Soup.

Is there a way to get a tag based on its position next to a comment (something I assumed is not included in the parse tree)?

For example, let's say I have ...

    <html>
    <body>
    <p>paragraph 1</p>
    <p>paragraph 2</p>
    <!--text-->
    <p>paragraph 3</p>
    </body>
    </html>

In this example, how can I find <p>paragraph 2</p>, given that I'm searching for the comment <!--text-->?

Thanks for any help.

python beautifulsoup

Kim Mar 08 '11 at 19:27

1 answer

Mark Longair · Accepted Answer · 2011-03-08T19:47:58+0000

Comments appear in the BeautifulSoup parse tree like any other node. For example, to find a comment with the text "some comment text" and then print the <p> element that precedes it, you could do:

    from BeautifulSoup import BeautifulSoup, Comment

    soup = BeautifulSoup('''<html>
    <body>
    <p>paragraph 1</p>
    <p>paragraph 2</p>
    <!--some comment text-->
    <p>paragraph 3</p>
    </body>
    </html>''')

    # Match only Comment nodes whose text is exactly the one we want
    def right_comment(e):
        return isinstance(e, Comment) and e == 'some comment text'

    e = soup.find(text=right_comment)

    # Walk backwards through the comment's siblings to the nearest <p>
    print e.findPreviousSibling('p')

... which will print:

 <p>paragraph 2</p>
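
The code above targets BeautifulSoup 3 on Python 2. For anyone on Python 3, here is a rough equivalent, assuming the newer bs4 package (Beautiful Soup 4) is installed and using the standard library's html.parser backend. The names html, previous_p and next_p are just illustrative and not part of the original answer, and the lookup of the following <p> is an extra shown for comparison.

```python
from bs4 import BeautifulSoup, Comment

html = """<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>"""

# Assumption: the built-in "html.parser" backend; lxml or html5lib may also work.
soup = BeautifulSoup(html, "html.parser")


def right_comment(node):
    # Comment is a string subclass, so it compares directly with plain text.
    return isinstance(node, Comment) and node == "some comment text"


# In bs4, string= is the current keyword for what BeautifulSoup 3 called text=.
comment = soup.find(string=right_comment)

# Nearest <p> siblings on either side of the comment node.
previous_p = comment.find_previous_sibling("p")
next_p = comment.find_next_sibling("p")

print(previous_p)
print(next_p)
```

With the sample document this should print <p>paragraph 2</p> followed by <p>paragraph 3</p>. Note that find() returns None when no comment matches, so real code would want to check for that before calling the sibling methods.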
