BeautifulSoup: navigating to a div by attribute without findAll

How can I find a specific div by navigating soup attributes, i.e. something like soup.html.body.div? I don't see how to get a specific div with id='idname' that way.

I can do soup.findAll(id='idname')[0] to get a specific tag, but as I understand it, this searches the whole soup.

I assume that navigating to the div through soup attributes would be faster, since it avoids findAll()?

Firebug reports the location as html.body.div[2].form.table[2].tbody.tr[3]... however, executing soup.html.body.div[2] raises a KeyError.

Update:

Say you want to grab the "I'm Feeling Lucky" button on http://www.google.com. Firebug says the XPath is:

/html/body/center/span/center/div[2]/form/div[2]/div[3]/center/input[2]

Is there a way to achieve this without using findAll?

2 answers

The path you get from Firebug is an XPath expression. It is best to use a parser that lets you evaluate XPath directly. I like to use lxml with its etree interface:

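 from lxml import etree

 # Real-world HTML is rarely well-formed XML, so parse it with the HTML parser.
 tree = etree.parse(yourfile, etree.HTMLParser())
 # xpath() returns a list of matching elements.
 lucky = tree.xpath('/html/body/center/span/center/div[2]/form/div[2]/div[3]/center/input[2]')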
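To see what the positional predicates in such a path mean, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for lxml. ElementTree supports only a small XPath subset and needs well-formed markup, and the PAGE string is a hypothetical document made up for illustration, so treat this as a demonstration of the indexing rule rather than a replacement for the lxml call.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup made up for illustration.
PAGE = """
<html><body>
  <div>first</div>
  <div>
    <form>
      <input name="q"/>
      <input name="lucky"/>
    </form>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Positional predicates count from 1: div[2] is the second <div> child.
lucky = root.find("body/div[2]/form/input[2]")

# The same element reached with 0-based Python list indexing.
body = root.find("body")
same = body.findall("div")[1].find("form").findall("input")[1]
```

Both lookups end on the same input element. The path form counts from 1 and the list form counts from 0.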

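There is a findChildren method that gets most of the way there. It is an alias of findAll, so it still searches all descendants unless you restrict it to direct children:

 findAll(tagname, recursive=False)

Limiting the search to direct children usually makes it much more efficient.

XPath positions count from 1 while Python lists count from 0, so every index drops by one. Thus, your example becomes:

 center = soup.html.body.center.span.center
 form = center.findAll('div', recursive=False)[1].form
 inner = form.findAll('div', recursive=False)[1].findAll('div', recursive=False)[2]
 lucky = inner.center.findAll('input', recursive=False)[1]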
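The same idea can be wrapped in a small helper. The sketch below assumes the bs4 package with its built-in "html.parser" backend. The HTML string and the walk() function are hypothetical names introduced here for illustration and are not part of BeautifulSoup. It also shows find(id=...), which returns the first match instead of building a list, and why square brackets on a tag raise KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical page, made up for illustration.
HTML = """
<html><body>
  <div id="first">one</div>
  <div id="idname">
    <form>
      <div>nested</div>
      <input name="q" /><input name="lucky" />
    </form>
  </div>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")


def walk(tag, *steps):
    """Follow (name, xpath_index) steps through direct children only.

    Hypothetical helper, not a BeautifulSoup API.
    """
    for name, xpath_index in steps:
        children = tag.find_all(name, recursive=False)
        tag = children[xpath_index - 1]  # XPath counts from 1, Python from 0
    return tag


# Equivalent of /html/body/div[2]/form/input[2]
lucky = walk(soup, ("html", 1), ("body", 1), ("div", 2), ("form", 1), ("input", 2))

# find() returns the first match (or None) rather than a list.
by_id = soup.find(id="idname")

# Square brackets on a tag look up attributes, which explains the
# KeyError from soup.html.body.div[2] in the question.
indexing_is_attribute_lookup = False
try:
    soup.html.body.div[2]
except KeyError:
    indexing_is_attribute_lookup = True
```

If the element has an id, find(id='idname') is usually the simpler choice. The step-by-step walk only makes sense when the page structure is the one stable thing you can rely on.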

Source: https://habr.com/ru/post/1390856/

