Why is this enumerator stuck?

I am very puzzled. I have a block of HTML that I extracted from a large table. It looks something like this:

<td align="left" class="page">Number:\xc2\xa0<a class="topmenu" href="http://www.example.com/whatever.asp?search=724461">724461</a> Date:\xc2\xa01/1/1999 Amount:\xc2\xa0$2.50 <br/>Person:<br/><a class="topmenu" href="http://www.example.com/whatever.asp?search=LAST&amp;searchfn=FIRST">LAST,\xc2\xa0FIRST </a> </td> 

(Actually, it looked worse, but I reworked the lines many times)

I need to print the lines and split up the Date/Amount line. A good place to start seemed to be finding the children of this HTML block. The block is a string, because that is what the regular expression returned. So I did:

    text_soup = BeautifulSoup(text)
    text_children = text_soup.find('td').childGenerator()

I can iterate over child elements with

    for i, each in enumerate(text_soup.find('td').childGenerator()):
        print type(each)
        print i, ":", each

but not with

    for i, each in enumerate(text_children):
        ...etc

Shouldn't these two be the same? I am confused.

0
2 answers

gnibbler correctly explains that you can only consume a generator once. Just to explain further:

According to the docs, an iterator is an object that represents a stream of data. Once you have consumed the stream (that is, reached its end), iterating over it again yields no data. I had the same problem, but Karl Knechtel's comment cleared it up for me. I hope my explanation is clear.
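A minimal sketch of that behaviour, written in Python 3 and using a plain generator function as a hypothetical stand-in for childGenerator() (the name children and the sample items are made up for illustration, not taken from BeautifulSoup):

```python
def children():
    """Hypothetical stand-in for text_soup.find('td').childGenerator()."""
    for node in ["Number:", "<a>724461</a>", "Date: 1/1/1999"]:
        yield node


text_children = children()

# The first pass walks the generator to its end.
first_pass = list(enumerate(text_children))

# The same generator object is now exhausted, so a second pass sees nothing.
second_pass = list(enumerate(text_children))

# Storing the items in a list allows as many passes as needed.
saved = list(children())
pass_a = list(enumerate(saved))
pass_b = list(enumerate(saved))
```

If something already looped over text_children earlier (for example while debugging), a later loop over the same object would look "stuck" in exactly this way. Calling the generator-producing method again, or keeping the items in a list, avoids that.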

0

The BeautifulSoup childGenerator() method returns an iterator object, the same kind of object that Python's built-in iter() function produces. The iterator has a .next() method that returns the next element, or raises StopIteration when there are no elements left.
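As a rough illustration of that protocol, here is a sketch over a plain list rather than a BeautifulSoup object. It assumes Python 3, where the method is spelled __next__() and is normally called through the built-in next(); in Python 2 the equivalent call is it.next().

```python
it = iter(["a", "b"])

first = next(it)   # first element
second = next(it)  # second element

# A third call has nothing left to return, so StopIteration is raised.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
```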

An enumerator is a special type of iterator. It also has a .next() method, but instead of returning only the next value, it returns a tuple containing a counter and the next value.
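A small sketch of what enumerate() hands back, again over a plain list with made-up sample strings and Python 3 syntax:

```python
pairs = enumerate(iter(["Number:", "Date:"]))

# Each step yields a (counter, value) tuple.
first_pair = next(pairs)
second_pair = next(pairs)

# The counter starts at 0 unless a different start value is passed.
from_one = list(enumerate(["Number:", "Date:"], start=1))
```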

Your for loop unpacks into two names ( i and each ), so the Python interpreter expects the iterator to supply a two-element tuple on every step. If you pass only the iterator returned by childGenerator() , Python gets a single element per step rather than the two it needs, and it chokes. However, if you wrap the iterator in enumerate() , the interpreter gets the required two-element tuple.
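The unpacking point can be sketched as follows, assuming Python 3 and a plain list of strings in place of the real child elements:

```python
items = ["Number:", "Date:", "Amount:"]

# With enumerate(), every step is a (counter, value) tuple, so two names work.
collected = []
for i, each in enumerate(iter(items)):
    collected.append((i, each))

# Without enumerate(), every step is a single string. Python then tries to
# unpack the string's characters into the two names, which fails here
# because none of the sample strings has exactly two characters.
try:
    for i, each in iter(items):
        pass
    unpack_failed = False
except ValueError:
    unpack_failed = True
```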

+1

Source: https://habr.com/ru/post/1447508/

