Return data from HTMLParser handle_starttag

My question is a simpler version of an existing question.

I have an iframe:

<iframe width="560" height="315" src="//www.youtube.com/embed/fY9UhIxitYM" frameborder="0" allowfullscreen></iframe>

I am working on a small web application and need to extract the random-looking video code from the src attribute (in this case fY9UhIxitYM). I want to use only the standard library and not import Beautiful Soup.

from HTMLParser import HTMLParser

class YoutubeLinkParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.data = []

    def handle_starttag(self, tag, attrs):
        data = attrs[2][1].split('/')[-1]
        self.data.append(data)

iframe = open('iframe.html').read()
parser = YoutubeLinkParser()
linkCode = parser.feed(iframe)

The examples I found use handle_data(self, data), but I need the attribute information from the opening tag. I can print the value inside the method, but when I try to capture a return value, linkCode is None.

What am I missing? Thanks!

1 answer

feed() returns None. Store the value on the parser and read it after calling feed():

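from HTMLParser import HTMLParser

class YoutubeLinkParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; src is the third attribute of this iframe
        self.data = attrs[2][1].split('/')[-1]

iframe = open('iframe.html').read()
parser = YoutubeLinkParser()
parser.feed(iframe)  # returns None; the result is stored on the parser
print parser.data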

fY9UhIxitYM
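
The snippet above depends on src being the third attribute and on the document containing only one tag. A slightly more defensive variant might look like the sketch below. It assumes Python 3 (where the module is html.parser rather than HTMLParser), and the attribute name codes and the variable iframe_html are made up for illustration; the iframe markup is inlined instead of being read from iframe.html.

```python
from html.parser import HTMLParser  # Python 3 name; Python 2 uses "from HTMLParser import HTMLParser"


class YoutubeLinkParser(HTMLParser):
    """Collects the last path segment of every iframe's src attribute."""

    def __init__(self):
        super().__init__()
        self.codes = []  # hypothetical attribute name, filled in while feed() runs

    def handle_starttag(self, tag, attrs):
        # Ignore every tag except <iframe>.
        if tag != "iframe":
            return
        # attrs is a list of (name, value) pairs, so look src up by name
        # instead of relying on its position.
        src = dict(attrs).get("src")
        if src:
            self.codes.append(src.rstrip("/").split("/")[-1])


# The iframe from the question, inlined so the example is self-contained.
iframe_html = ('<iframe width="560" height="315" '
               'src="//www.youtube.com/embed/fY9UhIxitYM" '
               'frameborder="0" allowfullscreen></iframe>')

parser = YoutubeLinkParser()
result = parser.feed(iframe_html)  # feed() itself returns None
print(result)
print(parser.codes)
```

Collecting into a list keeps every match if the page has several iframes, whereas assigning to a single attribute keeps only the last one.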

Source: https://habr.com/ru/post/1544802/

