I just installed lxml using easy_install on an Ubuntu 12.04 machine with Python 3.2.3. The lxml version is the latest one, 3.0 alpha.
I tried the following code:

    import lxml.html

    def proc_tweet(ss):
        html = lxml.html.fragment_fromstring(ss)
        ps = html.xpath("//p[@node-type='feed_list_content']")

    def test():
        ss = ''
        f = open('test')
        for l in f:
            ss += l.strip()
        f.close()
        while True:
            proc_tweet(ss)

    if __name__ == '__main__':
        test()

Here "test" is a file containing a short piece of HTML:

    <dl action-type="feed_list_item" mid="3409553360609821" class="feed_list W_linecolor">
      <dd class="content">
        <p node-type="feed_list_content">This is a drill.</p>
      </dd>
      <dd class="clear"></dd>
    </dl>

Problem: from time to time, lxml eats all of my memory. I tried deleting the objects explicitly:

    del ps
    del html

This does not work. Does anyone know why?
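To put numbers on the growth, a minimal sketch along these lines could sample the process's peak memory while the loop runs. It assumes a Unix system (it relies on the standard `resource` module), and `peak_rss_kb` and `measure_growth` are hypothetical helper names for illustration, not part of lxml.

```python
import resource


def peak_rss_kb():
    """Return the peak resident set size of this process.

    On Linux, ru_maxrss is reported in kilobytes (other platforms may differ).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(func, arg, iterations=10000, report_every=1000):
    """Call func(arg) repeatedly and sample peak memory at regular intervals.

    Returns a list of (iteration, peak_rss) tuples.
    """
    samples = []
    for i in range(1, iterations + 1):
        func(arg)
        if i % report_every == 0:
            samples.append((i, peak_rss_kb()))
    return samples
```

Calling `measure_growth(proc_tweet, ss)` in place of the `while True` loop and printing the samples should show whether the peak keeps climbing with the iteration count or levels off.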