I am trying to scrape data from the following page using the lxml module in Python: http://www.thehindu.com/todays-paper/with-afspa-india-has-failed-statute-amnesty/article7376286.ece . I want to get the text of the first paragraph, but the following code returns an empty result:
from lxml import html
import requests
page = requests.get('http://www.thehindu.com/todays-paper/with-afspa-india-has-failed-statute-amnesty/article7376286.ece')
tree = html.fromstring(page.text)
data = tree.xpath('//*[@id="left-column"]/div[6]/p[1]/text()')
print(data)
I do not understand what I'm doing wrong here. Please suggest if there are better ways to do what I'm trying to do.
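One thing that may be worth checking is whether the positional step div[6] matches the HTML that requests actually receives. XPaths copied from a browser often reflect the page after scripts have modified it. Below is a minimal sketch of a less position-dependent query. It runs against SAMPLE_HTML, a made-up snippet that is only an assumption standing in for the real page, and first_paragraph is a hypothetical helper name, so the real markup may need a different expression.

```python
from lxml import html

# Hypothetical markup standing in for the article page. The structure of the
# real page is an assumption here and may differ.
SAMPLE_HTML = """
<html><body>
  <div id="left-column">
    <div class="ad"></div>
    <div class="article-text">
      <p>First <b>paragraph</b> of the article.</p>
      <p>Second paragraph.</p>
    </div>
  </div>
</body></html>
"""


def first_paragraph(source, container_id="left-column"):
    """Return the text of the first <p> inside the container, or None."""
    tree = html.fromstring(source)
    # Select by id only and take the first <p> in document order,
    # instead of relying on a positional step such as div[6].
    paragraphs = tree.xpath('//*[@id=$cid]//p', cid=container_id)
    if not paragraphs:
        return None
    # text_content() also collects text inside nested tags such as <b>.
    return paragraphs[0].text_content().strip()


# The positional query from the question finds nothing in this sample,
# because the container has fewer than six child divs.
brittle = html.fromstring(SAMPLE_HTML).xpath(
    '//*[@id="left-column"]/div[6]/p[1]/text()')
print(brittle)                       # []
print(first_paragraph(SAMPLE_HTML))  # First paragraph of the article.
```

To apply this to the live page, page.text from the question's code could be passed in place of SAMPLE_HTML. If the result is still None, printing part of page.text may show whether the paragraph is present in the downloaded HTML at all.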