I am currently working on a crawler using Scrapy in Python, and I am almost done; I just have one small problem. The website uses this markup for pagination:
<div class="pagination toolbarbloc">
<ul>
<li class="active"><span>1</span></li>
<li><a href="...">2</a></li>
<li><a href="...">3</a></li>
<li><a href="...">4</a></li>
<li><a href="...">5</a></li>
<li><a class="end" href="...">>></a></li>
</ul>
</div>
So I am trying to get the "href" from the li tag right after the li with class "active".
I am trying something like this:
next_page_url_xpath = '//div[@class="pagination toolbarbloc"]/ul/following-sibling::li[@class="active"]/a/@href'
but it did not work: IndexError: list index out of range
I am just starting with XPath, and I know this is simple, but even after reading a lot of documentation I have not managed to get it working.
Many thanks to those who help me!
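
To make the goal concrete, here is a minimal sketch of the selection I am after. My guess is that `following-sibling::li` has to start from the active li rather than from the ul, and that `[1]` is needed to keep only the first sibling. The sketch uses lxml directly instead of a Scrapy response so it can run on its own. The href values, `SAMPLE`, `NEXT_PAGE_XPATH` and the helper `next_page_url` are hypothetical names I made up for illustration.

```python
from lxml import html

# Sample markup mirroring the pagination block; the href values are
# hypothetical placeholders, since the real ones are elided in the question.
SAMPLE = """
<div class="pagination toolbarbloc">
<ul>
<li class="active"><span>1</span></li>
<li><a href="?page=2">2</a></li>
<li><a href="?page=3">3</a></li>
<li><a class="end" href="?page=5">&gt;&gt;</a></li>
</ul>
</div>
"""

# Start from the active <li>, then take only its first following sibling <li>.
NEXT_PAGE_XPATH = (
    '//div[@class="pagination toolbarbloc"]/ul'
    '/li[@class="active"]/following-sibling::li[1]/a/@href'
)


def next_page_url(page_source):
    """Return the href of the page after the active one, or None if there is none."""
    tree = html.fromstring(page_source)
    matches = tree.xpath(NEXT_PAGE_XPATH)
    # Check for an empty result instead of indexing blindly;
    # indexing an empty list is what raises IndexError.
    return matches[0] if matches else None


print(next_page_url(SAMPLE))
```

In a Scrapy callback the same expression could presumably be passed to `response.xpath(...)`, where `.get()` returns `None` instead of raising when nothing matches.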