Pagination: XPath for a crawler in Python

I am currently working on a crawler using Scrapy in Python. I am almost done, but I have one small problem. The website uses this pagination markup:

<div class="pagination toolbarbloc">
            <ul>
                    <li class="active"><span>1</span></li>
                    <li><a href="...">2</a></li>
                    <li><a href="...">3</a></li>
                    <li><a href="...">4</a></li>
                    <li><a href="...">5</a></li>
                    <li><a class="end" href="...">>></a></li>
            </ul>
        </div>

So, I'm trying to get the "href" from the li tag right after the li with class "active".

I am trying something like this:

next_page_url_xpath = '//div[@class="pagination toolbarbloc"]/ul/following-sibling::li[@class="active"]/a/@href'

but this did not work: IndexError: list index out of range

I am just starting with XPath, and I know this is simple, but even after reading a lot of documentation I have not been successful.

Many thanks to those who help me!

1 answer

Try the following expression:

//div[@class="pagination toolbarbloc"]/ul/li[@class="active"]/following-sibling::li/a/@href

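The li elements are children of ul, not its siblings, so following-sibling::li has to be applied to the active li rather than to ul. Note also that an attribute test needs the @ prefix, as in [@class="pagination toolbarbloc"].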
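To see the difference between the two expressions, here is a minimal sketch that runs them against markup modeled on the question. It assumes lxml is installed (Scrapy selectors are built on it); the href values and the PAGE variable are made up for illustration, since the real links are elided in the question. It also adds a [1] predicate, which is my assumption about what "the next page" should mean: only the li immediately after the active one.

```python
from lxml import html

# Sample markup modeled on the question. The href values are hypothetical;
# the original post elides them as "...".
PAGE = """
<html><body>
<div class="pagination toolbarbloc">
  <ul>
    <li class="active"><span>1</span></li>
    <li><a href="/items?page=2">2</a></li>
    <li><a href="/items?page=3">3</a></li>
    <li><a class="end" href="/items?page=5">&gt;&gt;</a></li>
  </ul>
</div>
</body></html>
"""

tree = html.fromstring(PAGE)
base = '//div[@class="pagination toolbarbloc"]/ul'

# Original attempt: looks for <li> siblings of <ul>, and there are none,
# so the result is empty and indexing it with [0] raises IndexError.
broken = tree.xpath(base + '/following-sibling::li[@class="active"]/a/@href')

# Expression from the answer: every link after the active item.
all_following = tree.xpath(base + '/li[@class="active"]/following-sibling::li/a/@href')

# With [1], only the nearest following sibling is selected.
next_page = tree.xpath(base + '/li[@class="active"]/following-sibling::li[1]/a/@href')

print(broken)
print(all_following)
print(next_page)

# On the last page there may be no following <li>, so check before indexing.
next_url = next_page[0] if next_page else None
```

In a Scrapy spider the same expressions can be passed to response.xpath(...); calling .get() on the result returns None instead of raising when nothing matches, which avoids the IndexError on the last page.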


Source: https://habr.com/ru/post/1685422/

