Scrapy: next button uses javascript

I am trying to scrape this site and I want to get all the RNs in it ... I can scrape the data, but I cannot get to the next page because the link uses JavaScript. I tried reading other questions, but I did not understand them. This is my code:

    class MySpider(CrawlSpider):
        name = "commu"
        allowed_domains = [""]
        start_urls = ["", ]
        rules = (
            Rule(SgmlLinkExtractor(allow=(r'\d+'), restrict_xpaths=('*')),
                 callback="parse_items", follow=True),
        )

The next button looks like this:

 <a href="Javascript: Move('next')">Next</a> 

This pagination is killing me ...

1 answer

In short, you need to find out what Move('next') does and replicate that in your code.

A quick look at the site's source shows that the function is defined as follows:

    function Move(strIndicator) {
        document.frm.move_indicator.value = strIndicator;
        document.frm.submit();
    }

And document.frm is a form called "frm":

 <form name="frm" action="joblist.asp" method="post"> 

So basically you need to build a request that POSTs this form with move_indicator set to 'next'. This is easy to do with the FormRequest class (see the docs), for example:

 return FormRequest.from_response(response, formname="frm", formdata={'move_indicator': 'next'}) 

This approach works in most cases. The hard part is figuring out what the JavaScript code does; sometimes it is obfuscated and does needlessly complicated things just to prevent scraping.
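If it helps to see what that POST actually contains, here is a rough standard-library sketch, independent of Scrapy, that mimics Move('next'): it reads the inputs of the form named "frm", sets move_indicator to 'next', and builds the request without sending it. The sample HTML, the example.com URL, the extra hidden "page" field, and the helper names FormFieldParser and build_next_page_request are all made up for illustration; only the form tag and the move_indicator field come from the site.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormFieldParser(HTMLParser):
    """Collect the action URL and <input> values of one named <form>."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.in_form = False
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("name") == self.form_name:
            self.in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self.in_form and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_next_page_request(page_url, page_html):
    """Build (but do not send) the POST that Move('next') would trigger."""
    parser = FormFieldParser("frm")
    parser.feed(page_html)
    fields = dict(parser.fields)
    # Same effect as: document.frm.move_indicator.value = 'next'
    fields["move_indicator"] = "next"
    body = urlencode(fields).encode("ascii")
    # Same effect as: document.frm.submit()
    return Request(urljoin(page_url, parser.action), data=body, method="POST")


# Hypothetical page: the "page" input is invented, the <form> tag is from the site.
SAMPLE_HTML = """
<form name="frm" action="joblist.asp" method="post">
  <input type="hidden" name="move_indicator" value="">
  <input type="hidden" name="page" value="1">
</form>
"""

request = build_next_page_request("http://example.com/jobs/joblist.asp", SAMPLE_HTML)
print(request.get_method(), request.full_url)
print(request.data)
```

In a spider you would not do this by hand, since FormRequest.from_response takes care of reading the form and merging in formdata. The sketch is only meant to show which values end up in the request, which is handy when comparing against what the browser sends.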


