I want to use a proxy IP for web scraping with Scrapy. To use the proxy server, I set the environment variable http_proxy as indicated in the documentation.
$ export http_proxy=http://proxy:port
To check whether the IP change works, I created a new spider named test:
from scrapy.spider import BaseSpider
from scrapy.contrib.spiders import CrawlSpider, Rule

class TestSpider(CrawlSpider):
    name = "test"
    domain_name = "whatismyip.com"
    start_urls = ["http://whatismyip.com"]

    def parse(self, response):
        print response.body
        open('check_ip.html', 'wb').write(response.body)
But when I run this spider, check_ip.html does not show the proxy IP from the environment variable; instead it shows my original IP address, the same as before the crawl.
What is the problem? Is there an alternative way to check whether I am going through the proxy server or not? Or is there another way to use a proxy IP?
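One quick sanity check (a sketch, not Scrapy-specific): Scrapy's built-in HttpProxyMiddleware picks up proxy settings via Python's standard getproxies() helper, so you can verify from inside a Python process whether the exported variable is actually visible. The proxy address below is a hypothetical placeholder. This sketch uses Python 3 (urllib.request.getproxies); in Python 2, the era of the spider above, the equivalent is urllib.getproxies().

```python
import os
from urllib.request import getproxies  # Python 2: from urllib import getproxies

# Simulate the exported variable (hypothetical proxy address and port).
# In a real check, skip this line and rely on the shell's export.
os.environ["http_proxy"] = "http://proxy:8080"

# If getproxies() comes back empty here, the export never reached
# the process that launched Scrapy, which would explain the result.
print(getproxies())
```

If the variable is visible but the original IP still appears, an alternative worth trying is setting the proxy per request via request.meta['proxy'] instead of relying on the environment.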
Vipul