Python: adding an int variable to a URL string

Question


    pgno = 1
    while pgno < 4304:
        result = urllib.urlopen("http://www.example.com/traderesources/pincode.aspx?" + "&GridInfo=Pincode0" + pgno)
        print pgno
        html = result.read()
        parser = etree.HTMLParser()
        tree = etree.parse(StringIO.StringIO(html), parser)
        pgno += 1

In http://.......=Pincode0 I need to append a number, for example "Pincode01", then "Pincode02", "Pincode03", and so on. I use a while loop for this, and the counter variable is pgno.

The problem is that the counter increases by 1, but "Pincode01" does not become "Pincode02", so the second page of the site is never opened.

I even tried + str(pgno), with no luck.

Please show me how to do this. I have tried several times and cannot get it to work.


variables python string http url

NCS Aug 27 '11 at 10:31


3 answers

eyquem · Answer 1 · 2011-08-27T10:59:31+0000

Perhaps you want this:

    from urllib import urlopen
    import re

    pgno = 2
    url = "http://www.eximguru.com/traderesources/pincode.aspx?&amp;GridInfo=Pincode0%s" % str(pgno)
    print url + '\n'

    # Download the page
    sock = urlopen(url)
    htmlcode = sock.read()
    sock.close()

    # Position in the HTML from which the table rows are searched
    x = re.search('%;"><a href="javascript:__doPostBack', htmlcode).start()

    # One table row: pincode followed by three text columns
    pat = ('\t\t\t\t<td style="width:\d+%;">(\d+)</td>'
           '<td style="width:\d+%;">(.+?)</td>'
           '<td style="width:\d+%;">(.+?)</td>'
           '<td style="width:30%;">(.+?)</td>\r\n')
    regx = re.compile(pat)

    print '\n'.join(map(repr, regx.findall(htmlcode, x)))

result

    http://www.eximguru.com/traderesources/pincode.aspx?&amp;GridInfo=Pincode02

    ('110001', 'New Delhi', 'Delhi', 'Baroda House')
    ('110001', 'New Delhi', 'Delhi', 'Bengali Market')
    ('110001', 'New Delhi', 'Delhi', 'Bhagat Singh Market')
    ('110001', 'New Delhi', 'Delhi', 'Connaught Place')
    ('110001', 'New Delhi', 'Delhi', 'Constitution House')
    ('110001', 'New Delhi', 'Delhi', 'Election Commission')
    ('110001', 'New Delhi', 'Delhi', 'Janpath')
    ('110001', 'New Delhi', 'Delhi', 'Krishi Bhawan')
    ('110001', 'New Delhi', 'Delhi', 'Lady Harding Medical College')
    ('110001', 'New Delhi', 'Delhi', 'New Delhi Gpo')
    ('110001', 'New Delhi', 'Delhi', 'New Delhi Ho')
    ('110001', 'New Delhi', 'Delhi', 'North Avenue')
    ('110001', 'New Delhi', 'Delhi', 'Parliament House')
    ('110001', 'New Delhi', 'Delhi', 'Patiala House')
    ('110001', 'New Delhi', 'Delhi', 'Pragati Maidan')
    ('110001', 'New Delhi', 'Delhi', 'Rail Bhawan')
    ('110001', 'New Delhi', 'Delhi', 'Sansad Marg Hpo')
    ('110001', 'New Delhi', 'Delhi', 'Sansadiya Soudh')
    ('110001', 'New Delhi', 'Delhi', 'Secretariat North')
    ('110001', 'New Delhi', 'Delhi', 'Shastri Bhawan')
    ('110001', 'New Delhi', 'Delhi', 'Supreme Court')
    ('110002', 'New Delhi', 'Delhi', 'Rajghat Power House')
    ('110002', 'New Delhi', 'Delhi', 'Minto Road')
    ('110002', 'New Delhi', 'Delhi', 'Indraprastha Hpo')
    ('110002', 'New Delhi', 'Delhi', 'Darya Ganj')

I wrote this code after studying the structure of the HTML source code with the following code (I think you will understand this without any further explanation):

    from urllib2 import Request, urlopen
    import re
    from collections import defaultdict

    pgno = 2
    url = "http://www.eximguru.com/traderesources/pincode.aspx?&amp;GridInfo=Pincode0%s" % str(pgno)
    print url + '\n'
    sock = urlopen(url)
    htmlcode = sock.read()
    sock.close()

    li = htmlcode.splitlines(True)
    print '\n'.join(str(i) + ' ' + repr(line) + '\n'
                    for i, line in enumerate(li) if 275 < i < 300)

    ch = ''.join(li[0:291])
    didi = defaultdict(int)
    for c in ch:
        didi[c] += 1

    print '\n\n' + repr(li[289])
    print '\n'.join('%r -> %s' % (c, didi[c]) for c in li[289] if didi[c] < 35)


Now the problem is that the same HTML is returned for all pgno values. The site may be detecting that a program, rather than a browser, is connecting to retrieve data. This should be manageable with the tools in urllib2, but I am not familiar with them.
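One thing that is sometimes tried when a site seems to treat scripts differently is sending a browser-like User-Agent header. Nothing in this thread confirms that this is what the site checks, so the following is only a sketch under that assumption. It uses Python 3's urllib.request (the successor of urllib2); the header value, the helper name build_request, and the repaired example.com URL are all made up for illustration, and the request is only constructed, not sent.

```python
from urllib.request import Request

# Hypothetical header value; any browser-like string could be tried here.
BROWSER_USER_AGENT = "Mozilla/5.0 (compatible; example-script)"


def build_request(url):
    """Return a Request object carrying a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": BROWSER_USER_AGENT})


# Placeholder URL modelled on the question; not a real endpoint.
req = build_request("http://www.example.com/traderesources/pincode.aspx?&GridInfo=Pincode02")

# Request capitalizes header names, so the key is stored as "User-agent".
print(req.get_header("User-agent"))
```

Passing req to urllib.request.urlopen would send the request with that header; whether the site then returns different pages is something only a test against the real site can show.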

Facundo Casco · Answer 2 · 2011-08-27T15:09:24+0000

If your problem is with the number format, use this instead of adding str to int:

    >>> pgno = 1
    >>> while pgno < 20:
    ...     print '%02d' % pgno
    ...     pgno += 1
    ...
    01
    02
    03
    04
    05
    06
    07
    08
    09
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19

Read more about string formatting for more options.

Also, a more Pythonic way is to use str.format:

    >>> for pgno in range(9, 12):
    ...     print '{0:02d}'.format(pgno)
    ...
    09
    10
    11
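Applied to the question's loop, either formatting style can build the page URL directly. The sketch below assumes the site expects two-digit, zero-padded page numbers after "Pincode"; the question only shows "Pincode01" and "Pincode02", so what the site wants from page 10 onward is a guess. BASE_URL is the question's placeholder address with the slashes assumed, and page_url is a hypothetical helper. It is written in Python 3 syntax.

```python
# Placeholder address modelled on the question; the path separators are assumed.
BASE_URL = "http://www.example.com/traderesources/pincode.aspx?&GridInfo=Pincode"


def page_url(pgno):
    """Build the URL for one page, zero-padding the page number to two digits."""
    return "{0}{1:02d}".format(BASE_URL, pgno)


# Print the first three page URLs instead of fetching them.
for pgno in range(1, 4):
    print(page_url(pgno))
```

Keeping the URL construction in one small function makes it easy to change the padding rule once the real numbering scheme of the site is known.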

Mikko Ohtamaa · Answer 3 · 2011-08-27T11:01:01+0000

The loop:

    pgno = 1
    while pgno < 4304:
        print pgno
        pgno += 1

It works correctly and the number increases.

Either you are describing the problem incorrectly, or the problem lies in your basic assumptions. Could you first describe what you are trying to do?
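One detail worth checking in the question's code: as written, the expression "&GridInfo=Pincode0" + pgno joins a str and an int, which Python rejects with a TypeError. A minimal sketch of the difference, in Python 3 syntax (the variable names prefix and query are just for illustration):

```python
prefix = "&GridInfo=Pincode0"
pgno = 1

try:
    # str + int is not allowed, so this line raises TypeError.
    query = prefix + pgno
except TypeError as exc:
    query = None
    print("TypeError:", exc)

# Converting the int to a str first makes the concatenation valid.
query = prefix + str(pgno)
print(query)  # &GridInfo=Pincode01
```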
