Fix webpage character encoding with Python mechanize

I am trying to submit a form on this page using Mechanize.

    import mechanize

    br = mechanize.Browser()
    br.open("http://mspc.bii.a-star.edu.sg/tankp/run_depth.html")
    # select the form to fill
    br.select_form(nr=0)
    # input for the form
    br['pdb_id'] = '1atp'
    req = br.submit()

However, this gives the following error:

 mechanize._form.ParseError: expected name token at '<! INPUT PDB FILE>\n\t' 

I believe this is due to some inappropriate character encoding. I would like to know how to fix this.

1 answer

The page contains some broken HTML comment tags, which make the HTML invalid, so mechanize's default parser cannot read it. However, you can use the bundled BeautifulSoup-based parser (RobustFactory), which works in my case (Python 2.7.9, mechanize 0.2.5):

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import mechanize

    br = mechanize.Browser(factory=mechanize.RobustFactory())
    br.open('http://mspc.bii.a-star.edu.sg/tankp/run_depth.html')
    br.select_form(nr=0)
    br['pdb_id'] = '1atp'
    response = br.submit()
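
If switching parsers is not an option, another possible approach is to repair the markup before it is parsed. The sketch below is not part of the original answer: it assumes the only defect is pseudo-comments of the form `<! INPUT PDB FILE>` (as quoted in the error message) and rewrites them as well-formed comments. The helper name `fix_broken_comments` and the sample string are hypothetical, and the function has not been run against the actual page.

```python
import re

# "<!" followed by whitespace is neither a well-formed comment ("<!--")
# nor a declaration such as "<!DOCTYPE ...>", so it is treated as broken.
_BROKEN_COMMENT = re.compile(r"<!\s+([^>]*)>")


def fix_broken_comments(html):
    """Rewrite '<! text>' pseudo-comments as '<!-- text -->' (hypothetical helper)."""
    return _BROKEN_COMMENT.sub(
        lambda m: "<!-- " + m.group(1).strip() + " -->", html
    )


# Illustrative input modelled on the fragment quoted in the error message.
sample = "<form><! INPUT PDB FILE>\n\t<input name='pdb_id'></form>"
print(fix_broken_comments(sample))
```

The cleaned text would then have to be handed back to the browser before `select_form` is called. mechanize documents `response.set_data()` and `Browser.set_response()` for that kind of substitution, but that step is left out here because it has not been verified on this site.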

Source: https://habr.com/ru/post/988133/

