You need to pass the header User-Agent
in order for it to work:
r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})
To get script
what you want, you can check for the presence swc_market_lists
in the text:
script = tree.xpath('//script[contains(., "swc_market_lists")]/text()')[0]
print(script)
To extract the value of a variable swc_market_lists
:
import re
data = re.search(r"var swc_market_lists = (.*?);$", script).group(1)
print(data)
, , json.loads()
Python:
import json
data = json.loads(data)