I have a JavaScript bookmarklet that shows the source code of an HTML page:
javascript:h=document.getElementsByTagName('html')[0].innerHTML;function%20disp(h){h=h.replace(/</g,%20'\n&lt;');h=h.replace(/>/g,'&gt;');document.getElementsByTagName('body')[0].innerHTML='<pre>&lt;html&gt;'+h.replace(/(\n|\r)+/g,'\n')+'&lt;/html&gt;</pre>';}void(disp(h));
I saved the code as a bookmark in Firefox. After a web page has loaded, I select the bookmark and it shows the page's source code.
Now I am trying to save an HTML page to a file using Python:
    import urllib2
    from BeautifulSoup import BeautifulSoup

    # Fetch the raw HTML and parse it
    page = urllib2.urlopen("http://www.doctorisin.net/")
    soup = BeautifulSoup(page)
    print soup.prettify()

    # Save the prettified markup to a file
    fp = open('file.txt', 'wb')
    fp.write(soup.prettify())
    fp.close()
But the saved file does not contain everything that the JavaScript bookmarklet shows; the two outputs do not match. It looks as if the Python code is not getting all of the markup (JavaScript and CSS tags) from the HTML page. What is the problem? Am I doing something wrong? Any help is appreciated.
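To make "does not match" more concrete, here is a rough sketch of how the two outputs could be compared by tag count. It is written for Python 3 and uses only the standard library. The names TagCounter, count_tags and missing_tags are made up for this illustration, and the two sample strings are placeholders standing in for the file written by Python and the text shown by the bookmarklet.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every opening tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html_text):
    """Return a Counter mapping tag name -> number of opening tags."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def missing_tags(fetched_html, browser_html):
    """Tags that occur more often in browser_html than in fetched_html."""
    # Counter subtraction keeps only the positive differences.
    return count_tags(browser_html) - count_tags(fetched_html)


# Placeholder strings; in practice these would be read from the two files.
fetched = "<html><body><p>hi</p></body></html>"
browser = ("<html><body><p>hi</p><script src='a.js'></script>"
           "<form></form></body></html>")
print(dict(missing_tags(fetched, browser)))
```

With real files, the result would list which tags (for example script or form) appear in the bookmarklet output but not in the Python output.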
Thank you
EDITED
As an example of my problem, take http://phpjunkyard.com/tutorials/cut-paste-code.php (a random site). Open it in Firefox, right-click, choose View Page Source, copy the source code and save it in a text file. Then save the page itself with Save Page As. You can see that the two files are not the same: the page saved with Save Page As contains more. The Python output matches View Page Source and lacks some scripts, forms, etc.
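The comparison described above can also be done mechanically. The following is a minimal sketch, assuming Python 3 and the standard difflib module; diff_sources is a made-up helper name, the file labels are hypothetical, and the two sample strings are placeholders for the contents of the View Page Source file and the Save Page As file.

```python
import difflib


def diff_sources(view_source_text, saved_page_text):
    """Return unified-diff lines between the two versions of a page."""
    return list(difflib.unified_diff(
        view_source_text.splitlines(),
        saved_page_text.splitlines(),
        fromfile="view_source.txt",   # hypothetical label
        tofile="saved_page.html",     # hypothetical label
        lineterm="",
    ))


# Placeholder contents; in practice, read both files from disk.
view_source = "<html>\n<body>\n<p>hello</p>\n</body>\n</html>"
saved_page = "<html>\n<body>\n<p>hello</p>\n<form></form>\n</body>\n</html>"

for line in diff_sources(view_source, saved_page):
    print(line)
```

Lines starting with "+" exist only in the Save Page As version, which is where the extra scripts and forms would show up.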