Splinter or Selenium: can we get the current html page after clicking the button?

Question

I am trying to crawl the site http://everydayhealth.com. The page content is loaded dynamically: when I click the "More" button, additional news items are shown. However, after I use Splinter to click the button, browser.html does not automatically reflect the updated HTML content. Is there a way to get the latest HTML source using either Splinter or Selenium? My Splinter code looks like this:

    import requests
    from bs4 import BeautifulSoup
    from splinter import Browser

    browser = Browser()
    browser.visit('http://everydayhealth.com')
    browser.click_link_by_text("More")
    print(browser.html)

Based on @Louis's answer, I rewrote the program as follows:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    driver.get("http://www.everydayhealth.com")

    more_xpath = '//a[@class="btn-more"]'
    more_btn = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_xpath(more_xpath))
    more_btn.click()

    more_news_xpath = '(//a[@href="http://www.everydayhealth.com/recipe-rehab/5-herbs-and-spices-to-intensify-flavor.aspx"])[2]'
    WebDriverWait(driver, 5).until(lambda driver: driver.find_element_by_xpath(more_news_xpath))

    print(driver.execute_script("return document.documentElement.outerHTML;"))
    driver.quit()

However, I still cannot find the text from the updated page in the output. For example, when I search the output for "The milk of your friend or enemy?", nothing is found. What is the problem?

python html selenium web-crawler splinter

xjmfel Nov 07 '14 at 20:57

2 answers

Louis · Answer 1 · 2014-11-08T17:24:54+0000

With Selenium, assuming driver is your initialized WebDriver object, this will give you HTML that matches the DOM state at the time of the call:

 driver.execute_script("return document.documentElement.outerHTML;")

The return value is a string, so you could do:

 print(driver.execute_script("return document.documentElement.outerHTML;"))
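If the new items arrive asynchronously after the click, a single snapshot may be taken too early. The following is a minimal sketch, not part of the original answer, that polls the live DOM until a given piece of text shows up. The helper names `snapshot_dom` and `wait_for_text` are hypothetical, and the sketch assumes the content is rendered in the main document rather than inside an iframe.

```python
import time


def snapshot_dom(driver):
    """Return the HTML of the live DOM as a string (same script as above)."""
    return driver.execute_script("return document.documentElement.outerHTML;")


def wait_for_text(driver, text, timeout=10.0, poll=0.5):
    """Poll the live DOM until `text` appears.

    Returns the HTML snapshot that contains `text`, or None if the
    timeout expires first. `driver` is assumed to be any object with an
    `execute_script` method, such as a Selenium WebDriver.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = snapshot_dom(driver)
        if text in html:
            return html
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

With a real WebDriver, this would be called after clicking the button, for example `html = wait_for_text(driver, "some headline from the new items")`. A `None` result suggests the text never reached the main document within the timeout.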

myersjustinc · Answer 2 · 2014-11-08T15:38:28+0000

When I use Selenium for tasks like this, I find that browser.page_source does get updated.
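One way to check this claim is to read `page_source` before and after the click and compare the two strings. The sketch below is an illustration under assumptions: `html_after` is a hypothetical helper, and `driver` is assumed to be a Selenium WebDriver (or any object exposing a `page_source` attribute).

```python
def html_after(driver, action):
    """Run `action(driver)` and report whether the page source changed.

    Returns a tuple (changed, html), where `html` is read from
    `driver.page_source` after the action has run.
    """
    before = driver.page_source
    action(driver)
    # page_source is read again here rather than reused from `before`.
    after = driver.page_source
    return before != after, after
```

Here `action` would be a callable that performs the click, such as `lambda d: more_btn.click()`. If the page loads its new items asynchronously, the second read may still happen before the update lands, so an explicit wait between the click and the read may be needed.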
