I am trying to scrape data from this web page: http://volcano.si.edu/search_eruption.cfm. There are two drop-down menus that ask for data filters. I don’t need filtered data, so I leave them blank and go to the next page by clicking “Search Eruptions”.
However, I noticed that the results table shows only 5 columns instead of the 24 it should have. All 24 columns are available if you click the "Load Results in Excel" button and open the downloaded file. This is what I need.
So it looks like this has turned from a scraping exercise (using httr and rvest) into something more complicated. I am stumped about how to actually “click” the “Load Results in Excel” button using R. I assume that I will have to use RSelenium, but here is my attempt using httr with POST, in case any of you kind people can find an easier way. I also tried gdata, data.table, XML, etc., to no avail, which may simply be the result of user error.
It may also be useful to know that the download button cannot be right-clicked to reveal its URL.
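    library(httr)

    url <- "http://volcano.si.edu/search_eruption_results.cfm"

    # Both filters are left empty so that nothing is filtered out
    searchcriteria <- list(
      eruption_category = "",
      country = ""
    )

    # Pass the list itself (not the string "searchcriteria") as form fields
    mydata <- POST(url, body = searchcriteria, encode = "form")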
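If the request behind the download button can be identified, the same kind of POST could write its response straight to a file instead of keeping it in memory. The following is only a sketch under that assumption: `build_search_body` and `fetch_to_file` are hypothetical helper names, and the URL to pass in is not known from anything above, so the call at the bottom is left commented out.

```r
library(httr)

# Build the form body from the two field names seen in the browser.
# Empty strings mean "no filter".
build_search_body <- function(eruption_category = "", country = "") {
  list(eruption_category = eruption_category, country = country)
}

# Hypothetical helper: POST the form and stream the response body to `path`.
# `download_url` must be whatever endpoint the button really calls.
fetch_to_file <- function(download_url, body, path) {
  resp <- POST(download_url, body = body, encode = "form",
               write_disk(path, overwrite = TRUE))
  stop_for_status(resp)
  # The Content-Type header hints at what actually came back
  # (a spreadsheet, or just another HTML page).
  headers(resp)[["content-type"]]
}

body <- build_search_body()

# Not run: the endpoint below is a placeholder, not a known URL.
# fetch_to_file("<download endpoint>", body, "eruptions.xls")
```

Checking the returned content type before trying to open the file would show whether the server sent a spreadsheet or simply another results page.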
Using the Inspector in my browser, I could see that there are two filters, "eruption_category" and "country". I leave both empty, since I do not need any filtered data.
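The same inspection can be done from R by listing each form's `action`, `method`, and hidden inputs, which is sometimes enough to find the target of a button that cannot be right-clicked. This sketch runs on made-up markup: the action `hypothetical_download.cfm` and the field `hypothetical_query_id` are placeholders, not what the real page contains.

```r
library(rvest)

# Hypothetical markup standing in for the results page.
page <- read_html('<html><body>
  <form action="hypothetical_download.cfm" method="post">
    <input type="hidden" name="hypothetical_query_id" value="123">
    <input type="submit" value="Load Results in Excel">
  </form>
</body></html>')

# Where each form submits to, and how
forms <- html_nodes(page, "form")
form_actions <- html_attr(forms, "action")
form_methods <- html_attr(forms, "method")

# Hidden inputs are sent along with the request, so collect them as name = value
hidden <- html_nodes(page, "form input[type='hidden']")
hidden_fields <- setNames(html_attr(hidden, "value"), html_attr(hidden, "name"))

form_actions
hidden_fields
```

On the real site, `page` would come from reading the results response rather than from a literal string.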
Finally, it seems that the code above takes me to the page with the 5-column table. However, I still could not scrape this table using rvest with the code below (using SelectorGadget to select only one column). In the end, this part does not matter because, as I said above, I need all 24 columns, not just these 5. But if you find any errors in what I did below, I would be grateful.
    library(rvest)

    # Pull the text of every cell with class "td8" from the response
    Eruptions <- mydata %>%
      read_html() %>%
      html_nodes(".td8") %>%
      html_text()
    Eruptions
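For comparison, a class selector returns a flat character vector of cells, whereas `html_table()` parses a whole `<table>` into rows and columns. The sketch below uses an invented three-column table to show the difference. The column names and values are placeholders, and it assumes the results are in an ordinary HTML table, which has not been verified for the real page.

```r
library(rvest)

# Invented table; only the first column carries the "td8" class.
page <- read_html('<html><body><table>
  <tr><th>Volcano</th><th>StartYear</th><th>VEI</th></tr>
  <tr><td class="td8">Hypothetical A</td><td>2001</td><td>2</td></tr>
  <tr><td class="td8">Hypothetical B</td><td>2005</td><td>3</td></tr>
</table></body></html>')

# Class selector: one flat vector containing only the matching cells
one_column <- html_text(html_nodes(page, ".td8"))

# html_table(): the full table, with the <th> row used as column names
results <- html_table(html_node(page, "table"))

one_column
results
```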
Thanks for any help you can provide.