I am trying to scrape a page on a website that requires a login, and I get a 403 error.
I adapted the code from these two posts for my site: "Using rvest or httr to login to non-standard forms on a web page" and "how to reuse a session to avoid re-logging in when scraping with rvest?"
    library(rvest)
    pgsession <- html_session("https://www.optionslam.com/earnings/stocks/MSFT?page=-1")
    pgform <- html_form(pgsession)[[1]]
    filled_form <- set_values(pgform, 'username'='user', 'password'='pass')
    s <- submit_form(pgsession, filled_form)
When the code runs, I get this message:
    Submitting with 'NULL'
    Warning message:
    In request_POST(session, url = url, body = request$values, encode = request$encode, :
      Forbidden (HTTP 403).
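To narrow this down, it may help to see exactly which fields rvest parsed from the form. The "Submitting with 'NULL'" line possibly means the submit button has no name, and a hidden input (for example a token) might have to be sent along with the credentials. Below is a minimal sketch, not a confirmed fix: `describe_form` is a hypothetical helper (not part of rvest), and the demo form with its `token` field is made up so the example runs offline.

```r
library(rvest)

# Hypothetical helper (not part of rvest): list the name and type of every
# field that html_form() parsed, so hidden inputs are easy to spot.
describe_form <- function(form) {
  # Return NA when an attribute is missing instead of failing
  get_or_na <- function(x) {
    if (is.null(x) || length(x) == 0) NA_character_ else as.character(x)[1]
  }
  data.frame(
    name = vapply(form$fields, function(f) get_or_na(f$name),
                  character(1), USE.NAMES = FALSE),
    type = vapply(form$fields, function(f) get_or_na(f$type),
                  character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Offline illustration with a made-up form; the real login form may have
# different fields (the "token" input here is only an assumption).
demo_page <- xml2::read_html('<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="submit" value="Log in">
</form>')

demo_fields <- describe_form(html_form(demo_page)[[1]])
print(demo_fields)
```

On the real site the same helper would be called as `describe_form(html_form(pgsession)[[1]])`. Any hidden field that shows up there is worth comparing against what a browser sends when logging in manually.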
I also ran the code this way, setting the user_agent as RS suggested in the comments, but I get the same error as above.
    library(rvest)
    library(httr)
    uastring <- "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36"
    pgsession <- html_session("https://www.optionslam.com/earnings/stocks/MSFT?page=-1", user_agent(uastring))
    pgform <- html_form(pgsession)[[1]]
    filled_form <- set_values(pgform, 'username'='user', 'password'='pass')
    s <- submit_form(pgsession, filled_form)
If you load the page without logging in, it shows a small data table with the text "Available events: 65" at the bottom right.
After logging in, the page displays all 65 events in a fully populated table, which is what I want to download. I have all the code needed for that; I'm stuck only on the login part.
Thank you for your help.