Logging into websites programmatically is difficult and tightly coupled to how the site implements its login procedure. Your code does not work because it does not deal with any of this in its requests and responses.
Take fif.com, for example. When you enter your username and password, the following POST request is sent:
    POST https://fif.com/login?task=user.login HTTP/1.1
    Host: fif.com
    Connection: keep-alive
    Content-Length: 114
    Cache-Control: max-age=0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Origin: https://fif.com
    User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.103 Safari/537.36
    Content-Type: application/x-www-form-urlencoded
    Referer: https://fif.com/login?return=...==
    Accept-Encoding: gzip,deflate
    Accept-Language: en-US,en;q=0.8
    Cookie: 34f8f7f621b2b411508c0fd39b2adbb2=gnsbq7hcm3c02aa4sb11h5c87f171mh3; __utma=175527093.69718440.1410315941.1410315941.1410315941.1; __utmb=175527093.12.10.1410315941; __utmc=175527093; __utmz=175527093.1410315941.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=175527093.|1=RegisteredUsers=Yes=1

    username=...&password=...&return=aHR0cHM6Ly9maWYuY29tLw%3D%3D&9a9bd5b68a7a9e5c3b06ccd9b946ebf9=1
Pay attention to the cookies (especially the first one, your session token). Also note the cryptic, URL-encoded return value being sent. If the server notices that any of these are missing, it will not let you log in:
HTTP/1.1 400 Bad Request
Or, even worse, you get a 200 response containing the login page with an error message buried somewhere inside.
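The `return` field in the POST body above looks cryptic, but it appears to be nothing more than a Base64 string that was then URL-encoded (the trailing `%3D%3D` is the `==` padding). Here is a minimal sketch for decoding it. The names `ReturnValueDecoder` and `Decode` are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text;

public static class ReturnValueDecoder
{
    // Hypothetical helper: undo the URL-encoding first,
    // then decode the Base64 layer underneath it.
    public static string Decode(string urlEncodedBase64)
    {
        string base64 = Uri.UnescapeDataString(urlEncodedBase64);
        byte[] raw = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(raw);
    }
}
```

Calling `ReturnValueDecoder.Decode("aHR0cHM6Ly9maWYuY29tLw%3D%3D")` gives `https://fif.com/`, so the value is just the page to send you back to after login. Other sites may encode their hidden fields differently, or sign them, so treat this as specific to this one request.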
But let's just pretend that you were able to collect all these magic values and pass them to the HttpWebRequest object. The site would not know the difference, and it might respond with something like this:
    HTTP/1.1 303 See other
    Server: nginx
    Date: Wed, 10 Sep 2014 02:29:09 GMT
    Content-Type: text/html; charset=utf-8
    Transfer-Encoding: chunked
    Connection: keep-alive
    Location: https://fif.com/
Hopefully you were expecting this. If you have made it this far, you can now programmatically fire off requests to the server using your now-validated session token and get the expected HTML back.
    GET https://fif.com/ HTTP/1.1
    Host: fif.com
    Connection: keep-alive
    Cache-Control: max-age=0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.103 Safari/537.36
    Referer: https://fif.com/login?return=aHR0cHM6Ly9maWYuY29tLw==
    Accept-Encoding: gzip,deflate
    Accept-Language: en-US,en;q=0.8
    Cookie: 34f8f7f621b2b411508c0fd39b2adbb2=gnsbq7hcm3c02aa4sb11h5c87f171mh3; __utma=175527093.69718440.1410315941.1410315941.1410315941.1; __utmb=175527093.12.10.1410315941; __utmc=175527093; __utmz=175527093.1410315941.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=175527093.|1=RegisteredUsers=Yes=1
And that is all just for fif.com. This juggling of cookies, tokens, and redirects will be completely different for another site. In my experience (with this site in particular), you have three options for getting through the login wall:
- Write an incredibly complex and fragile script to dance around the site's procedures,
- Manually log in to the site with your browser, grab the magic values, and plug them into your request objects, or
- Create a script that automates Selenium to do this for you.
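As a rough illustration of the second option, here is a sketch that takes a Cookie header copied from the browser's developer tools and turns it into a CookieContainer, and that rebuilds a form body from name/value pairs. `ManualSession`, `ParseCookieHeader`, and `BuildFormBody` are hypothetical helper names, and the sketch assumes every cookie belongs to a single domain with path `/`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class ManualSession
{
    // Hypothetical helper: parses "name=value; name2=value2" into a
    // CookieContainer. Assumes one domain and path "/" for every cookie.
    public static CookieContainer ParseCookieHeader(string header, string domain)
    {
        var container = new CookieContainer();
        foreach (string rawPair in header.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string pair = rawPair.Trim();
            int eq = pair.IndexOf('=');
            if (eq <= 0) continue; // skip fragments without a name
            string name = pair.Substring(0, eq);
            string value = pair.Substring(eq + 1); // value may itself contain '='
            container.Add(new Cookie(name, value, "/", domain));
        }
        return container;
    }

    // Hypothetical helper: joins fields into an
    // application/x-www-form-urlencoded body, percent-encoding each part.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}
```

The container can be assigned to `HttpWebRequest.CookieContainer`, and the body string written to the request stream. Keep in mind that a copied session cookie only works until the server expires it, which is a large part of why this option is tedious to maintain.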
Selenium can handle all the juggling, and at the end you can pull the cookies out and fire off your requests normally. Here is an example for fif:
    using System.IO;
    using System.Net;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;

    // Run Selenium: open the login page and submit the form
    ChromeDriver cd = new ChromeDriver(@"chromedriver_win32");
    cd.Url = @"https://fif.com/login"; // setting Url navigates to the page
    IWebElement e = cd.FindElement(By.Id("username"));
    e.SendKeys("...");
    e = cd.FindElement(By.Id("password"));
    e.SendKeys("...");
    e = cd.FindElement(By.XPath(@"//*[@id=""main""]/div/div/div[2]/table/tbody/tr/td[1]/div/form/fieldset/table/tbody/tr[6]/td/button"));
    e.Click();

    // Get the cookies out of the browser session
    CookieContainer cc = new CookieContainer();
    foreach (OpenQA.Selenium.Cookie c in cd.Manage().Cookies.AllCookies)
    {
        cc.Add(new System.Net.Cookie(c.Name, c.Value, c.Path, c.Domain));
    }

    // Fire off the request with the logged-in cookies
    HttpWebRequest hwr = (HttpWebRequest)WebRequest.Create("https://fif.com/components/com_fif/tools/capacity/values/");
    hwr.CookieContainer = cc;
    hwr.Method = "POST";
    hwr.ContentType = "application/x-www-form-urlencoded";
    using (StreamWriter swr = new StreamWriter(hwr.GetRequestStream()))
    {
        swr.Write("feeds=35");
    }

    // Read the response body as a string
    string s;
    using (WebResponse wr = hwr.GetResponse())
    using (StreamReader sr = new StreamReader(wr.GetResponseStream()))
    {
        s = sr.ReadToEnd();
    }