Reading WordPress RSS using C#

I am trying to read a WordPress-generated RSS feed with full text. In Firefox and IE9, each item in the feed contains a content:encoded element:

 <content:encoded><![CDATA[bla bla bla]]></content:encoded> 

But when I request the same RSS URL from a C# program, this node is missing. I make the request as follows:

    WebClient client = new WebClient();
    client.Encoding = Encoding.UTF8;
    client.Headers.Add("Accept", "application/xml");
    var xml = client.DownloadString(url);

Do I have to add a header to the request in order to get this field?

2 answers

You do not need WebClient to download the RSS feed; XDocument.Load can fetch the URL directly.

    XDocument wp = XDocument.Load("http://wordpress.org/news/feed/");
    XNamespace ns = XNamespace.Get("http://purl.org/rss/1.0/modules/content/");
    foreach (var content in wp.Descendants(ns + "encoded"))
    {
        Console.WriteLine(System.Net.WebUtility.HtmlDecode(content.Value) + "\n\n");
    }

EDIT

The problem is related to compression. If the client does not advertise support for compression, the server does not send the content.

    // Advertise only gzip, because the response is decoded with GZipStream below.
    WebClient web = new WebClient();
    web.Headers["Accept-Encoding"] = "gzip";
    var zip = new System.IO.Compression.GZipStream(
        web.OpenRead("http://www.whiskymag.fr/feed/?post_type=sortir"),
        System.IO.Compression.CompressionMode.Decompress);
    string rss = new StreamReader(zip, Encoding.UTF8).ReadToEnd();
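Instead of wrapping the response stream by hand, the framework can negotiate and undo the compression itself. Below is a minimal sketch, assuming HttpClient is available (.NET Framework 4.5 or later). The names FeedFetcher, CreateHandler and DownloadFeedAsync are illustrative and do not come from the answer above.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class FeedFetcher
{
    // Builds a handler that advertises gzip/deflate support and
    // transparently decompresses the response body.
    public static HttpClientHandler CreateHandler()
    {
        return new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
    }

    // Downloads the feed at the given URL as a string (URL supplied by the caller).
    public static async Task<string> DownloadFeedAsync(string url)
    {
        using (var client = new HttpClient(CreateHandler()))
        {
            return await client.GetStringAsync(url);
        }
    }
}
```

With this approach the Accept-Encoding header does not have to be set manually, and the returned string can be passed to XDocument.Parse.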

I assume WordPress is choosing the "wrong" output format based on your Accept header. The content type used for each feed type is defined in /wp-content/feed.php:

    $types = array(
        'rss'      => 'application/rss+xml',
        'rss2'     => 'application/rss+xml',
        'rss-http' => 'text/xml',
        'atom'     => 'application/atom+xml',
        'rdf'      => 'application/rdf+xml'
    );

So instead of application/xml (as in the question), try accepting application/rss+xml.
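Applied to the request from the question, the suggestion might look like the sketch below. Whether WordPress actually varies its output based on this header is this answer's assumption and is not verified here. RssRequest, CreateClient and Download are illustrative names.

```csharp
using System;
using System.Net;
using System.Text;

public static class RssRequest
{
    // Creates a WebClient that asks for the RSS media type instead of generic XML.
    public static WebClient CreateClient()
    {
        var client = new WebClient();
        client.Encoding = Encoding.UTF8;
        client.Headers.Add("Accept", "application/rss+xml");
        return client;
    }

    // Downloads the feed at the given URL (supplied by the caller) as a string.
    public static string Download(string url)
    {
        using (var client = CreateClient())
        {
            return client.DownloadString(url);
        }
    }
}
```

If the content:encoded element is still missing with this header, the Accept header is probably not the cause.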


Source: https://habr.com/ru/post/1397090/

