CDATA section parsing

I am trying to read an RSS feed using PowerShell, and I cannot extract the CDATA section from the feed.

Here is a feed snippet (with a few paragraphs abbreviated to save space):

    <item rdf:about="http://philadelphia.craigslist.org/ctd/blahblah.html">
      <title> <![CDATA[2006 BMW 650I,BLACK/BLACK/SPORT/AUTO ]]> </title>
      ...
      <dc:title> <![CDATA[2006 BMW 650I,BLACK/BLACK/SPORT/AUTO ]]> </dc:title>
      <dc:type>text</dc:type>
      <dcterms:issued>2011-11-28T22:15:55-05:00</dcterms:issued>
    </item>

And the PowerShell script:

    $rssFeed = [xml](New-Object System.Net.WebClient).DownloadString('http://philadelphia.craigslist.org/sss/index.rss')
    foreach ($item in $rssFeed.rdf.item) {
        $item.title
    }

Which produces this:

    #cdata-section
    --------------
    2006 BMW 650I,BLACK/BLACK/SPORT/AUTO
    2006 BMW 650I,BLACK/BLACK/SPORT/AUTO

How do I extract the CDATA section?

I tried several options, such as $item.title."#cdata-section" and $item.title.InnerText, which return nothing. I also tried $item.title | gm, and I see #cdata-section listed as a property. What am I missing?

Thanks.

1 answer

Since each item has several title elements (the snippet shows both <title> and <dc:title>), the title property itself will be an array, so the following should work:

 $rss.item.title | select -expand "#cdata-section" 

or

 $rss.item.title[0]."#cdata-section" 

depending on what you need.
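To make this concrete in terms of the loop from the question, here is a minimal sketch that parses an inline copy of the snippet instead of downloading the feed. The namespace declarations on the root element are an assumption added only so the fragment parses on its own (the question does not show the feed's actual root element), and $xmlText and $titles are illustrative names. Wrapping $item.title in @() keeps the indexing safe whether the property comes back as one element or several.

```powershell
# Inline sample modeled on the snippet in the question. The xmlns declarations
# are assumptions so that the fragment is well-formed by itself.
$xmlText = @'
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://philadelphia.craigslist.org/ctd/blahblah.html">
    <title><![CDATA[2006 BMW 650I,BLACK/BLACK/SPORT/AUTO ]]></title>
    <dc:title><![CDATA[2006 BMW 650I,BLACK/BLACK/SPORT/AUTO ]]></dc:title>
    <dc:type>text</dc:type>
  </item>
</rdf:RDF>
'@
$rssFeed = [xml]$xmlText

$titles = foreach ($item in $rssFeed.rdf.item) {
    # <title> and <dc:title> both surface as "title", so take the first
    # element and read its CDATA text; Trim() drops the trailing space.
    @($item.title)[0].'#cdata-section'.Trim()
}
$titles
```

If every title variant is wanted rather than just the first, piping $item.title to select -expand "#cdata-section" inside the same loop should return one string per element.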


Source: https://habr.com/ru/post/1383555/

