2012-04-10 71 views
2

Get IMG SRC from XML CDATA

I'm new to C# and Windows Phone development, so please forgive me if I've missed something obvious.

I want to display a thumbnail from the RSS XML feed at http://blog.dota2.com/feed/. The image sits inside a CDATA section whose content is HTML. Here is the XML:

<content:encoded> 
<![CDATA[ 
<p>We celebrate Happy Bear Pun Week a day earlier as Lone Druid joins Dota 2&#8242;s cast of heroes.</p> <p><a href="http://media.steampowered.com/apps/dota2/posts/LoneDruid_full.jpg "><img class="alignnone" title="The irony is that he's allergic to fur." src="http://media.steampowered.com/apps/dota2/posts/LoneDruid_small.jpg" alt="The irony is that he's allergic to fur." width="551" height="223" /></a></p> <p>Community things:</p> <ul> <li><a href="http://www.itsgosu.com/game/dota2/articles/ig-monthly-madness-invitational-finals-mar-29_407" target="_blank">It&#8217;s Gosu&#8217;s Monthly Madness</a> tournament finals are tomorrow, March 29th. You don&#8217;t want to miss this, we hear it could be more than we can bear.</li> <li>Bear witness to <a href="http://www.team-dignitas.net/articles/blogs/DotA/1092/Dota-2-Ultimate-Guide-to-Warding/" target="_blank">Team Dignitas&#8217; Ultimate Guide to Warding</a>. This should be required teaching in clawsrooms across the globe.</li> <li>Great Explorer Nullf has <a href="http://nullf.deviantart.com/#/d4ubxiu" target="_blank">compiled the eating habits</a> of the legendary Tidehunter in one handy chart. This might give you paws before deciding to head to the beach.</li> </ul> <p>Bear in mind that there will not be an update next week as we will be hibernating during that time.</p> <p>Today&#8217;s bearlog is available <a href="http://store.steampowered.com/news/7662" target="_blank">here</a>.</p> <p>&nbsp;</p> <p>Bear.</p> 
]]> 
</content:encoded> 

All I need is <img src="http://media.steampowered.com/apps/dota2/posts/LoneDruid_small.jpg" /> so that I can use the URL to display the image in my reader app.

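I've heard people say not to use regular expressions because parsing HTML with them is bad practice. I'm building this as a proof of concept, so I don't need to worry about that. I'm looking for the quickest way to get this image URL and then use it in my app.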

Can anyone help? Thanks in advance, Tom

+0

Just parse the XML, which gives you back the HTML. It's as simple as that! – scibuff 2012-04-10 13:37:32

+0

Any suggestions for the best way to do this? :) – 2012-04-10 13:38:36

+0

If you insist on a regular expression, try this - http://stackoverflow.com/questions/4257359/regular-expression-to-get-the-src-of-images-in-c-sharp – scibuff 2012-04-10 13:41:17

Answers

1

Assuming your XML looks like this (which I'm sure it doesn't), and using these extensions: http://searisen.com/xmllib/extensions.wiki

<?xml version="1.0" encoding="utf-8"?> 
<root xmlns:content='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'> 
    <content:encoded> 
    <![CDATA[ 
<p>We celebrate ...</p> 
<p> 
    <a href="http://media.steampowered.com/apps/dota2/posts/LoneDruid_full.jpg "> 
    <img class="alignnone" title="The irony is that he's allergic to fur." 
     src="http://media.steampowered.com/apps/dota2/posts/LoneDruid_small.jpg" /> 
    </a> 
</p> 
<p>the rest removed</p> 
]]> 
    </content:encoded> 
</root> 

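This gets the image source from the second paragraph. It is hard-coded and ugly, but from what you said it is all you need. You will have to specify the path as path/to/content:encoded, and if the element is in a group (aka an array) it gets more complicated. From my code you can see how to separate out the array (see ):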

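// Get(...) and GetElements(...) come from the extension library linked above.
XElement root = XElement.Load(file); // or XElement.Parse(string)
// &nbsp; is not a valid XML entity, so replace it before parsing the HTML as XML.
string html = root.Get("content:encoded", string.Empty).Replace("&nbsp;", " ");
XElement xdata = XElement.Parse(string.Format("<root>{0}</root>", html));
XElement[] paras = xdata.GetElements("p").ToArray();
// The image sits in the second paragraph (index 1).
string src = paras[1].Get("a/img/src", string.Empty);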

P.S. This works because the HTML is well-formed. If it isn't, you will have to use HtmlAgilityPack, as the other answer suggests. You can use the output from Get("content:encoded" ...)
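
For anyone who would rather not pull in the extension library, here is a minimal sketch of the same idea using only System.Xml.Linq. It assumes the feed binds the `content` prefix to the usual RSS content-module namespace (check the `<rss>` element of the real feed) and that the HTML inside the CDATA is well-formed. The class and method names (`FeedThumbnails`, `FirstImageSources`, `FirstImageSource`) are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FeedThumbnails
{
    // Assumption: the feed declares xmlns:content with this URI.
    private static readonly XNamespace Content =
        "http://purl.org/rss/1.0/modules/content/";

    // Returns one entry per <item>: its first <img src>, or null if none.
    public static string[] FirstImageSources(string rssXml)
    {
        XDocument feed = XDocument.Parse(rssXml);
        return feed.Descendants("item")
            // The cast yields null when an item has no content:encoded element.
            .Select(item => (string)item.Element(Content + "encoded"))
            .Select(html => FirstImageSource(html))
            .ToArray();
    }

    // Parses an HTML fragment as XML. This throws XmlException on markup
    // that is not well-formed or that uses other named HTML entities.
    public static string FirstImageSource(string html)
    {
        if (string.IsNullOrEmpty(html)) return null;
        // &nbsp; is not a predefined XML entity, so swap in the numeric form.
        string xhtml = "<root>" + html.Replace("&nbsp;", "&#160;") + "</root>";
        XElement root = XElement.Parse(xhtml);
        return root.Descendants("img")
            .Select(img => (string)img.Attribute("src"))
            .FirstOrDefault(src => src != null);
    }
}
```

Using Descendants("img") instead of a fixed paragraph index means the lookup does not break when a post puts the image in a different paragraph.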

1

You can try this using HtmlAgilityPack

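// Requires the HtmlAgilityPack library and "using System.Linq;".
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(yourstring);
var imgLinks = doc.DocumentNode
    .Descendants("img")
    .Where(n => n.Attributes["src"] != null) // skip <img> tags without a src
    .Select(n => n.Attributes["src"].Value)
    .ToArray();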
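
0

// Requires: using System.Text.RegularExpressions;
// A named group is written (?<name>...), not (<?name>...).
const string pattern = @"<img.+?src.*?\=.*?""(?<URL>.*?)""";
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
var match = regex.Match(myCDataText);
var url = match.Groups["URL"].Value; // empty string when nothing matched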
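
If the CDATA may contain more than one image, the same regex approach can be extended to collect every match. The following is only a sketch under a few assumptions: the tighter pattern is my own variation rather than the one above, it accepts single or double quotes, and the names `ImageSrcRegex` and `FindAll` are hypothetical. It remains a regex over HTML, so it will miss unquoted attributes and can be fooled by unusual markup.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageSrcRegex
{
    // Matches src="..." or src='...' inside an <img ...> tag.
    // The quote is captured in group q so the closing quote must match it.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\bsrc\s*=\s*(?<q>['""])(?<url>.*?)\k<q>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns every src value found, in document order.
    public static List<string> FindAll(string html)
    {
        var urls = new List<string>();
        foreach (Match m in ImgSrc.Matches(html ?? string.Empty))
        {
            urls.Add(m.Groups["url"].Value);
        }
        return urls;
    }
}
```

For the thumbnail in the question, the first element of the list returned by FindAll(myCDataText) would be the one to display, provided the list is not empty.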