2011-04-07 52 views

How do I ignore extra HTML tags when parsing RSS XML in Objective-C/Xcode?

I am programming for iPhone in Objective-C, and I want only the following text from the description tag:

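Neither the government nor private sector in Nepal has off-site backup of data and applications at a distance that can be safe after a disaster at one location. Office of the Controller of Certification chief Rajan Raj Panta warned that as the ...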

<description> 
<table border="0" cellpadding="2" cellspacing="7" style="vertical-align:top;"> 
<tr> 
<td width="80" align="center" valign="top"> 
<font style="font-size:85%;font-family:arial,sans-serif"></font></td> 
<td valign="top" class="j"> 
<font style="font-size:85%;font-family:arial,sans-serif"> 
<br /> 
<div style="padding-top:0.8em;"> 
<img alt="" height="1" width="1" /></div> 
<div class="lh"> 
<a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNG5gNh3aGY3uxIlUjnsJ_C4ugrnrg&amp;url=http://www.thehimalayantimes.com/fullNews.php?headline%3DJapan%2Bquake%2Ba%2Bwake-up%2Bcall%2Bfor%2BNepal%2BIT%2Bsector%26NewsID%3D280789"> 
<b>Japan quake a wake-up call for 
<b>Nepal</b> IT sector</b></a> 
<br /> 
<font size="-1"> 
<b> 
<font color="#6f6f6f">Himalayan Times</font></b></font> 
<br /> 
<font size="-1">Neither the government nor private sector in 
<b>Nepal</b> has off-site backup of data and applications at a distance that can be safe after a disaster at one 
<b>location</b>. Office of the Controller of Certification chief Rajan Raj Panta warned that as the 
<b>...</b></font> 
<br /> 
<font size="-1" class="p"></font> 
<br /> 
<font class="p" size="-1"> 
<a class="p" href="http://news.google.com/news/more?pz=1&amp;ned=uk&amp;ncl=dxKbHaltcQfMZ4M"> 
<nobr> 
<b></b></nobr></a></font></div></font></td></tr></table> 
</description> 

Please help: how do I ignore all of those unwanted HTML tags and text?

I am actually using the Google News search RSS feed, like this: http://news.google.com/news?q=location:london&output=rss. Is there any other way to get location-based RSS news?


How are you parsing it? – 2011-04-07 12:45:19


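I am using NSXMLParser and retrieving content by tag (title, description), but I don't know how to skip the HTML tags inside the description tag. I am getting the news from the Google News search RSS feed, for example http://news.google.com/news?q=location:london&output=rss – Himalay 2011-04-07 12:54:37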

Answers

1

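So you've done one parse of the original XML, which gives you everything inside the description tag as text (it is escaped in the original, so the first parse won't look that deep), but they are sending HTML-formatted content in the RSS feed and you want plain text? Would it be acceptable to extract, say, all the text inside font tags with size -1? If so, something like this would probably be enough: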

// relevant class members are: 
BOOL acceptText; 
NSMutableString *totalText; 

// when a new element starts, check if it's a 'font' tag, and if so, 
// decide whether to accept subsequent text based on its size 
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict 
{ 
    if([elementName isEqualToString:@"font"])
    {
        acceptText = [[attributeDict objectForKey:@"size"] intValue] == -1;
    }
} 

// upon receiving new characters, copy them into the string only if 
// that's what we're doing right now 
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string 
{ 
    if(acceptText)
        [totalText appendString:string];
} 
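
A minimal sketch of how the handlers above could be driven for the second pass. It assumes ARC, and it assumes the unescaped description text happens to be well-formed XML (HTML-only entities such as &nbsp; would make NSXMLParser fail). The class name DescriptionTextExtractor and the method textFromDescription: are hypothetical, not part of the original answer:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper class (name is an assumption, not from the answer).
@interface DescriptionTextExtractor : NSObject <NSXMLParserDelegate>
- (NSString *)textFromDescription:(NSString *)html;
@end

@implementation DescriptionTextExtractor {
    BOOL acceptText;
    NSMutableString *totalText;
}

// Runs a second NSXMLParser over the already-unescaped description text.
// Returns the collected text, or nil if the fragment is not well-formed XML.
- (NSString *)textFromDescription:(NSString *)html
{
    acceptText = NO;
    totalText = [NSMutableString string];

    // Wrap the fragment in one root element so it forms a single document.
    NSString *wrapped = [NSString stringWithFormat:@"<root>%@</root>", html];
    NSData *data = [wrapped dataUsingEncoding:NSUTF8StringEncoding];

    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    parser.delegate = self;
    if (![parser parse]) {
        return nil;
    }

    NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    return [totalText stringByTrimmingCharactersInSet:ws];
}

// Same rule as in the answer: only text following a font tag with size -1
// is accepted; any other font tag switches collection off again.
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"font"]) {
        acceptText = [[attributeDict objectForKey:@"size"] intValue] == -1;
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    if (acceptText) {
        [totalText appendString:string];
    }
}
@end
```

The outer feed parser could call textFromDescription: once it has accumulated the full text of a description element, and fall back to the raw string when the result is nil.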

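This is a bit of a dirty fix, best thought of as screen scraping. All it takes is for them to change their HTML layout and your scraper will break.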
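
If the layout dependence is the main worry, one alternative that is not part of the original answer is to drop every tag and keep all of the remaining text. A minimal sketch using NSScanner, assuming ARC; the function name StripTags is hypothetical. Note that it returns the headline and source name along with the summary, does not decode entities such as &amp;, and is not a real HTML parser:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: removes anything between '<' and '>' and keeps the rest.
static NSString *StripTags(NSString *html)
{
    NSMutableString *result = [NSMutableString string];
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil;   // preserve whitespace between words

    while (![scanner isAtEnd]) {
        NSString *text = nil;

        // Copy everything up to the next tag (or to the end of the string).
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }

        // Skip the tag itself, including its closing '>'.
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }
    return result;
}
```

This trades precision for robustness: it keeps working when the markup changes, but it cannot tell the summary apart from the other text in the description.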