How do I implement this method in my NSXMLParser to extract images?

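I'm new to iOS development. I have implemented an NSXMLParser, but I don't know how to handle tags that share the same name but carry different content, such as <description>. In some feeds this tag contains only the summary; in others it also contains an "img src", which I want to extract as well (with or without CDATA).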

Examples of description tags from which I need to grab the images and then pass them to my UIImageView:

<description><![CDATA[ <p>Roger Craig Smith and Troy Baker to play Batman and the Joker respectively in upcoming action game; Deathstroke confirmed as playable character. </p><p><img src="http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg" 

<description>&lt;img src=&quot;http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg&quot; width=&quot;70&quot; height=&quot;92&quot; hspace=&quot;3&quot; alt=&quot;&quot; border=&quot;0&quot; align=left style="background:#333333;padding:0px;margin:0px 4px 0px 0px;border-style:solid;border-color:#aaaaaa;border-width:1px" /&gt; &lt;p&gt;

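I think @Rob's example solves my case, but I don't know how to incorporate it into my NSXMLParser, shown below, so that the data and the images are separated. With this parser I can only grab the data (the summary).

My NSXMLParser: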

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    element = [elementName copy];

    // A new <item> begins: reset the per-item buffers.
    if ([elementName isEqualToString:@"item"])
    {
        elements = [[NSMutableDictionary alloc] init];
        title = [[NSMutableString alloc] init];
        date = [[NSMutableString alloc] init];
        summary = [[NSMutableString alloc] init];
        link = [[NSMutableString alloc] init];
        img = [[NSMutableString alloc] init];
        imageLink = [[NSMutableString alloc] init];
    }

    // These elements carry the image URL in their "url" attribute.
    // mutableCopy keeps imageLink mutable, so a later appendString: does not crash.
    if ([elementName isEqualToString:@"media:thumbnail"]) {
        NSLog(@"thumbnails media:thumbnail: %@", attributeDict);
        imageLink = [[attributeDict objectForKey:@"url"] mutableCopy];
    }

    if ([elementName isEqualToString:@"media:content"]) {
        NSLog(@"thumbnails media:content: %@", attributeDict);
        imageLink = [[attributeDict objectForKey:@"url"] mutableCopy];
    }

    if ([elementName isEqualToString:@"enclosure"]) {
        NSLog(@"thumbnails Enclosure %@", attributeDict);
        imageLink = [[attributeDict objectForKey:@"url"] mutableCopy];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // Append the text to the buffer that matches the element currently being parsed.
    if ([element isEqualToString:@"title"])
    {
        [title appendString:string];
    }
    else if ([element isEqualToString:@"pubDate"])
    {
        [date appendString:string];
    }
    else if ([element isEqualToString:@"description"])
    {
        [summary appendString:string];
    }
    else if ([element isEqualToString:@"media:description"])
    {
        [summary appendString:string];
    }
    else if ([element isEqualToString:@"link"])
    {
        [link appendString:string];
    }
    else if ([element isEqualToString:@"url"])
    {
        [imageLink appendString:string];
    }
    else if ([element isEqualToString:@"src"])
    {
        [imageLink appendString:string];
    }
    else if ([element isEqualToString:@"content:encoded"])
    {
        // Look for an <img> tag inside the HTML of this chunk.
        NSString *imgString = [self getImage:string];
        if (imgString != nil) {
            [img appendString:imgString];
            NSLog(@"Content of img:%@", img);
        }
    }
}

// Returns the src value of the first <img> tag in htmlString, or nil if there is none.
- (NSString *)getImage:(NSString *)htmlString {
    NSString *url = nil;

    NSScanner *theScanner = [NSScanner scannerWithString:htmlString];

    [theScanner scanUpToString:@"<img" intoString:nil];
    if (![theScanner isAtEnd]) {
        [theScanner scanUpToString:@"src" intoString:nil];
        NSCharacterSet *charset = [NSCharacterSet characterSetWithCharactersInString:@"\"'"];
        [theScanner scanUpToCharactersFromSet:charset intoString:nil];
        [theScanner scanCharactersFromSet:charset intoString:nil];
        [theScanner scanUpToCharactersFromSet:charset intoString:&url];
    }
    return url;
}

@end

Source

2013-05-20 Edward

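In your example you have two description elements, each with an img tag embedded in it. Simply parse description as you normally would, then pull out the img tags (using a regular expression, as in my retrieveImageSourceTagsViaRegex below, or a scanner).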

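Note that you don't have to handle the CDATA versus non-CDATA conversion yourself unless you want to. Although NSXMLParserDelegate provides a foundCDATA routine, I actually tend not to implement it. When foundCDATA is absent, NSXMLParser's standard foundCharacters routine handles your description tags gracefully, with or without CDATA.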
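To make that point concrete, here is a minimal sketch of a delegate that implements only foundCharacters and never foundCDATA. The class name DescriptionCollector and the function CollectDescriptions are hypothetical names made up for this illustration, and the code assumes ARC. Under those assumptions, a CDATA-wrapped description and an entity-escaped one should come out as the same string:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper class (not from the original post): collects the text of
// every <description> element. It deliberately does NOT implement
// parser:foundCDATA:, so CDATA content arrives through parser:foundCharacters:.
@interface DescriptionCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *descriptions;
@property (nonatomic, strong) NSMutableString *buffer;
@end

@implementation DescriptionCollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    // Start a fresh buffer for each <description>.
    if ([elementName isEqualToString:@"description"])
        self.buffer = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // buffer is nil outside <description>, which makes this a no-op there.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"description"]) {
        [self.descriptions addObject:[self.buffer copy]];
        self.buffer = nil;
    }
}

@end

// Hypothetical convenience function: parses an XML string and returns the
// text of each <description> element, in document order.
static NSArray *CollectDescriptions(NSString *xml)
{
    DescriptionCollector *collector = [[DescriptionCollector alloc] init];
    collector.descriptions = [NSMutableArray array];

    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    [parser parse];

    return collector.descriptions;
}
```

Because the delegate only ever sees plain characters, whatever extracts the image URL afterwards does not need to know which of the two encodings the feed used.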

Consider the following hypothetical XML:

<xml> 
    <descriptions> 
     <description><![CDATA[ <p>Roger Craig Smith and Troy Baker to play Batman and the Joker respectively in upcoming action game; Deathstroke confirmed as playable character. </p><p><img src="http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg">]]></description> 
     <description>&lt;img src=&quot;http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg&quot; width=&quot;70&quot; height=&quot;92&quot; hspace=&quot;3&quot; alt=&quot;&quot; border=&quot;0&quot; align=left style="background:#333333;padding:0px;margin:0px 4px 0px 0px;border-style:solid;border-color:#aaaaaa;border-width:1px" /&gt; &lt;p&gt;</description> 
    </descriptions> 
</xml>

The following parser parses both description entries and pulls the image URLs out of them. As you can see, no special CDATA handling is needed:

@interface ViewController () <NSXMLParserDelegate>

// Named currentDescription rather than description so that it does not
// override NSObject's -description method.
@property (nonatomic, strong) NSMutableString *currentDescription;
@property (nonatomic, strong) NSMutableArray *results;

@end

@implementation ViewController

- (void)viewDidLoad
{
    [super viewDidLoad];

    NSURL *filename = [[NSBundle mainBundle] URLForResource:@"test" withExtension:@"xml"];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithContentsOfURL:filename];
    parser.delegate = self;
    [parser parse];

    // full array of dictionary entries
    NSLog(@"results = %@", self.results);
}

// Returns the src value of every <img> tag found in string.
- (NSMutableArray *)retrieveImageSourceTagsViaRegex:(NSString *)string
{
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img\\s[\\s\\S]*?src\\s*?=\\s*?['\"](.*?)['\"][\\s\\S]*?>)+?"
                                                                           options:NSRegularExpressionCaseInsensitive
                                                                             error:&error];

    NSMutableArray *results = [NSMutableArray array];

    [regex enumerateMatchesInString:string
                            options:0
                              range:NSMakeRange(0, [string length])
                         usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
                             // Capture group 2 is the URL between the quotes after src=
                             [results addObject:[string substringWithRange:[result rangeAtIndex:2]]];
                         }];

    return results;
}

#pragma mark - NSXMLParserDelegate

- (void)parserDidStartDocument:(NSXMLParser *)parser
{
    self.results = [NSMutableArray array];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"description"])
        self.currentDescription = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // Receives CDATA content as well, because foundCDATA is not implemented.
    if (self.currentDescription)
        [self.currentDescription appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"description"])
    {
        NSArray *imgTags = [self retrieveImageSourceTagsViaRegex:self.currentDescription];
        NSDictionary *result = @{@"description": self.currentDescription, @"imgs" : imgTags};
        [self.results addObject:result];
        self.currentDescription = nil;
    }
}

@end

This produces the following result (note that there is no CDATA):

results = (
     { 
     description = " <p>Roger Craig Smith and Troy Baker to play Batman and the Joker respectively in upcoming action game; Deathstroke confirmed as playable character. </p><p><img src=\"http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg\">"; 
     imgs =   (
      "http://image.com.com/gamespot/images/2013/139/ArkhamOrigins_29971_thumb.jpg" 
     ); 
    }, 
     { 
     description = "<img src=\"http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg\" width=\"70\" height=\"92\" hspace=\"3\" alt=\"\" border=\"0\" align=left style=\"background:#333333;padding:0px;margin:0px 4px 0px 0px;border-style:solid;border-color:#aaaaaa;border-width:1px\" /> <p>"; 
     imgs =   (
      "http://cdn.gsmarena.com/vv/newsimg/13/05/samsung-galaxy-s4-active-photos/thumb.jpg" 
     ); 
    } 
)

So, bottom line: just parse it like ordinary XML, don't worry about CDATA, and pull the image URLs out with NSScanner or NSRegularExpression.
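For the parser in the question, one way to keep both the summary and the image might look like the sketch below. It is only an illustration under stated assumptions: FirstImageURL and ItemWithSummary are hypothetical helpers, the dictionary keys "summary" and "imageLink" are made up, and the regular expression is a simplified single-match pattern rather than the one used above. The idea is to leave the accumulated summary untouched and to search it for an img tag only when the feed supplied no URL through media:thumbnail, media:content or enclosure.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the src value of the first <img> tag in html,
// or nil when html is empty or contains no such tag.
static NSString *FirstImageURL(NSString *html)
{
    if (html.length == 0) return nil;

    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<img\\s[^>]*?src\\s*=\\s*['\"]([^'\"]*)['\"]"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSTextCheckingResult *match = [regex firstMatchInString:html
                                                    options:0
                                                      range:NSMakeRange(0, html.length)];
    if (match == nil) return nil;

    // Capture group 1 is the URL between the quotes.
    return [html substringWithRange:[match rangeAtIndex:1]];
}

// Hypothetical helper: builds the dictionary for one feed item. The summary is
// stored as is; the image URL is taken from imageLink when the feed provided
// one, otherwise from the first <img> tag inside the summary.
// summary is assumed to be non-nil.
static NSDictionary *ItemWithSummary(NSString *summary, NSString *imageLink)
{
    NSString *image = (imageLink.length > 0) ? imageLink : FirstImageURL(summary);

    NSMutableDictionary *item = [NSMutableDictionary dictionary];
    [item setObject:summary forKey:@"summary"];
    if (image != nil)
        [item setObject:image forKey:@"imageLink"];
    return item;
}
```

In the question's parser this would presumably be called from didEndElement when an item ends, passing the summary and imageLink buffers. That method is not shown in the question, so the exact call site is an assumption.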

Source

2013-05-21 00:12:22 Rob

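I'm sorry I wasn't clear enough. What I meant is that in some XML files the description tag has the image inside CDATA, while in others it does not. The description tag examples above come from different RSS feeds, not from a single XML file containing two description tags. When I implement the foundCDATA method in my NSXMLParser, it apparently overrides my summary and picks up the "img src" image, but I need both. Please see my parser here: [link](https://dl.dropboxusercontent.com/u/1216970/RSSParser.rtf). Thanks, I really appreciate your help. – Edward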

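@Edward You don't have to implement 'foundCDATA'. If you don't, the standard 'foundCharacters' parses it for you automatically, correctly extracting the characters from your 'CDATA' (without the 'CDATA' start and end markers). Especially if you have a mix, sometimes 'CDATA' and sometimes not, just don't implement 'foundCDATA', and 'foundCharacters' will handle it very gracefully. See my implementation: a single XML file in which one 'description' tag has a 'CDATA' and the other doesn't, yet the standard 'foundCharacters' parses both correctly. – Rob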

Let's continue this in chat: http://chat.stackoverflow.com/rooms/30287/chat-with-edward – Rob
