2012-12-12 70 views
1

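Parsing an XML file in Spanish (not UTF-8 formatted)

I need to parse an XML file in Spanish (I have no control over how it is generated). The parsing itself works fine, but the problem appears when the XML file contains special characters, for example: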

Espectáculos

When it is parsed, I get this: áculos

I am using Cocoa's NSXMLParser. Does anyone know how to handle this?

Here is my code:

-(void)getRss 
{ 
    NSString *urlString=@"http://mysite.com/content.xml"; 
    NSURL *url=[NSURL URLWithString:urlString]; 
    NSURLRequest *rssRequest=[NSURLRequest requestWithURL:url]; 
    self.contentConnection=[[NSURLConnection alloc]initWithRequest:rssRequest delegate:self startImmediately:YES]; 


} 




- (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response { 


    self.dataResponse = [NSMutableData data]; 

    NSLog(@"didReceiveResponse"); 

} 


- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data { 
    [_dataResponse appendData:data]; 

    NSLog(@"didReceiveData"); 




} 
- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error { 

    NSLog(@"didFailWithError"); 

} 



- (void)connectionDidFinishLoading:(NSURLConnection *)connection { 

    NSLog(@"connectionDidFinishLoading "); 

    [self parseContent]; 
} 


-(void)parseContent 
{ 
    NSString *response = [[NSString alloc] initWithData:_dataResponse encoding:NSUTF8StringEncoding]; 
    NSLog(@"data received %@", response); 
    NSLog(@"parse content "); 

    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:_dataResponse]; 
    parser.delegate = self; 
    [parser parse]; 


} 


- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string 
{ 
    // Stores the trimmed text of the latest callback, replacing whatever was stored before.
    self.currentNodeContent = [[string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] mutableCopy]; 
} 

- (void) parser:(NSXMLParser *)parser didStartElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict 
{ 
    if ([elementname isEqualToString:@"categoriaNoticias"]) 
    { 
        self.validXML=YES; 
        NSLog(@"es xml valido"); 
    } 
    else 
    { 
        self.validXML=YES; 
    } 
} 

- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName 
{ 
    if (_validXML) { 
        if ([elementname isEqualToString:@"titulo"]) 
        { 
            NSString *string=_currentNodeContent; 
            NSLog(@"titulo %@", string); 
        } 
        if ([elementname isEqualToString:@"link"]) 
        { 
            NSLog(@"link %@", _currentNodeContent); 
        } 
    } 
}

I would appreciate any pointers.

+0

According to the standard, XML *must* be encoded in UTF-8. – mvp

+0

@mvp: "Each external parsed entity in an XML document may use a different encoding for its characters." From http://www.w3.org/TR/REC-xml/#charencoding – ckhan

+0

I understand, but as I said, I have no control over how the XML file is created, so I need to work with what I am given. – Juan

Answers

0

Assuming your XML file is encoded in Latin-1 (ISO-8859-1), you can fix the XML file on the fly:

- (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response { 
    const char* xmlDecl = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\r\n"; 
    self.dataResponse = [NSMutableData data]; 
    [self.dataResponse appendBytes: xmlDecl length: strlen(xmlDecl)]; 
} 

Please check what the actual encoding is and adapt accordingly if necessary.
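
One way to check the encoding before prepending anything is to look at the declaration the document already carries. The following is only a minimal sketch: DeclaredXMLEncodingName is a hypothetical helper name, not part of Cocoa, and it assumes an ASCII-compatible encoding (it would not find a declaration in a UTF-16 document).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Cocoa API): returns the encoding name found in the
// XML declaration at the start of the data, or nil if no encoding is declared.
// Assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1.
static NSString *DeclaredXMLEncodingName(NSData *data) {
    // The declaration sits at the very beginning, so a short prefix is enough.
    NSUInteger prefixLength = MIN(data.length, (NSUInteger)256);
    NSData *prefixData = [data subdataWithRange:NSMakeRange(0, prefixLength)];

    // Latin-1 maps every byte to a character, so this decoding cannot fail
    // on bytes that are invalid in UTF-8.
    NSString *prefix = [[NSString alloc] initWithData:prefixData
                                             encoding:NSISOLatin1StringEncoding];
    if (prefix == nil) {
        return nil;
    }

    // Matches: <?xml ... encoding = "NAME" (single or double quotes).
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<\\?xml\\b[^>]*\\bencoding\\s*=\\s*[\"']([^\"']+)[\"']"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match = [regex firstMatchInString:prefix
                                                    options:0
                                                      range:NSMakeRange(0, prefix.length)];
    if (match == nil) {
        return nil;
    }
    return [prefix substringWithRange:[match rangeAtIndex:1]];
}
```

If the helper returns a name, the document already has a declaration and prepending a second one would make it malformed. A nil result only means that no encoding was declared; the document may still start with a declaration that has no encoding attribute.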

+0

I added the code above, but now I get: <?xml version="1.0" encoding="ISO-8859-1" ?> <?xml version="1.0" encoding="UTF-8"?> and nothing gets parsed. – Juan
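
Since the document already declares UTF-8, the truncated text may not be an encoding problem at all. NSXMLParser is documented to deliver the text of a single element in several parser:foundCharacters: calls, and the code in the question assigns each chunk instead of appending it, so only the last chunk would survive. The sketch below shows the accumulating pattern under that assumption. TituloCollector and TitulosInXMLData are hypothetical names used for illustration only; the element name titulo comes from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate (not the asker's class): collects the text of every <titulo> element.
@interface TituloCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *buffer;
@property (nonatomic, strong) NSMutableArray<NSString *> *titulos;
@end

@implementation TituloCollector

- (instancetype)init {
    self = [super init];
    if (self) {
        _buffer = [NSMutableString string];
        _titulos = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // Start with an empty buffer for each new element.
    [self.buffer setString:@""];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Append: the parser may report one element's text in several pieces.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"titulo"]) {
        // Trim only once the whole text has been collected.
        NSString *trimmed = [self.buffer stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [self.titulos addObject:[trimmed copy]];
    }
    [self.buffer setString:@""];
}

@end

// Hypothetical convenience function: parses the data and returns all <titulo> texts.
static NSArray<NSString *> *TitulosInXMLData(NSData *data) {
    TituloCollector *collector = [[TituloCollector alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    parser.delegate = collector;
    [parser parse];
    return [collector.titulos copy];
}
```

Trimming happens in didEndElement rather than in foundCharacters, so whitespace inside the text is not lost when a chunk boundary falls next to it.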

+0

Then your XML file is not as written in the comments above. – Codo

+0

@Juan Use this to find the content type of the response: '[[allHeaderFields] objectForKey:@"Content-Type"]'. The value will be 'text/html; charset=utf-8' for UTF-8, and so on. – Jano
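
As an alternative to reading the header by hand, NSURLResponse exposes the charset of the Content-Type header through its textEncodingName property. The sketch below maps that name to an NSStringEncoding. StringEncodingForResponse is a hypothetical helper name, and the fallback parameter is an assumption for the case where the server sends no charset or an unknown one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Cocoa API): maps the charset announced by the
// server to an NSStringEncoding, or returns `fallback` if none is usable.
static NSStringEncoding StringEncodingForResponse(NSURLResponse *response,
                                                  NSStringEncoding fallback) {
    // IANA charset name such as "utf-8" or "iso-8859-1"; nil if the server sent none.
    NSString *name = response.textEncodingName;
    if (name == nil) {
        return fallback;
    }

    CFStringEncoding cfEncoding =
        CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)name);
    if (cfEncoding == kCFStringEncodingInvalidId) {
        // Unknown charset name.
        return fallback;
    }
    return CFStringConvertEncodingToNSStringEncoding(cfEncoding);
}
```

It could be called from connection:didReceiveResponse: and the result stored for later use, for example when logging the response as a string in parseContent. Note that the HTTP charset and the encoding declared inside the XML document can disagree, so this value is a hint rather than a guarantee.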