2011-01-13 51 views
1

I have code like this, a function that converts text automatically:

- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock 
{ 
    NSString *someString = [[NSString alloc] initWithData:CDATABlock encoding:NSUTF8StringEncoding]; 


    someString = [ someString stringByReplacingOccurrencesOfString:@"%" withString: @"&" ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#124;" withString: @"|" ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#160;" withString: @" " ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8211;" withString:@"-"]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8212;" withString:@"--"]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8216;" withString:@"'" ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8217;" withString:@"'" ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8218;" withString:@"," ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8220;" withString:@"\"" ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8221;" withString:@"\"" ]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8230;" withString:@"..."]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#38;" withString:@"<"]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#39;" withString:@">"]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8364;" withString:@"€"]; 
    someString = [ someString stringByReplacingOccurrencesOfString:@"&#8594;" withString:@"→"]; 

    if(nil != self.currentItemValue){ 
     [self.currentItemValue appendString:someString]; 
    } 
} 

Is there a function that does this character conversion automatically?

+3

You could also warm things up by giving some answers yourself. – Abizern 2011-01-13 20:08:44

Answers

2

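Rather than hard-coding the replacements like this, there's a better way.

These entities all have the form: &# + decimal digits + ;. The decimal number is the base-10 value of the character's Unicode code point. So you can search for substrings in this format, extract the number, and convert it directly into a character.

Here's one way to do that, using RegexKitLite to find the substrings: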

NSString * source = @"&#38; &#39; &#124; &#160; &#8211; &#8212; &#8216; &#8217; &#8218; &#8220; &#8221; &#8230; &#8364; &#8594;"; 

NSString * regex = @"&#(\\d+);"; 
NSArray * matches = [source arrayOfCaptureComponentsMatchedByRegex:regex]; 

NSMutableString * decodedSource = [source mutableCopy]; 
for (NSArray * match in matches) { 
    NSString * fullMatch = [match objectAtIndex:0]; 
    NSString * decimalCode = [match objectAtIndex:1]; 

    unichar character = (unichar)[decimalCode intValue]; 
    NSString * replacement = [NSString stringWithFormat:@"%C", character]; 

    [decodedSource replaceOccurrencesOfString:fullMatch withString:replacement options:NSLiteralSearch range:NSMakeRange(0, [decodedSource length])]; 
} 

NSLog(@"decoded: %@", decodedSource); 
[decodedSource release]; 

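On my machine, this logs:

decoded: & ' |   – — ‘ ’ ‚ “ ” … € → 

This isn't the most efficient approach (in the worst case it's an O(nm) algorithm, since every match triggers a replacement pass over the whole string), but it's a start. :)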
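If RegexKitLite isn't available, the same idea can be sketched with Foundation's NSRegularExpression instead (a swapped-in technique, not what the code above uses). The helper names `StringFromCodePoint` and `DecodeEntitiesWithRegex` are made up for illustration. Walking the matches from last to first keeps the earlier match ranges valid, so each entity is replaced exactly once, and code points above U+FFFF are written as a surrogate pair rather than being truncated by a `unichar` cast. Treat this as a rough sketch under those assumptions:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: turns a Unicode code point into an NSString,
// or returns nil if the value is not a valid scalar value.
static NSString *StringFromCodePoint(uint32_t code) {
    if (code == 0 || code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) {
        return nil;
    }
    if (code <= 0xFFFF) {
        unichar ch = (unichar)code;
        return [NSString stringWithCharacters:&ch length:1];
    }
    // Supplementary planes need a UTF-16 surrogate pair.
    uint32_t offset = code - 0x10000;
    unichar pair[2];
    pair[0] = (unichar)(0xD800 + (offset >> 10));
    pair[1] = (unichar)(0xDC00 + (offset & 0x3FF));
    return [NSString stringWithCharacters:pair length:2];
}

// Hypothetical helper: decodes decimal entities such as &#8364; in source.
static NSString *DecodeEntitiesWithRegex(NSString *source) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]{1,7});"
                                                  options:0
                                                    error:NULL];
    NSArray *matches = [regex matchesInString:source
                                      options:0
                                        range:NSMakeRange(0, [source length])];
    NSMutableString *decoded = [NSMutableString stringWithString:source];

    // Replace from the end so the ranges of earlier matches stay valid.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *digits = [source substringWithRange:[match rangeAtIndex:1]];
        NSString *replacement = StringFromCodePoint((uint32_t)[digits longLongValue]);
        if (replacement != nil) {
            [decoded replaceCharactersInRange:[match range] withString:replacement];
        }
    }
    return decoded;
}
```

Calling `DecodeEntitiesWithRegex(@"&#38; &#8364;")` should give back the ampersand and the euro sign separated by a space; entities with out-of-range numbers are left untouched.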

2

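Wow, that's pretty bad, and inefficient too. At the very least, switch to an NSMutableString and do the replacements in place.

In any case, you can do this in a single pass, but you'll have to write the code yourself. You can use NSScanner, or a method like -rangeOfString:options:range:, to find each successive entity and then work out its replacement yourself. If you're using an NSMutableString, you can swap the entity for its replacement and keep searching, after adjusting your location (in the NSScanner case) or the search range as appropriate to account for the length difference between the entity and the replacement character.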
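As a rough illustration of that approach, here is a minimal sketch that uses -rangeOfString:options:range: on an NSMutableString and moves the search range forward after each replacement. The function name `DecodeDecimalEntities` is hypothetical, only decimal entities of the form &#NNN; are handled, and anything malformed is simply skipped:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes &#NNN; entities in one left-to-right scan.
static NSString *DecodeDecimalEntities(NSString *input) {
    NSMutableString *result = [NSMutableString stringWithString:input];
    NSUInteger position = 0;

    while (position < [result length]) {
        NSRange searchRange = NSMakeRange(position, [result length] - position);
        NSRange start = [result rangeOfString:@"&#"
                                      options:NSLiteralSearch
                                        range:searchRange];
        if (start.location == NSNotFound) {
            break;
        }

        // Read up to 7 decimal digits following "&#".
        NSUInteger i = NSMaxRange(start);
        uint32_t code = 0;
        NSUInteger digitCount = 0;
        while (i < [result length] && digitCount < 7) {
            unichar c = [result characterAtIndex:i];
            if (c < '0' || c > '9') break;
            code = code * 10 + (uint32_t)(c - '0');
            i++;
            digitCount++;
        }

        BOOL terminated = (i < [result length] && [result characterAtIndex:i] == ';');
        BOOL valid = digitCount > 0 && terminated && code > 0 && code <= 0x10FFFF
                     && !(code >= 0xD800 && code <= 0xDFFF);
        if (!valid) {
            // Not an entity we understand: skip past "&#" and keep looking.
            position = NSMaxRange(start);
            continue;
        }

        // Build the replacement, using a surrogate pair above U+FFFF.
        unichar units[2];
        NSUInteger unitCount = 1;
        if (code <= 0xFFFF) {
            units[0] = (unichar)code;
        } else {
            uint32_t offset = code - 0x10000;
            units[0] = (unichar)(0xD800 + (offset >> 10));
            units[1] = (unichar)(0xDC00 + (offset & 0x3FF));
            unitCount = 2;
        }
        NSString *replacement = [NSString stringWithCharacters:units length:unitCount];

        // Swap the entity for its character, then resume right after it,
        // which accounts for the entity being longer than its replacement.
        NSRange entity = NSMakeRange(start.location, i + 1 - start.location);
        [result replaceCharactersInRange:entity withString:replacement];
        position = start.location + [replacement length];
    }
    return result;
}
```

Because the search resumes after the inserted character, text produced by one replacement is never decoded a second time: `&#38;#39;` becomes `&#39;` and stays that way.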