2010-08-06 160 views
1

如何優化此嵌套for循環?如何優化此嵌套for循環?

程序應該檢查從單詞文本文件創建的數組中的每個單詞,如果它大於8個字符,則將其添加到goodWords數組中。但需要注意的是,我只希望詞根是goodWords數組中,例如:

如果映入眼簾被添加到陣列中,我不想打招呼或問候或迎賓員等

NSString *string = [NSString stringWithContentsOfFile:@"/Users/james/dev/WordParser/word.txt" encoding:NSUTF8StringEncoding error:NULL]; 
    NSArray *words = [string componentsSeparatedByString:@"\r\n"]; 
    NSMutableArray *goodWords = [NSMutableArray array]; 
    BOOL shouldAddToGoodWords = YES; 

    for (NSString *word in words) 
    { 
     NSLog(@"Word: %@", word); 

     if ([word length] > 8) 
     { 
      NSLog(@"Word is greater than 8"); 

      for (NSString *existingWord in [goodWords reverseObjectEnumerator]) 
      { 
       NSLog(@"Existing Word: %@", existingWord); 
       if ([word rangeOfString:existingWord].location != NSNotFound) 
       { 
        NSLog(@"Not adding..."); 
        shouldAddToGoodWords = NO; 
        break; 
       } 
      } 

      if (shouldAddToGoodWords) 
      { 
       NSLog(@"Adding word: %@", word); 
       [goodWords addObject:word]; 
      } 
     } 

     shouldAddToGoodWords = YES; 
    } 

回答

3

怎麼樣的東西升這個嗎?

//load the words from wherever 
NSString * allWords = [NSString stringWithContentsOfFile:@"/usr/share/dict/words"]; 
//create a mutable array of the words 
NSMutableArray * words = [[allWords componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]] mutableCopy]; 
//remove any words that are shorter than 8 characters 
[words filterUsingPredicate:[NSPredicate predicateWithFormat:@"length >= 8"]]; 
//sort the words in ascending order 
[words sortUsingSelector:@selector(caseInsensitiveCompare:)]; 

//create a set of indexes (these will be the non-root words) 
NSMutableIndexSet * badIndexes = [NSMutableIndexSet indexSet]; 
//remember our current root word 
NSString * currentRoot = nil; 
NSUInteger count = [words count]; 
//loop through the words 
for (NSUInteger i = 0; i < count; ++i) { 
    NSString * word = [words objectAtIndex:i]; 
    if (currentRoot == nil) { 
     //base case 
     currentRoot = word; 
    } else if ([word hasPrefix:currentRoot]) { 
     //word is a non-root word. remember this index to remove it later 
     [badIndexes addIndex:i]; 
    } else { 
     //no match. this word is our new root 
     currentRoot = word; 
    } 
} 
//remove the non-root words 
[words removeObjectsAtIndexes:badIndexes]; 
NSLog(@"%@", words); 
[words release]; 

這在我的機器(2.8GHz MBP)上運行速度非常快。

+0

這比我的版本快了50倍,做得很好;) – Jasarien 2010-08-06 20:37:57

+0

@Jasarien你可能想做的不僅僅是'hasPrefix:',因爲'hasPrefix:'是區分大小寫的...... – 2010-08-06 20:54:02

+0

它很好用。整個文件由小寫字母組成,所以它不是問題。 – Jasarien 2010-08-06 23:47:10

2

A Trie似乎適合您的用途。它就像一個散列,對於檢測給定的字符串是否是已經看過的字符串的前綴很有用。

1

我使用了NSSet以確保您一次只添加1個單詞。如果NSSet尚未包含它,它將添加一個單詞。然後它會檢查新單詞是否是已添加的任何單詞的子字符串,如果爲真,則不會添加新單詞。它也是不區分大小寫的。

我寫的是重構你的代碼。這可能沒有那麼快,但如果你想要搜索已經添加到樹中的單詞時,你確實需要一個樹數據結構。

看看RedBlack TreesB-Trees

Words.txt

objective 
objectively 
cappucin 
cappucino 
cappucine 
programme 
programmer 
programmatic 
programmatically 

源代碼

- (void)addRootWords { 

    NSString  *textFile = [[NSBundle mainBundle] pathForResource:@"words" ofType:@"txt"]; 
    NSString  *string = [NSString stringWithContentsOfFile:textFile encoding:NSUTF8StringEncoding error:NULL]; 
    NSArray   *wordFile = [string componentsSeparatedByString:@"\n"]; 
    NSMutableSet *goodWords = [[NSMutableSet alloc] init]; 

    for (NSString *newWord in wordFile) 
    { 
     NSLog(@"Word: %@", newWord); 
     if ([newWord length] > 8) 
     { 
      NSLog(@"Word '%@' contains 8 or more characters", newWord); 
      BOOL shouldAddWord = NO; 
      if ([goodWords containsObject:newWord] == NO) { 
       shouldAddWord = YES; 
      } 
      for (NSString *existingWord in goodWords) 
      { 
       NSRange textRange = [[newWord lowercaseString] rangeOfString:[existingWord lowercaseString]]; 
       if(textRange.location != NSNotFound) { 
        // newWord contains the a substring of existingWord 
        shouldAddWord = NO; 
        break; 
       } 
       NSLog(@"(word:%@) does not contain (substring:%@)", newWord, existingWord); 
       shouldAddWord = YES; 
      } 
      if (shouldAddWord) { 
       NSLog(@"Adding word: %@", newWord); 
       [goodWords addObject:newWord]; 
      } 
     } 
    } 

    NSLog(@"***Added words***"); 
    int count = 1; 
    for (NSString *word in goodWords) { 
     NSLog(@"%d: %@", count, word); 
     count++; 
    } 

    [goodWords release]; 
} 

輸出:

***Added words*** 
1: cappucino 
2: programme 
3: objective 
4: programmatic 
5: cappucine 
+0

當你說「make it work」時,你能解釋一下它有什麼問題嗎?它適用於我,處理170,000個單詞大約需要4分鐘的時間......並且,不要擔心,這不是作業。 – Jasarien 2010-08-06 20:13:03

+0

@Jasarien:當然不用擔心,試試這個。 – 2010-08-06 20:27:31