2014-04-16

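Read sentences separated by newlines and parse

I have tagged sentences separated by a blank line, with two columns per token holding the actual tag and the predicted tag. I want to loop through these tokens and find the wrong predictions, i.e. cases where the actual tag is not equal to the predicted tag.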

#word actual predicted 

James PERSON PERSON 
Washington PERSON LOCATION  
went O O 
home O LOCATION 

He O O 
took O O 
Elsie PERSON PERSON 
along O O 

>James Washington went home: Incorrect 
>He took Elsie along: Correct 

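I know about 'itertools.groupby' but I don't know how to apply it in this case – DevEx

So if any predicted tag is not equal to the actual tag, you want to print 'Incorrect'? –

Exactly. If the actual tag is not equal to the predicted tag, print Incorrect; otherwise print Correct – DevEx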

Answers

In addition to my previous answer, here I use all() and a list comprehension:

from itertools import groupby

d = {True: 'Correct', False: 'Incorrect'}
with open('text1.txt') as f:
    for k, g in groupby(f, key=str.isspace):
        if not k:
            # Split each line in the current group at whitespace
            data = [line.split() for line in g]
            # If for each line the second column is equal to the third then `all()` will
            # return True.
            predicts_matched = all(line[1] == line[2] for line in data)
            print('{}: {}'.format(' '.join(x[0] for x in data), d[predicts_matched]))

Output:

James Washington went home: Incorrect 
He took Elsie along: Correct 
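
The same grouping idea can be wrapped in a function so the verdicts can be reused instead of only printed. This is a minimal sketch, not part of the answer above: the names `label_sentences` and `sample` are hypothetical, and skipping rows that start with `#` is an assumption based on the header row shown in the question. The key uses `not line.strip()` rather than `str.isspace`, because `str.isspace` returns False for an empty string, which is what `splitlines()` produces for a blank line.

```python
from itertools import groupby


def label_sentences(lines):
    """Yield (sentence, verdict) for each blank-line-separated group of rows."""
    # Assumption: rows starting with '#' are header rows and can be ignored.
    rows = (line for line in lines if not line.lstrip().startswith('#'))
    # Consecutive rows are grouped by whether they are blank.
    for is_blank, group in groupby(rows, key=lambda line: not line.strip()):
        if is_blank:
            continue
        data = [line.split() for line in group]
        # A sentence is correct only if every actual tag equals its predicted tag.
        matched = all(actual == predicted for _, actual, predicted in data)
        sentence = ' '.join(word for word, _, _ in data)
        yield sentence, 'Correct' if matched else 'Incorrect'


sample = """#word actual predicted

James PERSON PERSON
Washington PERSON LOCATION
went O O
home O LOCATION

He O O
took O O
Elsie PERSON PERSON
along O O
"""

for sentence, verdict in label_sentences(sample.splitlines()):
    print('{}: {}'.format(sentence, verdict))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of `sample.splitlines()`.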

Thanks a lot, it works great – DevEx

Python strings have powerful parsing methods that can be used here. I did this with Python 3.3, but it should work with other versions as well.

thistext = '''James PERSON PERSON
Washington PERSON LOCATION
went O O
home O LOCATION

He O O
took O O
Elsie PERSON PERSON
along O O
'''

def check_text(text):
    lines = text.split('\n')
    correct = [True]  # a bool wrapped in a list, so the nested function can modify it
    words = []

    def print_result():
        # Print the buffered sentence (if any), then reset the state.
        if words:
            print('{}: {}'.format(' '.join(words), 'Correct' if correct[0] else 'Incorrect'))
        # words.clear() would also work on Python 3.3+
        del words[:]
        correct[0] = True

    for line in lines:
        if line.strip():  # the line is not blank
            word, a, b = line.split()
            if a != b:
                correct[0] = False
            words.append(word)
        else:
            print_result()

    print_result()

check_text(thistext)
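
Since the question also asks to find the wrong predictions themselves, the same line-by-line loop can collect the mismatched tokens for each sentence. This is a minimal sketch under the same three-column assumption; `find_mismatches` and `sample` are hypothetical names, not part of the answer above.

```python
def find_mismatches(text):
    """Return a list of (sentence, errors) pairs, one per sentence.

    errors holds a (word, actual, predicted) tuple for every token
    whose actual tag differs from its predicted tag.
    """
    results = []
    words, errors = [], []
    # A trailing empty line is appended so the last sentence is flushed
    # even when the text does not end with a blank line.
    for line in text.splitlines() + ['']:
        if line.strip():
            word, actual, predicted = line.split()
            words.append(word)
            if actual != predicted:
                errors.append((word, actual, predicted))
        elif words:
            results.append((' '.join(words), errors))
            words, errors = [], []
    return results


sample = """James PERSON PERSON
Washington PERSON LOCATION
went O O
home O LOCATION

He O O
took O O
Elsie PERSON PERSON
along O O
"""

for sentence, errors in find_mismatches(sample):
    print('{}: {}'.format(sentence, 'Incorrect' if errors else 'Correct'))
    for word, actual, predicted in errors:
        print('  {}: actual {}, predicted {}'.format(word, actual, predicted))
```

A sentence is reported as Correct exactly when its error list is empty, so the per-sentence verdict and the per-token details come from the same pass.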

Thank you, this is useful – DevEx
