2015-10-17 48 views
0
# text: the contents of my file
# biterms: the list of bilingual terms
bigrams = []
trans = biterms.split(' > ')
for it in trans[0].split(', '):
    for en in trans[1].split(', '):
        bigrams.append((it, en))

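This code creates the bigrams for all the lines in the file together, but what I need is the bigrams for each line on its own, i.e. the bigrams for each item in the list of bilingual terms (biterms). Can someone help?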

Answers

2

You need to iterate over each line.

biterms = u'''Difensori dei diritti umani, libertà di espressione > Human rights defenders, freedom of expression
sgomberi forzati, violazioni dei diritti umani > forced evictions, human rights violations'''.splitlines()

bigrams = []
for line in biterms:
    pairs = []  # bigrams for this line only
    trans = line.split(' > ')
    left = trans[0].split(', ')   # Italian terms
    right = trans[1].split(', ')  # English terms
    for i in left:
        for j in right:
            pairs.append((i, j))
    bigrams.append(pairs)  # one list of bigrams per line

for g in bigrams:
    print(g)
+0

This gives me the bigrams for all the lines, but I want the format: line: its bigrams; line: its bigrams, etc. – sss

+0

I saw your edited code – sss

+0

Check my edit.. –
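
For the "line: its bigrams" format requested in the comments, something along these lines might work. It is only a sketch: the helper name `bigrams_by_line` is made up for illustration, `itertools.product` is swapped in for the nested loops, and it assumes every line contains exactly one ' > ' separator with terms separated by ', '.

```python
from itertools import product


def bigrams_by_line(lines):
    """Return a list of (line, pairs) tuples, one per input line.

    Hypothetical helper: assumes each line has exactly one ' > ' separator.
    """
    result = []
    for line in lines:
        # Left of ' > ' is the Italian side, right of it is the English side.
        left, right = line.split(' > ')
        # product() yields every (Italian term, English term) combination,
        # in the same order as two nested for loops would.
        pairs = list(product(left.split(', '), right.split(', ')))
        result.append((line, pairs))
    return result


biterms = [
    u'Difensori dei diritti umani, libertà di espressione > Human rights defenders, freedom of expression',
    u'sgomberi forzati, violazioni dei diritti umani > forced evictions, human rights violations',
]

for line, pairs in bigrams_by_line(biterms):
    print(u'{}: {}'.format(line, pairs))
```

Keeping the original line next to its pairs avoids having to match them up again later. A line with no ' > ', or with more than one, would raise a `ValueError` at the unpacking step, so real input may need a check first.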