2016-01-07 30 views
-4

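I have the following 3 sentences, which I found by searching a text and added to a list with sentence.append(). In Python, how do I count the occurrences of each word across a set of strings?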

For example:

sentence[0]=" hello my name is John" 
sentence[1]="good morning I am John" 
sentence[2]= "hello I am Smith" 

I want to assign a score to each sentence in the list sentence, based on how many times each of its words occurs across all 3 sentences.

For example:

Hello score= 2 since it appeared twice **SOLVED** 
sentence[0] score= hello score(which is 2) + my (1) + name (1) + is (1) + John(2) = 7 

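So I counted the occurrences (the score) of each word across the sentences with the code below. My question is: how do I use those counts to compute each sentence's score?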

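import re

word_counts = {}  # word -> number of occurrences across all sentences
for sentenceC in sentence:
    # strip first so a leading space does not produce an empty token
    for word in re.split(r'\s+', sentenceC.strip()):
        try:
            word_counts[word] += 1
        except KeyError:
            word_counts[word] = 1
print(word_counts)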
+0

So... what is the question? – jonrsharpe

+0

Sorry, it was incomplete, but I have finished it now. I want to know how to use the word counts to compute the sentence scores. –

+1

I understand that the score of "hello" is 2, but I don't understand what you want for the "sentence[0] score". –

Answers

0

Break the problem into subtasks:

def getWordScores(sentences):
    # count each word (case-insensitive) across all sentences
    scores = {}
    for sentence in sentences:
        for word in sentence.strip().split():
            word = word.lower()
            scores[word] = scores.get(word, 0) + 1
    return scores

def getSentenceScore(sentence, word_scores):
    # sum the counts of the sentence's words; unknown words count as 0
    return sum(word_scores.get(w.lower(), 0) for w in sentence.strip().split())

Then compose them to solve the task:

word_scores = getWordScores(sentence)
print(word_scores['hello'])
print(getSentenceScore(sentence[0], word_scores))
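
As a rough sketch of the same decomposition, the counting step could also be done with `collections.Counter` from the standard library. The `tokenize` helper and the variable names here are hypothetical, and the sketch assumes words are separated only by whitespace (no punctuation handling):

```python
from collections import Counter

sentences = [
    " hello my name is John",
    "good morning I am John",
    "hello I am Smith",
]

def tokenize(text):
    # Hypothetical helper: lowercase, then split on any run of whitespace.
    return text.lower().split()

# Count every word across all sentences.
word_scores = Counter(word for s in sentences for word in tokenize(s))

# A sentence's score is the sum of the counts of its words.
sentence_scores = [sum(word_scores[w] for w in tokenize(s)) for s in sentences]

print(word_scores["hello"])  # 2
print(sentence_scores)       # [7, 8, 7]
```

A `Counter` returns 0 for missing keys, so scoring a sentence that contains unseen words would not raise a `KeyError`.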
0

You can get your scores like this:

sentence = list()

sentence.append(" hello my name is John")
sentence.append("good morning I am John")
sentence.append("hello I am Smith")

# count how often each word occurs across all sentences (case-insensitive)
value = dict()
for sentenceC in sentence:
    for word in sentenceC.strip().split():  # split on whitespace
        try:
            value[word.lower()] += 1
        except KeyError:
            value[word.lower()] = 1
print(value)

# score each sentence (numbered from 1) as the sum of its words' counts
score = dict()
number = 1
for sentenceC in sentence:
    for word in sentenceC.strip().split():  # split on whitespace
        try:
            score[number] += value[word.lower()]
        except KeyError:
            score[number] = value[word.lower()]
    number += 1

print(score)

# output: {1: 7, 2: 8, 3: 7}
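
The two loops above could probably be shortened. Here is a minimal sketch, assuming the scores should stay keyed by sentence number starting at 1 as in the output above. It uses `dict.get` in place of the try/except and `enumerate` in place of the manual counter:

```python
sentence = [
    " hello my name is John",
    "good morning I am John",
    "hello I am Smith",
]

# Word counts across all sentences, case-insensitive.
value = {}
for text in sentence:
    for word in text.lower().split():
        value[word] = value.get(word, 0) + 1

# One entry per sentence, numbered from 1 like the score dict above.
score = {
    number: sum(value[word] for word in text.lower().split())
    for number, text in enumerate(sentence, start=1)
}

print(score)  # {1: 7, 2: 8, 3: 7}
```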