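Text search using Python

I am working on a text search project and am using TextBlob to search for sentences in a text. TextBlob efficiently pulls out all the sentences that contain the keywords. However, for effective research I also want to pull out one sentence before and one sentence after each match, and I cannot figure out how to do that.

Below is the code I am using: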

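from textblob import TextBlob

def extraxt_sents(Text, word):
    # Comma-separated keywords, e.g. "cat,dog"
    search_words = set(word.split(','))
    # Lower-case the input before splitting it into sentences
    sents = ''.join([s.lower() for s in Text])
    blob = TextBlob(sents)
    # Keep every sentence that shares at least one word with the keywords
    matches = [str(s) for s in blob.sentences if search_words & set(s.words)]
    print(search_words)
    print(matches)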


2014-07-21 Raghav Shaligram

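Are there some indentation errors in your code?

I suggest taking a look at 'nltk' – cengizkrbck

@cengizkrbck TextBlob seems to work better than nltk for me. I just cannot figure out how to get the one sentence before and the one sentence after a match.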

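If you want to get the sentences before and after a match, you can write a loop that remembers the previous sentence, or use slices, like [from:to], on the blob.sentences list.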
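As a rough sketch of the first option (a loop that remembers the previous item), the snippet below works on a plain list of strings rather than on blob.sentences. The function name with_neighbours, the whitespace-based word matching, and the sample sentences are all made up for illustration; they are not part of TextBlob.

```python
def with_neighbours(sentences, keyword):
    """Return (previous, match, next) tuples for sentences containing keyword.

    `sentences` is a list of plain strings (an assumption; TextBlob sentences
    would need str() first). Missing neighbours at the edges are None.
    """
    results = []
    previous = None   # the sentence seen in the previous iteration
    pending = None    # a match that is still waiting for its following sentence
    for sentence in sentences:
        # Complete a match from the previous iteration with the current sentence.
        if pending is not None:
            results.append((pending[0], pending[1], sentence))
            pending = None
        # Simplistic matching: keyword must equal a whitespace-separated token.
        if keyword in sentence.lower().split():
            pending = (previous, sentence)
        previous = sentence
    # A match in the very last sentence has no following sentence.
    if pending is not None:
        results.append((pending[0], pending[1], None))
    return results


if __name__ == "__main__":
    sample = ["it rained", "the cat sat", "then it left"]
    print(with_neighbours(sample, "cat"))
```

This keeps only one sentence of state at a time, which is handy when the sentences come from a stream, but it is noticeably more bookkeeping than indexing into a list.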

The best way is probably to use the built-in enumerate function.

match_region = [list(map(str, blob.sentences[i-1:i+2]))  # from the previous to the next sentence
                for i, s in enumerate(blob.sentences)    # i is the index, s is the sentence
                if search_words & set(s.words)]          # same as your condition

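Here, blob.sentences[i-1:i+2] extracts the sublist spanning from index i-1 (inclusive) to index i+2 (exclusive), and map turns the elements of this list into strings.

Note: you will actually want to use max(0, i-1) instead of i-1; otherwise, for a match in the first sentence, i-1 is -1, which Python interprets as the last element, so the slice usually comes out empty. On the other hand, it is not a problem if i+2 is greater than the length of the list.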
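To make the note concrete, here is a minimal, self-contained sketch of the same windowing idea with the lower bound clamped by max(0, i-1). So that it runs without TextBlob installed, it uses a naive regex sentence splitter as a stand-in for blob.sentences; split_sentences and match_regions are hypothetical names, and the splitter is far cruder than TextBlob's.

```python
import re


def split_sentences(text):
    # Naive stand-in for blob.sentences (an assumption for this sketch):
    # split after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def match_regions(text, words):
    """Return [previous, match, next] sentence windows for each matching sentence.

    `words` is a comma-separated keyword string, as in the question.
    """
    search_words = {w.strip().lower() for w in words.split(",")}
    sentences = split_sentences(text)
    regions = []
    for i, sentence in enumerate(sentences):
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if search_words & tokens:
            # max(0, i - 1) stops the start index from wrapping around to -1;
            # an end index past the list length is simply truncated by slicing.
            regions.append(sentences[max(0, i - 1):i + 2])
    return regions


if __name__ == "__main__":
    text = "First one. The cat sat. Third here. Fourth."
    print(match_regions(text, "cat"))
    print(match_regions(text, "first,fourth"))
```

With TextBlob, the same clamped slice should apply directly to blob.sentences in the list comprehension above.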


2014-07-21 14:33:53
