I came across the following snippet of a script on a gensim tutorial page. What does this "word for word" syntax mean in Python?
What syntax is being used in the Python script below?
texts = [[word for word in document.lower().split() if word not in stoplist]
         for document in documents]
This is a list comprehension. The code you posted loops through every element in document.lower().split() and creates a new list that contains only the elements satisfying the if condition. It does this for each document in documents.
Try it out...
elems = [1, 2, 3, 4]
squares = [e*e for e in elems] # square each element
big = [e for e in elems if e > 2] # keep elements bigger than 2
As you can see from your example, list comprehensions can also be nested.
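For instance, a nested comprehension might look like this (the grid values here are just made up for illustration):
grid = [[1, 2], [3, 4], [5, 6]]
doubled = [[n * 2 for n in row] for row in grid]  # [[2, 4], [6, 8], [10, 12]]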
This is a list comprehension. A simpler example might be:
evens = [num for num in range(100) if num % 2 == 0]
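A quick check in the interpreter shows what it produces:
>>> evens = [num for num in range(100) if num % 2 == 0]
>>> evens[:5]
[0, 2, 4, 6, 8]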
I'm pretty sure I've seen that exact line in some NLP applications.
This list comprehension:
[[word for word in document.lower().split() if word not in stoplist] for document in documents]
is the same as:
ending_list = []  # often known as a document stream in NLP
for document in documents:  # loop through the list of documents
    internal_list = []  # often known as a list of tokens
    for word in document.lower().split():
        if word not in stoplist:
            internal_list.append(word)  # this is where the [[word for word ...] ...] appears
    ending_list.append(internal_list)
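To convince yourself the two versions match, you could run them on some toy data (the documents and stoplist below are made up for illustration):
documents = ["The quick brown fox", "The lazy dog"]
stoplist = {"the", "a", "of"}
texts = [[word for word in document.lower().split() if word not in stoplist]
         for document in documents]
print(texts)  # [['quick', 'brown', 'fox'], ['lazy', 'dog']]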
Basically you want a list of documents, each of which contains a list of tokens. So you loop through the documents,
for document in documents:
then you split each document into tokens,
list_of_tokens = []
for word in document.lower().split():
and then you build a list of those tokens:
list_of_tokens.append(word)
For example:
>>> doc = "This is a foo bar sentence ."
>>> [word for word in doc.lower().split()]
['this', 'is', 'a', 'foo', 'bar', 'sentence', '.']
It's the same as:
>>> doc = "This is a foo bar sentence ."
>>> list_of_tokens = []
>>> for word in doc.lower().split():
...     list_of_tokens.append(word)
...
>>> list_of_tokens
['this', 'is', 'a', 'foo', 'bar', 'sentence', '.']
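Adding the if filter then drops the stopwords, which gets you back to the inner comprehension from your question (the stoplist here is only an example):
>>> stoplist = {'is', 'a', '.'}
>>> [word for word in doc.lower().split() if word not in stoplist]
['this', 'foo', 'bar', 'sentence']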
Thanks, that helped a lot with the explanation... –
Glad the answer helped =) – alvas