2013-05-16 100 views
2

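Python - tokenize, replace words

I'm trying to create something like sentences with random words inserted into them. Specifically, I have something like this: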

"The weather today is [weather_state]." 

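and I'd like to be able to find all the tokens in square brackets [] and then swap each one for a random counterpart from a dictionary or list, leaving me with: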

"The weather today is warm." 
"The weather today is bad." 

"The weather today is mildly suiting for my old bones." 

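Keep in mind that the [bracket] tokens will not always be in the same position, and that there will be multiple bracketed tokens in my string, such as: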

"[person] is feeling really [how] today, so he's not going [where]." 

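I really don't know where to start with this, or whether the tokenize or token modules are even the best solution here. Any hints that point me in the right direction are greatly appreciated!

EDIT: Just to clarify, I don't need to use square brackets; any non-standard characters will do.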

+0

Might be a silly suggestion, but have you looked at string formatting's '{}'s? – akaIDIOT

Answers

4

You're looking for re.sub with a callback function:

words = { 
    'person': ['you', 'me'], 
    'how': ['fine', 'stupid'], 
    'where': ['away', 'out'] 
} 

import re, random 

def random_str(m): 
    return random.choice(words[m.group(1)]) 


text = "[person] is feeling really [how] today, so he's not going [where]." 
print(re.sub(r'\[(.+?)\]', random_str, text)) 

# example output (random): me is feeling really stupid today, so he's not going away. 

Note that, unlike the format method, this allows more sophisticated processing of placeholders, e.g.

[person:upper] got $[amount if amount else 0] etc 

Basically, you can build your own "template engine" on top of this.
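As a rough illustration of that idea, here is a minimal sketch that extends the callback to accept an optional filter, as in [person:upper]. The filters table, the render function and the [name:filter] syntax are assumptions made up for this example, not part of the answer above; the conditional form ($[amount if amount else 0]) is not handled.

```python
import random
import re

# Hypothetical word lists and filter table, for illustration only.
words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out'],
}
filters = {
    'upper': str.upper,
    'title': str.title,
}

def render(template, words):
    """Replace each [name] or [name:filter] with a random word from words[name]."""
    def replace(match):
        name, filter_name = match.group(1), match.group(2)
        if name not in words:
            return match.group(0)  # leave unknown tokens untouched
        word = random.choice(words[name])
        if filter_name is not None:
            word = filters[filter_name](word)  # KeyError on an unknown filter
        return word
    # group 1 is the token name, group 2 the optional filter after ':'
    return re.sub(r'\[(\w+)(?::(\w+))?\]', replace, template)

print(render("[person:upper] is feeling really [how] today.", words))
```

Tokens that are not in the dictionary are left as they are, so a typo in a template shows up in the output instead of raising an exception.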

+0

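This is great, I like how clean and efficient it is. It works, and as a Python beginner, understanding it gives me a head start. :) The smart thing would be to write a dictionary file, save it on disk, and load it into the "words" dictionary here... any pointers on what the dictionary file syntax should look like? Thanks a lot! – bitworks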

+0

@bitworks: the simplest and most convenient option is json: http://docs.python.org/2/library/json.html – georg
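A minimal sketch of that JSON round trip, assuming the word lists live in a file called words.json (a hypothetical name; a temporary directory is used here so the example is self-contained):

```python
import json
import os
import random
import re
import tempfile

words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out'],
}

# Hypothetical file location, created in a temp directory for the demo.
path = os.path.join(tempfile.mkdtemp(), 'words.json')

# Save the dictionary to disk as JSON.
with open(path, 'w') as f:
    json.dump(words, f, indent=4)

# Load it back; the result is a plain dict of lists again.
with open(path) as f:
    loaded = json.load(f)

def random_str(m):
    return random.choice(loaded[m.group(1)])

text = "[person] is feeling really [how] today, so he's not going [where]."
print(re.sub(r'\[(.+?)\]', random_str, text))
```

The file on disk has the same shape as the dictionary literal, except that JSON requires double quotes around strings.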

2

You can use the format method.

>>> a = 'The weather today is {weather_state}.' 
>>> a.format(weather_state = 'awesome') 
'The weather today is awesome.' 
>>> 

Also:

>>> b = '{person} is feeling really {how} today, so he\'s not going {where}.' 
>>> b.format(person = 'Alegen', how = 'wacky', where = 'to work') 
"Alegen is feeling really wacky today, so he's not going to work." 
>>> 

Of course, this approach only works IF you can switch from square brackets to curly ones.

0

If you use braces instead of brackets, your string can be used as a string formatting template. You can then use itertools.product to fill it in with many substitutions:

import itertools as IT 

text = "{person} is feeling really {how} today, so he's not going {where}." 
persons = ['Buster', 'Arthur'] 
hows = ['hungry', 'sleepy'] 
wheres = ['camping', 'biking'] 

for person, how, where in IT.product(persons, hows, wheres): 
    print(text.format(person=person, how=how, where=where)) 

yields

Buster is feeling really hungry today, so he's not going camping. 
Buster is feeling really hungry today, so he's not going biking. 
Buster is feeling really sleepy today, so he's not going camping. 
Buster is feeling really sleepy today, so he's not going biking. 
Arthur is feeling really hungry today, so he's not going camping. 
Arthur is feeling really hungry today, so he's not going biking. 
Arthur is feeling really sleepy today, so he's not going camping. 
Arthur is feeling really sleepy today, so he's not going biking. 

To generate random sentences, you can use random.choice:

import random 

for i in range(5): 
    person = random.choice(persons) 
    how = random.choice(hows) 
    where = random.choice(wheres) 
    print(text.format(person=person, how=how, where=where)) 

If you must use brackets in your format rather than braces, you can replace the brackets with braces and then proceed as above:

text = "[person] is feeling really [how] today, so he's not going [where]." 
text = text.replace('[','{').replace(']','}') 
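One caveat with the replacement above: if the text already contains literal braces, format will treat them as placeholders. A possible workaround, sketched here with a hypothetical helper name (brackets_to_braces), is to double any existing braces before converting the brackets:

```python
def brackets_to_braces(text):
    """Turn [name] tokens into {name} fields, keeping literal braces intact."""
    # Double existing braces first so str.format treats them as literals.
    escaped = text.replace('{', '{{').replace('}', '}}')
    # Then convert the square-bracket tokens into format fields.
    return escaped.replace('[', '{').replace(']', '}')

template = brackets_to_braces("[person] typed {ok} and went [where].")
print(template.format(person='Buster', where='camping'))
```

This assumes square brackets are used only for tokens; a literal [ or ] in the text would still be converted.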
+0

This 'person=person, how=how, where=where' thing could get really stupid if there are hundreds of them. – georg

+0

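I decided to stay away from 'format(**locals())' here because it doesn't make clear how the substitutions are performed. But if you really do have hundreds of variables, 'format(**locals())' is the way to go. – unutbu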
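A middle ground between spelling out every keyword and using format(**locals()) might be to keep the options in a single dictionary and unpack a randomly picked mapping. This is only a sketch; the choices dictionary and the random_sentence helper are names invented for the example:

```python
import random

text = "{person} is feeling really {how} today, so he's not going {where}."

# Hypothetical mapping from placeholder name to candidate words.
choices = {
    'person': ['Buster', 'Arthur'],
    'how': ['hungry', 'sleepy'],
    'where': ['camping', 'biking'],
}

def random_sentence(template, choices):
    """Pick one random option per placeholder and fill in the template."""
    picked = {name: random.choice(options) for name, options in choices.items()}
    return template.format(**picked)

for _ in range(3):
    print(random_sentence(text, choices))
```

Adding another placeholder then only requires a new dictionary entry, and the substitution stays explicit about where the values come from.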