How to parse a string in Python
How can I parse strings made up of N parameters that appear in arbitrary order, such as:
{ UserID : 36875; tabName : QuickAndEasy}
{ RecipeID : 1150; UserID : 36716}
{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}
{ UserID : 36716; tabName : QuickAndEasy}
In the end, I expect output that lists each parameter in its own column.
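For your examples, the regex ([^{}\s:]+)\s*:\s*([^{}\s;]+) works.
Be aware, though, that every match will be a string, so if you want to store 36875
as a number, you need to do some additional processing.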
import re
regex = re.compile(
r"""( # Match and capture in group 1:
[^{}\s:]+ # One or more characters except braces, whitespace or :
) # End of group 1
\s*:\s* # Match a colon, optionally surrounded by whitespace
( # Match and capture in group 2:
[^{}\s;]+ # One or more characters except braces, whitespace or ;
) # End of group 2""",
re.VERBOSE)
Then you can do
>>> dict(regex.findall("{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}"))
{'UserID': '36716', 'isFromLabel': '0', 'searchWord': 'soup', 'type': 'recipe'}
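As a rough sketch of the additional processing mentioned above, the following converts values made up only of decimal digits to int and leaves everything else as a string. The helper name parse_params and the digits-only rule are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Same pattern as above, written on one line.
PAIR_RE = re.compile(r"([^{}\s:]+)\s*:\s*([^{}\s;]+)")


def parse_params(line):
    """Hypothetical helper: parse one line into a dict, converting digit-only values to int."""
    result = {}
    for key, value in PAIR_RE.findall(line):
        # Assumption: a value consisting only of decimal digits is meant to be an integer.
        result[key] = int(value) if value.isdecimal() else value
    return result


print(parse_params("{ RecipeID : 1150; UserID : 36716}"))
```

Negative numbers and floats are deliberately left as strings here; whether they should be converted depends on the data.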
Thanks! I really need to get up and running with regular expressions. Figuring out your code was a great exercise. – mmarboeuf 2014-12-10 03:36:28
lines = "{ UserID : 36875; tabName : QuickAndEasy } ", \
        "{ RecipeID : 1150; UserID : 36716}", \
        "{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}" , \
        "{ UserID : 36716; tabName : QuickAndEasy}"

counter = 0
mappedLines = {}
for line in lines:
    counter = counter + 1
    lineDict = {}
    line = line.replace("{","")
    line = line.replace("}","")
    line = line.strip()
    fieldPairs = line.split(";")
    for pair in fieldPairs:
        fields = pair.split(":")
        key = fields[0].strip()
        value = fields[1].strip()
        lineDict[key] = value
    mappedLines[counter] = lineDict

def printField(key, lineSets, comma_desired = True):
    if key in lineSets:
        print(lineSets[key],end="")
    if comma_desired:
        print(",",end="")
    else:
        print()

for key in range(1,len(mappedLines) + 1):
    lineSets = mappedLines[key]
    printField("UserID",lineSets)
    printField("tabName",lineSets)
    printField("RecipeID",lineSets)
    printField("type",lineSets)
    printField("searchWord",lineSets)
    printField("isFromLabel",lineSets,False)
CSV output:
36875,QuickAndEasy,,,,
36716,,1150,,,
36716,,,recipe,soup,0
36716,QuickAndEasy,,,,
The code above is for Python 3.4. For 2.7, you can replace the function and the last for loop with the following to get similar output:
def printFields(keys, lineSets):
    output_line = ""
    for key in keys:
        if key in lineSets:
            output_line = output_line + lineSets[key] + ","
        else:
            output_line += ","
    print output_line[0:len(output_line) - 1]

fields = ["UserID", "tabName", "RecipeID", "type", "searchWord", "isFromLabel"]
for key in range(1,len(mappedLines) + 1):
    lineSets = mappedLines[key]
    printFields(fields,lineSets)
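As an alternative to printing the commas by hand, the same column layout can probably be produced with the standard library's csv.DictWriter, which fills in missing fields itself. This is only a sketch for Python 3: the function name to_csv, the FIELDS list, and the choice to ignore unknown keys are assumptions, and it reuses the regex from the other answer instead of split().

```python
import csv
import io
import re

PAIR_RE = re.compile(r"([^{}\s:]+)\s*:\s*([^{}\s;]+)")

# Assumed column order, matching the output shown above.
FIELDS = ["UserID", "tabName", "RecipeID", "type", "searchWord", "isFromLabel"]


def to_csv(lines, fields=FIELDS):
    """Hypothetical helper: return the parsed lines as CSV text, one row per input line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        restval="",             # missing parameters become empty columns
        extrasaction="ignore",  # assumption: silently drop keys not listed in fields
        lineterminator="\n",
    )
    for line in lines:
        writer.writerow(dict(PAIR_RE.findall(line)))
    return buffer.getvalue()


sample = [
    "{ UserID : 36875; tabName : QuickAndEasy } ",
    "{ RecipeID : 1150; UserID : 36716}",
    "{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}",
    "{ UserID : 36716; tabName : QuickAndEasy}",
]
print(to_csv(sample), end="")
```

Using the csv module also means values that happen to contain a comma or a quote get quoted correctly, which the manual concatenation does not handle.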
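How far have you got? What problem did you run into? – khelwood 2014-12-04 07:53:19
This should be trivial with a regex, provided you can give more specific rules for the regex to implement. For example, what kinds of characters are allowed in keys and values? Can values contain spaces? If so, will the values be quoted? If so, can such values contain escaped quotes? Etc. – 2014-12-04 07:54:13
Thanks for your reply. Not very far, since I can only extract one selected parameter and none of the others. Keys and values are strings of any characters, up to 15 characters long. There are no other rules. – mmarboeuf 2014-12-04 08:09:25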