How to parse a string in Python

How can I parse strings made up of N parameters that appear in arbitrary order, such as:

{ UserID : 36875; tabName : QuickAndEasy} 
{ RecipeID : 1150; UserID : 36716} 
{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup} 
{ UserID : 36716; tabName : QuickAndEasy}

In the end, I expect output that lists each parameter in its own column.

Source

2014-12-04 mmarboeuf

How far have you got? What problem did you run into? – khelwood 2014-12-04 07:53:19

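This should be trivial with a regex, but to provide one we would need more specific rules. For example, which characters are allowed in keys and values? Can values contain whitespace? If so, will those values be quoted? And if so, can such values contain escaped quotes? Etc... – 2014-12-04 07:54:13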

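Thanks for your reply. Not very far, since I can only extract one selected parameter at a time, excluding the others. Keys and values are strings of any characters, up to 15 characters long. There are no other rules. – mmarboeuf 2014-12-04 08:09:25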

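For your examples, the regex ([^{}\s:]+)\s*:\s*([^{}\s;]+) works. Be aware, though, that all matches will be strings, so if you want to store 36875 as a number, you need to do some extra processing.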

import re 
regex = re.compile(
    r"""(  # Match and capture in group 1: 
    [^{}\s:]+ # One or more characters except braces, whitespace or : 
    )   # End of group 1 
    \s*:\s*  # Match a colon, optionally surrounded by whitespace 
    (   # Match and capture in group 2: 
    [^{}\s;]+ # One or more characters except braces, whitespace or ; 
    )   # End of group 2""", 
    re.VERBOSE)

You can then do:

>>> dict(regex.findall("{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}")) 
{'UserID': '36716', 'isFromLabel': '0', 'searchWord': 'soup', 'type': 'recipe'}

Test it live on regex101.com.
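As noted above, every match comes back as a string. Below is a minimal sketch of the extra processing, assuming that only values made up entirely of decimal digits should become integers. The helper name parse_record is hypothetical and not part of the original answer.

```python
import re

# Same pattern as in the answer above, written on a single line.
PAIR_RE = re.compile(r"([^{}\s:]+)\s*:\s*([^{}\s;]+)")


def parse_record(line):
    """Return a dict of the key/value pairs found in one input line.

    Assumption for this sketch: a value consisting only of decimal
    digits is converted to int; anything else stays a string.
    """
    record = {}
    for key, value in PAIR_RE.findall(line):
        record[key] = int(value) if value.isdecimal() else value
    return record


lines = [
    "{ UserID : 36875; tabName : QuickAndEasy}",
    "{ RecipeID : 1150; UserID : 36716}",
    "{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}",
    "{ UserID : 36716; tabName : QuickAndEasy}",
]

records = [parse_record(line) for line in lines]
for record in records:
    print(record)
```

Whether isFromLabel should end up as the integer 0 or as a boolean depends on how the data is used later; the sketch leaves it as an integer.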

Source

2014-12-04 08:39:23

Thanks! I really need to get up and running with regular expressions. Figuring out your code was a great exercise. – mmarboeuf 2014-12-10 03:36:28

# A parenthesized tuple avoids fragile backslash line continuations.
lines = (
    "{ UserID : 36875; tabName : QuickAndEasy } ",
    "{ RecipeID : 1150; UserID : 36716}",
    "{ isFromLabel : 0; UserID : 36716; type : recipe; searchWord : soup}",
    "{ UserID : 36716; tabName : QuickAndEasy}",
)

counter = 0 

mappedLines = {} 

for line in lines: 
    counter = counter + 1 
    lineDict = {} 
    line = line.replace("{","") 
    line = line.replace("}","") 
    line = line.strip() 
    fieldPairs = line.split(";") 

    for pair in fieldPairs:
        # Split on the first colon only, so a value may itself contain ":".
        fields = pair.split(":", 1)
        key = fields[0].strip()
        value = fields[1].strip()
        lineDict[key] = value

    mappedLines[counter] = lineDict 

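def printField(key, lineSets, comma_desired=True):
    # Print the value if the key is present; otherwise leave the cell empty.
    if key in lineSets:
        print(lineSets[key], end="")
    # Every field except the last is followed by a comma; the last ends the row.
    if comma_desired:
        print(",", end="")
    else:
        print()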

for key in range(1,len(mappedLines) + 1): 
    lineSets = mappedLines[key] 
    printField("UserID",lineSets) 
    printField("tabName",lineSets) 
    printField("RecipeID",lineSets) 
    printField("type",lineSets) 
    printField("searchWord",lineSets) 
    printField("isFromLabel",lineSets,False)

CSV output:

36875,QuickAndEasy,,,,
36716,,1150,,,
36716,,,recipe,soup,0
36716,QuickAndEasy,,,,

The code above is for Python 3.4. In Python 2.7, you can get similar output by replacing the function and the last for loop with:

def printFields(keys, lineSets):
    output_line = ""
    for key in keys:
        if key in lineSets:
            output_line = output_line + lineSets[key] + ","
        else:
            output_line += ","
    # Drop the trailing comma before printing (Python 2 print statement).
    print output_line[:-1]

fields = ["UserID", "tabName", "RecipeID", "type", "searchWord", "isFromLabel"] 

for key in range(1,len(mappedLines) + 1): 
    lineSets = mappedLines[key] 
    printFields(fields,lineSets)
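The manual comma handling above can also be left to the standard library. The following is a minimal Python 3 sketch using csv.DictWriter, which fills in an empty cell for any missing key. The function name to_csv is hypothetical, and the records list is written out by hand here so that the example is self-contained.

```python
import csv
import io

fields = ["UserID", "tabName", "RecipeID", "type", "searchWord", "isFromLabel"]


def to_csv(records, fieldnames):
    """Render a list of dicts as CSV text; missing keys become empty cells."""
    buffer = io.StringIO()
    # restval is written for any field name that a record lacks.
    # Note: by default DictWriter raises ValueError if a record contains
    # a key that is not listed in fieldnames.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


records = [
    {"UserID": "36875", "tabName": "QuickAndEasy"},
    {"RecipeID": "1150", "UserID": "36716"},
    {"isFromLabel": "0", "UserID": "36716", "type": "recipe", "searchWord": "soup"},
    {"UserID": "36716", "tabName": "QuickAndEasy"},
]

print(to_csv(records, fields), end="")
```

Under these assumptions the printed rows match the CSV output shown earlier. Calling writer.writeheader() before the loop would add a header row with the column names.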

Source

2014-12-04 08:40:48 Scooter

Hi, thank you very much for your help. I couldn't figure this out on my own and was getting desperate. – mmarboeuf 2014-12-18 07:22:04

Is the code not working for you? If not, what is the error or the incorrect output? – Scooter 2014-12-18 13:22:27
