Extracting fields from a string in Python

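I have lines of text, each containing several field names and their values, as shown below. If a line has no value for a field, that field does not appear in the line at all. For example: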

First line:
A:30 B: 40 TS:1/1/1990 22:22:22
Second line:
A:30 TS:1/1/1990 22:22:22
Third line:
A:30 B: 40

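It is guaranteed that a single line contains at most three fields, and their names are A, B and TS. While writing a Python script for this, I ran into the following problems:

1) For each line, I have to work out which fields are present and what their values are.
2) The value of the TS field itself contains the delimiter ' ' (a space), so I cannot retrieve the full TS value (1/1/1990 22:22:22).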

The extracted output should look like this:

First line:
A=30
B=40
TS=1/1/1990 22:22:22

Second line:
A=30
TS=1/1/1990 22:22:22

Third line:
A=30
B=40

Please help me solve this.

Source

2010-11-04 james

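That's true, but is that a reason to downvote his question? I think it is perfectly valid. And if you keep pushing his reputation down, he loses site privileges, and we don't want that, do we? :) – 2010-11-04 09:04:52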

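import re

a = ["A:30 B: 40 TS:1/1/1990 22:22:22", "A:30 TS:1/1/1990 22:22:22", "A:30 B: 40"]
# Every field is optional; TS comes last because it captures the rest of the line.
regex = re.compile(r"^\s*(?:(A)\s*:\s*(\d+))?\s*(?:(B)\s*:\s*(\d+))?\s*(?:(TS)\s*:\s*(.*))?$")
for item in a:
    matches = regex.search(item).groups()
    # Groups alternate name, value; keep only the pairs whose name actually matched.
    print({k: v for k, v in zip(matches[::2], matches[1::2]) if k})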

This will output:

{'A': '30', 'B': '40', 'TS': '1/1/1990 22:22:22'} 
{'A': '30', 'TS': '1/1/1990 22:22:22'} 
{'A': '30', 'B': '40'}

Explanation of the regex:

^\s*  # match start of string, optional whitespace 
(?:  # match the following (optionally, see below) 
(A)  # identifier A --> backreference 1 
\s*:\s* # optional whitespace, :, optional whitespace 
(\d+) # any number --> backreference 2 
)?  # end of optional group 
\s*  # optional whitespace 
(?:(B)\s*:\s*(\d+))?\s* # same with identifier B and number --> backrefs 3 and 4 
(?:(TS)\s*:\s*(.*))?  # same with id. TS and anything that follows --> 5 and 6 
$   # end of string
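
For reference, the same pattern can be written with re.VERBOSE so that the comments above live inside the regex itself. This is only a sketch: FIELD_RE and parse_fields are names made up for illustration, and returning an empty dict for a line that does not match is an assumption, not something the answer specifies.

```python
import re

# Same pattern as above; re.VERBOSE ignores whitespace and allows inline comments.
FIELD_RE = re.compile(r"""
    ^\s*                    # start of string, optional whitespace
    (?:(A)\s*:\s*(\d+))?    # optional field A and its number   -> groups 1 and 2
    \s*
    (?:(B)\s*:\s*(\d+))?    # optional field B and its number   -> groups 3 and 4
    \s*
    (?:(TS)\s*:\s*(.*))?    # optional field TS, rest of line   -> groups 5 and 6
    $                       # end of string
""", re.VERBOSE)


def parse_fields(line):
    """Return a dict of the fields present in line (hypothetical helper)."""
    m = FIELD_RE.search(line)
    if m is None:
        # Assumption: a line that does not fit the pattern yields no fields.
        return {}
    groups = m.groups()
    # Groups alternate name, value; drop the pairs whose name did not match.
    return {k: v for k, v in zip(groups[::2], groups[1::2]) if k}


for line in ["A:30 B: 40 TS:1/1/1990 22:22:22", "A:30 TS:1/1/1990 22:22:22", "A:30 B: 40"]:
    for name, value in parse_fields(line).items():
        print("%s=%s" % (name, value))
```

The loop at the end prints one NAME=value line per field, which is the layout asked for in the question.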

Source

2010-11-04 09:03:10

Thanks for your help, it solved my problem. – james 2010-11-04 09:29:08

Continuing the thread above, I have the following line: | | |A: 720897 | N°227: AT circle. I used regex = re.compile(r"\s*(?:(Link Id)\s*:\s*(\d+))\s*|\s*(?:(N°(\D+))\s*:\s*(.*))$"), but it does not give the desired result. Please let me know where I went wrong. I have used # -*- coding: ISO-8859-1 -*- – james 2010-11-09 07:13:27

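@james: Code formatting is hard in comments; could you edit your post and format your new example as code, so I can see more clearly where the problem is? Thanks. – 2010-11-09 08:42:41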

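You can use a single regular expression if the fields always appear in the same order. If you are not sure of the order, you have to match each part separately.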

import re

def parseInput(line):
    # Matches only when A, B and TS are all present, in that order.
    m = re.match(r"A:\s*(\d+)\s*B:\s*(\d+)\s*TS:(.+)", line)
    if m is None:
        return None
    return {"A": m.group(1), "B": m.group(2), "TS": m.group(3)}

print(parseInput("A:30 B: 40 TS:1/1/1990 22:22:22"))

This prints {'A': '30', 'B': '40', 'TS': '1/1/1990 22:22:22'}, which is just a dictionary containing the values.
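
For the other case, where the order is not known, one possible sketch is to search for each field with its own small pattern. The names FIELD_PATTERNS and parse_any_order are hypothetical, and the sketch assumes that the only labels are A, B and TS and that a TS value never contains text that looks like an "A:" or "B:" label.

```python
import re

# Hypothetical helper: one small pattern per field, so the order does not matter.
FIELD_PATTERNS = {
    "A": re.compile(r"\bA\s*:\s*(\d+)"),
    "B": re.compile(r"\bB\s*:\s*(\d+)"),
    # TS runs lazily up to the next "A:" / "B:" label or the end of the line.
    "TS": re.compile(r"\bTS\s*:\s*(.*?)\s*(?=\b[AB]\s*:|$)"),
}


def parse_any_order(line):
    """Return the fields found in line, whatever order they appear in."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        m = pattern.search(line)
        if m:
            result[name] = m.group(1)
    return result


print(parse_any_order("TS:1/1/1990 22:22:22 B: 40 A:30"))
print(parse_any_order("A:30 B: 40"))
```

Fields that are missing from a line are simply left out of the dictionary, so the caller can check which ones are present with an ordinary `in` test.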

P.S. You should accept some answers and get familiar with the site's etiquette; people will be more willing to help you.

Source

2010-11-04 09:01:43 ameer
