Python regular expression: top 10

html_log:jeff 1153.3 1.84 625:54 1 2 71 3 2 10 7:58 499 3 5 616:36 241 36   
html_log:fred 28.7 1.04 27:34 -10 18 13 0:48 37 18 8 -3.63 
html_log:bob 1217.1 1.75 696:48 1 5 38 6 109 61 14:42 633 223 25 435:36 182 34 
... continues

The text file is shown above. Here is my current Python code:

import re

# fo is the already-opened text file shown above
mystats = fo.readlines()
fo.close()

change = str(mystats)

pattern = re.compile("html_log:(?P<name>[^ ]*) (?P<score>[^ ]*)")
mylist = sorted(pattern.findall(change), key=lambda x: float(x[1]), reverse=True)

My output is

bob 1217.1 
jeff 1153.3 
fred 28.7

The problem: I want to get the fifth int value instead, so my output should be

bob 5 
jeff 2 
fred 18

I don't know which pattern would match only the 5th value.

Source

2013-05-31 user2371027

I don't see how "top 10" relates to your question. Are you after the 5th element of the input (as in your example), or the 5th element after sorting?

How about this regex:

html_log:(?P<name>[^ ]*)(?: [^\s]+){4} (?P<score>[^ ]*)

See it tested here.
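As a quick check of that pattern, here is a minimal sketch that runs it over the three sample lines from the question. The helper name `fifth_values` and the `sample` string are only for illustration; converting the captured value with `int` assumes the fifth field is always an integer, as it is in the sample.

```python
import re

# Sample lines copied from the question's text file.
sample = """\
html_log:jeff 1153.3 1.84 625:54 1 2 71 3 2 10 7:58 499 3 5 616:36 241 36
html_log:fred 28.7 1.04 27:34 -10 18 13 0:48 37 18 8 -3.63
html_log:bob 1217.1 1.75 696:48 1 5 38 6 109 61 14:42 633 223 25 435:36 182 34
"""

# Skip four space-separated fields after the name, then capture the fifth.
pattern = re.compile(r"html_log:(?P<name>[^ ]*)(?: [^\s]+){4} (?P<score>[^ ]*)")

def fifth_values(text):
    """Return (name, fifth value) pairs in file order (hypothetical helper)."""
    return [(m.group("name"), int(m.group("score"))) for m in pattern.finditer(text)]

for name, score in fifth_values(sample):
    print(name, score)
```

The named groups make the two captured fields readable at the call site; sorting the pairs afterwards works the same way as in the question's code.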

Source

2013-05-31 10:28:00 Jerry

You don't really need a regex.

s = [line.split() for line in file] 
[(x[0].split(':')[1], float(x[5])) for x in s]

Source

2013-05-31 10:29:53 Elazar

Where do the names "bob" or "jeff" come from?

Look at 'split [0]'. – tripleee

A bit more conventional, but it survives short or empty lines:

import io # Python 3 use StringIO in Python 2 
fobj = io.StringIO(""" 
html_log:jeff 1153.3 1.84 625:54 1 2 71 3 2 10 7:58 499 3 5 616:36 241 36   
html_log:fred 28.7 1.04 27:34 -10 18 13 0:48 37 18 8 -3.63 
html_log:bob 1217.1 1.75 696:48 1 5 38 6 109 61 14:42 633 223 25 435:36 182 34""") 

scores = [] 
for line in fobj: 
    split_line = line.split() 
    try:
        scores.append((int(split_line[5]), split_line[0].split(':')[1]))
    except IndexError:
        continue

We need to sort them. Bigger is better:

top_ten = sorted(scores, reverse=True)[:10]

and display them a bit more nicely:

for score, name in top_ten: 
    print(name, score)

Output:

fred 18 
bob 5 
jeff 2

Source

2013-05-31 10:43:39

That is equivalent to a simple "if" in the list comprehension. – Elazar

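I think you mean 'if len(line.split()) > 5'. That would cause another split of the line. There is also the name the OP is looking for. I don't see how to get the name in a list comprehension without splitting the line again, which makes three splits in total. When reading line by line from a file, the name needs to be separated from 'html_log:'.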
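One way to keep the comprehension and still split each line on whitespace only once is to feed it from an inner generator expression. This is a sketch, not code from either commenter: the helper name `name_and_fifth` is made up, and the empty line and the 'html_log:short 1.0' line are hypothetical inputs added to show the length check.

```python
lines = [
    "html_log:jeff 1153.3 1.84 625:54 1 2 71 3 2 10 7:58 499 3 5 616:36 241 36",
    "",                    # hypothetical empty line
    "html_log:fred 28.7 1.04 27:34 -10 18 13 0:48 37 18 8 -3.63",
    "html_log:short 1.0",  # hypothetical line with too few fields
    "html_log:bob 1217.1 1.75 696:48 1 5 38 6 109 61 14:42 633 223 25 435:36 182 34",
]

def name_and_fifth(lines):
    """Return (name, fifth value) pairs, skipping lines with fewer than 6 fields."""
    return [
        (parts[0].split(":", 1)[1], int(parts[5]))
        # The inner generator splits each line exactly once on whitespace.
        for parts in (line.split() for line in lines)
        if len(parts) > 5
    ]

print(name_and_fifth(lines))
```

The name still needs its own split on ':' to drop the 'html_log' prefix, but that operates on the first field only, not on the whole line.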

Use this pattern:

pattern = re.compile(r'html_log:([^ ]*) (?:[^ ]+ ){4}([^ ]*)')

It skips 4 numbers and captures the fifth.
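Plugged into the question's sorting code, this could look like the sketch below. It assumes the skipped group includes a trailing space, i.e. `(?:[^ ]+ ){4}`, so that four whole fields are consumed; the helper name `top_by_fifth` and its `limit` parameter are illustrative only.

```python
import re

# Sample lines copied from the question's text file.
text = """\
html_log:jeff 1153.3 1.84 625:54 1 2 71 3 2 10 7:58 499 3 5 616:36 241 36
html_log:fred 28.7 1.04 27:34 -10 18 13 0:48 37 18 8 -3.63
html_log:bob 1217.1 1.75 696:48 1 5 38 6 109 61 14:42 633 223 25 435:36 182 34
"""

# Assumption: each skipped field is followed by one space.
pattern = re.compile(r"html_log:([^ ]*) (?:[^ ]+ ){4}([^ ]*)")

def top_by_fifth(text, limit=10):
    """Return up to `limit` (name, value) string pairs, largest fifth value first."""
    pairs = pattern.findall(text)  # list of (name, fifth value) string tuples
    return sorted(pairs, key=lambda pair: int(pair[1]), reverse=True)[:limit]

for name, value in top_by_fifth(text):
    print(name, value)
```

Sorting with `int` as the key matters here: the captured values are strings, and a plain string sort would put "5" ahead of "18".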

Source

2013-05-31 13:53:31 kirelagin
