2013-05-26 65 views
1

Python regex sub: %H:%M:%S to %M:%S or 01:%M:%S

I want to convert the following string (subtitles):

585 
00:59:59,237 --> 01:00:01,105 
- It's all right. - He saw us! 

586 
01:00:01,139 --> 01:00:03,408 
I heard you the first time. 

into:

59:59 - It's all right. - He saw us!

01:00:01 I heard you the first time.

*What I want: if the time is under one hour, trim the "00:" prefix; if it is one hour or more, keep the hour.*

My regex is:

pat = re.compile(r""" 
    #\s*     # Skip leading whitespace 
    \d+\s     # remove lines that contain only numbers 
    ((?:(?:00)|(?P<hour>01)):(?P<time>\d{2}:\d{2})[,0-9->]+.*)[\r\n]+(?P<content>.*)[\r\n]+ 
    """, 
    re.VERBOSE) 
data = pat.sub(r"\g<hour>\g<time> \g<content>", data) 

It only works when '\g<hour>' is not used. Can anyone help me?

Answers

2

I think this is what you are looking for:

import re 

s = """ 
585 
00:59:59,237 --> 01:00:01,105 
- It's all right. - He saw us! 

586 
01:00:01,139 --> 01:00:03,408 
I heard you the first time. 
""" 

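# Each match is a tuple: (hours with colon, minutes:seconds, text line)
for line in re.findall(r'(\d+:)(\d+:\d+)(?:.*\n)(.*)', s):
    if line[0] == '00:':
        # Under one hour: drop the "00:" prefix
        print(' '.join(line[1:]))
    else:
        # One hour or more: keep the hour
        print(' '.join([''.join(line[0:2]), line[2]]))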

Output:

# 59:59 - It's all right. - He saw us! 
# 01:00:01 I heard you the first time. 
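The same result can probably also be reached with a single re.sub call by passing a replacement function instead of a template. On Python versions before 3.5, a template such as '\g<hour>' raises an "unmatched group" error when that group did not take part in the match, and a function sidesteps this. The following is only a sketch: the names BLOCK, shorten and convert are made up for illustration, and it assumes every subtitle block is an index line, a timing line and exactly one text line.

```python
import re

# Hypothetical pattern: index line, timing line, one text line.
BLOCK = re.compile(
    r"""
    ^\d+[ \t]*\n                           # index line (digits only)
    (?P<hour>\d{2}):(?P<time>\d{2}:\d{2})  # start time as HH:MM:SS
    [^\n]*\n                               # rest of the timing line
    (?P<content>[^\n]*)                    # the single text line
    """,
    re.VERBOSE | re.MULTILINE,
)


def shorten(match):
    """Build 'MM:SS text' under one hour, 'HH:MM:SS text' otherwise."""
    hour = match.group("hour")
    prefix = "" if hour == "00" else hour + ":"
    text = match.group("content").strip()
    return "{}{} {}".format(prefix, match.group("time"), text)


def convert(data):
    # The "hour" group always matches here, so no unmatched-group issue arises.
    return BLOCK.sub(shorten, data)
```

Calling convert(data) on the subtitle text from the question returns the converted string; blank lines between blocks are left in place.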
+0

Sorry, I did not explain my problem well. What I want is: if the time is under one hour, trim the "00:" prefix; if it is over one hour, keep it. – Brent81

+0

@Brent81 I have edited the script. If you think my solution is useful, please upvote and accept it. Thanks! Oh, and next time, please be more specific! –

+1

Thank you very much! – Brent81

1

Just to give a non-re approach (which should be faster):

a = """585 
00:59:59,237 --> 01:00:01,105 
- It's all right. - He saw us! 

586 
01:00:01,139 --> 01:00:03,408 
I heard you the first time.""" 

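for i, x in enumerate(a.split('\n')):
    m = i % 4  # position inside each 4-line block
    if m == 0:
        continue  # index line
    elif m == 3:
        continue  # blank separator line
    elif m == 1:
        # Timing line: keep the start time up to the comma (HH:MM:SS)
        print(x[:x.find(",")], end=' ')
    elif m == 2:
        print(x)  # text line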
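The outline above keeps the hour on every line. A possible way to extend it so that the "00:" prefix is trimmed, still without re, is sketched below. The function names are hypothetical, and the sketch assumes blocks are separated by blank lines and that the timing line is always the second line of a block.

```python
def short_timestamp(timing_line):
    """Return the start time, without a leading "00:" when under one hour."""
    start = timing_line.split(",", 1)[0].strip()  # text before the first comma
    return start[3:] if start.startswith("00:") else start


def iter_blocks(text):
    """Yield lists of stripped, non-empty lines, one list per subtitle block."""
    block = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            block.append(line)
        elif block:
            yield block
            block = []
    if block:
        yield block


def convert_blocks(text):
    """Yield 'timestamp text' strings, skipping incomplete blocks."""
    for block in iter_blocks(text):
        if len(block) >= 3:  # index line, timing line, at least one text line
            yield "{} {}".format(short_timestamp(block[1]), " ".join(block[2:]))
```

For example, `for line in convert_blocks(a): print(line)` prints one converted line per subtitle block. Multi-line subtitles are joined with spaces, which is a design choice of this sketch rather than something the question asks for.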
+0

The output is: '58 00:59:59,237 --> 01:00:01,105' and '58 01:00:01,139 --> 01:00:03,408', which is not what the OP wants. Edit: sorry ('\n') –

+0

@PeterVaro - this is just an outline of the non-'re' approach – zenpoy

+0

**mine:** '466 function calls (438 primitive calls) in 0.001 seconds', **yours:** '7 function calls in 0.000 seconds' –
