Convert a timespan string into a value/unit pair

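I need to parse the contents of strings that represent a timespan. The string format is value followed by unit, for example: 1s, 60min, 24h. I want to split the actual value (an int) and the unit (a str) into separate variables.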

At the moment I do it like this:

import re
import sys

def validate_time(time):
    binsize = time.strip()
    # Strip every digit; whatever remains is taken to be the unit.
    unit = re.sub('[0-9]', '', binsize)
    if unit not in ['s', 'm', 'min', 'h', 'l']:
        print("Error: unit {0} is not valid".format(unit))
        sys.exit(2)
    # Strip every non-digit; whatever remains is taken to be the value.
    tmp = re.sub('[^0-9]', '', binsize)
    try:
        value = int(tmp)
    except ValueError:
        print("Error: {0} is not valid".format(time))
        sys.exit(2)
    return value, unit

However, this is not ideal, because something like 1m0 is also (incorrectly) accepted (value=10, unit=m).

What is the best way to validate/parse this input?

2012-09-06 Rafael Barbosa

Just parse the whole line with a single regular expression:

import re
import sys

_timeunit = re.compile(r'^(?P<value>\d+)(?P<unit>s|m|min|h|l)$')

def validate_time(time):
    match = _timeunit.match(time.strip())
    if match is None:
        print("Error: {0} is not valid".format(time))
        sys.exit(2)

    return int(match.group('value')), match.group('unit')

Demo (with a return temporarily replacing the sys.exit):

>>> validate_time('10l') 
(10, 'l') 
>>> validate_time('10l0') 
Error: 10l0 is not valid

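The regular expression matches digits at the start (anchored by the ^ caret), then a unit from the limited set s, m, min, h or l, but only if the unit sits at the end of the line, which is anchored by the $ dollar sign.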
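To see what the two anchors accept and reject, here is a minimal sketch that runs the same pattern over a handful of inputs. The sample strings are hypothetical and chosen only for illustration; `None` marks a rejected input.

```python
import re

# Same pattern as in the answer above.
_timeunit = re.compile(r'^(?P<value>\d+)(?P<unit>s|m|min|h|l)$')

# Hypothetical sample inputs, chosen only for illustration.
samples = ['1s', '60min', '24h', '10l', '1m0', 'min', '10', '5 s']

results = {}
for text in samples:
    match = _timeunit.match(text)
    if match is None:
        # No digits at the start, no known unit at the end, or extra characters.
        results[text] = None
    else:
        results[text] = (int(match.group('value')), match.group('unit'))

for text in samples:
    print('{0!r}: {1}'.format(text, results[text]))
```

Only the first four samples should parse; `'1m0'`, `'min'`, `'10'` and `'5 s'` come back as `None`.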

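It would be more pythonic, btw, to raise an exception in the validation method and handle that exception where you call the method. That also makes it more reusable: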

import re
import sys

_timeunit = re.compile(r'^(?P<value>\d+)(?P<unit>s|m|min|h|l)$')

def validate_time(time):
    match = _timeunit.match(time.strip())
    if match is None:
        raise ValueError('{0} is not a valid timespan'.format(time))
    return int(match.group('value')), match.group('unit')

# foobar stands for whatever input string the caller has.
try:
    validate_time(foobar)
except ValueError as e:
    print('Error: {0}'.format(e.args[0]))
    sys.exit(2)

2012-09-06 16:15:28

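I would allow whitespace at the start and end of the regular expression instead of using 'strip', and also allow optional whitespace between the value and the unit. Calling 'exit' from a function is not cool; just raise an exception that carries the error message.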

I used 'sys.exit()' to echo the OP; I wholeheartedly agree.
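The whitespace suggestion from the comment above could be sketched as follows. This is one possible reading of that comment, not code from either commenter; the pattern and the error message wording are assumptions.

```python
import re

# \s* at both ends replaces the call to strip(); the \s* in the middle
# allows inputs such as '60 min'.
_timeunit = re.compile(r'^\s*(?P<value>\d+)\s*(?P<unit>s|m|min|h|l)\s*$')

def validate_time(time):
    match = _timeunit.match(time)
    if match is None:
        # Raise instead of exiting, so the caller decides how to react.
        raise ValueError('{0!r} is not a valid timespan'.format(time))
    return int(match.group('value')), match.group('unit')
```

With this variant, `validate_time(' 60 min ')` should give `(60, 'min')`, while `'1m0'` still raises `ValueError`.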

Why not parse it in one go:

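import re

time = '60min'  # example input; in the question this is the function argument
stripped = time.strip()
# \d+        the number
# \s*        optional spaces
# [a-zA-Z]+  the unit, letters only (\w+ would also swallow digits, so '1m0' would slip through)
m = re.match(r'(?P<value>\d+)\s*(?P<unit>[a-zA-Z]+)', stripped)
if not m:
    raise ValueError('{0} is not a valid timespan'.format(time))

if m.group(0) != stripped:
    # leftover stuff at the end; this will catch '1m0'
    raise ValueError('{0} is not a valid timespan'.format(time))

unit = m.group('unit')
value = int(m.group('value'))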
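On Python 3.4 and later, `re.fullmatch` does the "nothing left over" comparison itself. A minimal sketch of the same idea, assuming the unit consists of letters only; the function name `parse_timespan` is hypothetical and not part of the answer.

```python
import re

def parse_timespan(text):  # hypothetical name, for illustration only
    # fullmatch succeeds only if the whole stripped string fits the pattern,
    # so trailing characters (as in '1m0') are rejected without a second check.
    m = re.fullmatch(r'(?P<value>\d+)\s*(?P<unit>[a-zA-Z]+)', text.strip())
    if m is None:
        raise ValueError('{0!r} is not a valid timespan'.format(text))
    return int(m.group('value')), m.group('unit')
```

Like the approach above, this does not restrict the unit to a known set, so `'5xyz'` would parse as `(5, 'xyz')`; checking the unit against a list remains the caller's job.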

2012-09-06 15:36:00 mgilson

Try this:

#!/usr/bin/env python

import re

time = '1m'
try:
    m = re.match(r'^(?P<value>\d+)(?P<middle>.*)(?P<unit>(m|min|s|h|l))$',
                 time.strip())
    # If nothing matched, m is None and m.group() raises AttributeError.
    value = m.group('value')
    unit = m.group('unit')
    middle = m.group('middle')
    if middle:
        # Reuse the exception raised if an expected group is not found
        raise AttributeError
    print(value, unit)
except AttributeError:
    print("Wrong format")

It makes sure the time starts with a number and ends with a valid unit, and it captures anything that sits in between.
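To make the role of the middle group concrete, the sketch below returns the three captured parts for a given input. The helper name `split_parts` is hypothetical; the pattern is the one from this answer.

```python
import re

_pattern = re.compile(r'^(?P<value>\d+)(?P<middle>.*)(?P<unit>(m|min|s|h|l))$')

def split_parts(text):  # hypothetical helper, for illustration only
    m = _pattern.match(text.strip())
    if m is None:
        # No leading digits, or no valid unit at the very end.
        return None
    return m.group('value'), m.group('middle'), m.group('unit')

for sample in ['5min', '10xm', '1m0m', '1m0']:
    print(sample, split_parts(sample))
```

A well-formed input such as `'5min'` leaves the middle group empty, whereas `'10xm'` and `'1m0m'` match with a non-empty middle, which is what the answer treats as an error. `'1m0'` does not match at all because it does not end in a unit.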

2012-09-06 16:10:41 Vicent