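Efficient regular expression for multiple strings of characters and numbers

Hello, I'm new to Python and regex. I'm trying to use the two together to build a regular expression that extracts data from user input, but I expect the input to vary, including typos and so on. So in the code below I randomly pick one of several kinds of strings, to give you examples of how I expect users might enter the data. I'm only interested in the number before or after USD. For example: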
import random
import re

# Pick one of the seven sample inputs at random.
ran = random.randint(1, 7)
print(ran)
if ran == 1:
    examplestring = "This item costs 20 USD contact 9999999"
elif ran == 2:
    examplestring = "This item costs USD 20"
elif ran == 3:
    examplestring = "This item costs 20 U.S.D"
elif ran == 4:
    examplestring = "This item costs 20 usd"
elif ran == 5:
    examplestring = "This item costs 20 Usd call to buy : 954545577"
elif ran == 6:
    examplestring = "This item costs 20USD"
elif ran == 7:
    examplestring = "This item costs usd20"
# Normalize spelling variants (usd, Usd, U.S.D, U.S.D.) to "USD".
regex = re.compile(r'\busd|\bu\.s\.d\.?', re.I)
examplestring = regex.sub("USD", examplestring)
# Find a number next to "USD", with or without a separator character.
costs = re.findall(r'\d+.\bUSD\b|\bUSD\b.\d+|\d+USD\b|\bUSD\d+', examplestring)
# Keep only the digits of the first match.
cost = ''.join(x for x in costs[0] if x.isdigit())
print(cost + " USD")
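With these regular expressions I get the detail I want, which is "20 USD". My question is whether I am going about this the right way, and whether the code can be made better.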
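You can do all of this with a single regular expression: '(?:(?<=USD|usd)\s*(\d+))|(?:\d+\s*(?=USD|usd|Usd|USD))'. But because of the complexity of the regex, this is sometimes not a good approach. See [here](https://regex101.com/r/mH0cC8/1) for an explanation of how it works. – RedX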
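
To illustrate the single-regex idea from the comment above, here is a minimal sketch. It is not RedX's exact pattern: it swaps the lookarounds for two capture groups and relies on `re.IGNORECASE` instead of listing the case variants. The names `USD_AMOUNT`, `extract_usd_amount` and `samples` are made up for this example, and it assumes the amount is always a whole number sitting directly before or after the currency marker.

```python
import re

# Hypothetical pattern, assumed for illustration only:
#   group 1: number followed by USD   (e.g. "20 USD", "20USD", "20 U.S.D")
#   group 2: USD followed by a number (e.g. "USD 20", "usd20")
# The dots between the letters are optional and case is ignored.
USD_AMOUNT = re.compile(
    r'(\d+)\s*u\.?s\.?d\b|\bu\.?s\.?d\.?\s*(\d+)',
    re.IGNORECASE,
)


def extract_usd_amount(text):
    """Return the first USD amount in text as an int, or None if absent."""
    match = USD_AMOUNT.search(text)
    if match is None:
        return None
    # Exactly one of the two groups is set, depending on which side matched.
    return int(match.group(1) or match.group(2))


# The seven sample strings from the question.
samples = [
    "This item costs 20 USD contact 9999999",
    "This item costs USD 20",
    "This item costs 20 U.S.D",
    "This item costs 20 usd",
    "This item costs 20 Usd call to buy : 954545577",
    "This item costs 20USD",
    "This item costs usd20",
]

for s in samples:
    print(extract_usd_amount(s), "USD")
```

Using capture groups means there is no need to normalize the string first or strip non-digits afterwards, and returning `None` avoids the `IndexError` that `costs[0]` would raise when nothing matches. Whether this is more readable than the two-step version is a matter of taste.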