Given a string containing numbers, convert the numbers in the string to `_NUM-*_` symbols:
I counted, ' 1 2 3 4 5 6 7 8 9 10 '
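The goal is to convert each number to a `_NUM-*_` symbol, where `*` is the order in which that number appears. E.g. given the input above, the expected output is: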
"I counted, ' _NUM-1_ _NUM-2_ _NUM-3_ _NUM-4_ _NUM-5_ _NUM-6_ _NUM-7_ _NUM-8_ _NUM-9_ _NUM-10_'"
Even when numbers repeat, e.g. given the input
I said, ' 1 2 3 4 5 5 5 8 9 10 '
the desired output still follows the order of appearance and ignores the values of the numbers themselves, e.g.:
"I said, ' _NUM-1_ _NUM-2_ _NUM-3_ _NUM-4_ _NUM-5_ _NUM-6_ _NUM-7_ _NUM-8_ _NUM-9_ _NUM-10_'"
I have tried:
import re
s = "I counted, ' 1 2 3 4 5 6 7 8 9 10 '"
num_regexp = r'(?<!\S)(?=.)(0|([1-9](\d*|\d{0,2}(,\d{3})*)))?(\.\d*[1-9])?(?!\S)'
re.sub(num_regexp, '_NUM_', s)
But that just replaces every number with the same `_NUM_` symbol and does not preserve the order, i.e.
[OUT]:
"I counted, ' _NUM_ _NUM_ _NUM_ _NUM_ _NUM_ _NUM_ _NUM_ _NUM_ _NUM_ _NUM_ '"
I could do a post-`re.sub` pass that replaces each `_NUM_` in turn, i.e.
import re
s = "I counted, ' 1 2 3 4 5 6 7 8 9 10 '"
num_regexp = r'(?<!\S)(?=.)(0|([1-9](\d*|\d{0,2}(,\d{3})*)))?(\.\d*[1-9])?(?!\S)'
num_counter = 1
tokens = []
# Replace every number with a generic placeholder, then number the placeholders in order.
for token in re.sub(num_regexp, '_NUM_', s).split():
    if token == '_NUM_':
        token = '_NUM-{}_'.format(num_counter)
        num_counter += 1
    tokens.append(token)
result = ' '.join(tokens)
[OUT]:
"I counted, ' _NUM-1_ _NUM-2_ _NUM-3_ _NUM-4_ _NUM-5_ _NUM-6_ _NUM-7_ _NUM-8_ _NUM-9_ _NUM-10_ '"
Is there a better way to achieve the desired output without first doing a generic `re.sub` and then editing the string afterwards?
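One way to do this in a single pass is to give `re.sub` a function instead of a replacement string, and let that function pull the next value from `itertools.count`. The sketch below assumes the same pattern as in the question; the names `NUM_REGEXP` and `number_tokens` are made up for illustration and are not from the original post.

```python
import itertools
import re

# Same pattern as in the question, written as a raw string.
NUM_REGEXP = r'(?<!\S)(?=.)(0|([1-9](\d*|\d{0,2}(,\d{3})*)))?(\.\d*[1-9])?(?!\S)'


def number_tokens(text):
    """Replace each matched number with _NUM-k_, k being its order of appearance."""
    counter = itertools.count(1)  # fresh counter per call, so numbering restarts at 1
    # re.sub calls the lambda once per match, left to right.
    return re.sub(NUM_REGEXP, lambda match: '_NUM-{}_'.format(next(counter)), text)


print(number_tokens("I said, ' 1 2 3 4 5 5 5 8 9 10 '"))
```

Because the substitution happens in place, the original whitespace is kept and no `split()`/`join()` step is needed. One caveat: both groups in the pattern are optional, so it can also match the empty string between two consecutive whitespace characters, and such a match would use up a counter value too. If the input may contain runs of spaces, the pattern probably needs tightening first.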
Cool! I didn't know about `itertools.count`, and using a lambda expression inside the substitution is super cool! – alvas