我的文檔正則表達式來獲得測量
5.3 x 2.5 cm
11 x 11 mm
7 mm
13 x 12 x 14 mm
13x12cm
我需要使用Python使用正則表達式來提取5.3×2.5釐米的這些測量。
到目前爲止,我的代碼如下,但它不能正常工作
x = "\.\d{1,2}|\d{1,4}\.?\d{0,2}|\d{5}\.?\d?|\d{6}\.?"
by = "()?(by|x)()?"
cm = "(mm|cm|millimeter|centimeter|millimeters|centimeters)"
x_cm = "((" + x + " *(to|\-) *" + cm + ")" + "|(" + x + cm + "))"
xy_cm = "((" + x + cm + by + x + cm + ")" +"|(" + x + by + x + cm + ")" +"|(" + x + by + x + "))"
xyz_cm = "((" + x + cm + by + x + cm + by + x + cm + ")" + "|(" + x + by + x + by + x + cm + ")" + "|(" + x + by + x + by + x + "))"
m = "((" + xyz_cm + ")" + "|(" + xy_cm + ")" + "|(" + x_cm + "))"
a = re.compile(m)
print a.findall(text)
輸出它給:
[('13', '13', '13', '13', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''), ('12', '12', '12', '12', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''), ('4', '4', '4', '4', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''), ('25', '25', '25', '25', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''),
定義「不能正常工作」:它做什麼與做什麼應該做什麼?例子將是最受歡迎的。 –
請顯示並解釋您所獲得的輸出與您想要的輸出之間的差異。 – Yunnosch
你必須做的一件事是擺脫捕獲組。但是,您應該在連接後檢查[最終模式](https://regex101.com/r/LcTavz/1),它僅[返回數字](https://ideone.com/TOX9eK)。 –