You can do something like this with a regular expression, assuming there are no constraints you haven't listed:
>>> s = "'Adult' 'Adverse Drug Reaction Reporting Systems/*classification' '*Drug-Related Side Effects and Adverse Reactions' 'Hospital Bed Capacity 300 to 499' 'Hospitals County' 'Humans' 'Indiana' 'Pharmacy Service Hospital/*statistics & numerical data'"
>>> import re
>>> regex = re.compile(r"'[^']*'")
>>> regex.findall(s)
["'Adult'", "'Adverse Drug Reaction Reporting Systems/*classification'", "'*Drug-Related Side Effects and Adverse Reactions'", "'Hospital Bed Capacity 300 to 499'", "'Hospitals County'", "'Humans'", "'Indiana'", "'Pharmacy Service Hospital/*statistics & numerical data'"]
My regex leaves the surrounding ' on each string. You can easily remove them with str.strip("'"):
>>> [x.strip("'") for x in regex.findall(s)]
['Adult', 'Adverse Drug Reaction Reporting Systems/*classification', '*Drug-Related Side Effects and Adverse Reactions', 'Hospital Bed Capacity 300 to 499', 'Hospitals County', 'Humans', 'Indiana', 'Pharmacy Service Hospital/*statistics & numerical data']
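Alternatively, a capturing group avoids the strip step altogether, since re.findall returns only the group's contents when the pattern has exactly one group. A minimal sketch on a shortened sample (the shortened `s` and the name `inner` are only for illustration):

```python
import re

# Shortened sample input, for illustration only.
s = "'Adult' 'Hospitals County' 'Humans'"

# The parentheses form a capture group, so findall returns the text
# between the quotes rather than the whole match including the quotes.
inner = re.findall(r"'([^']*)'", s)
print(inner)  # ['Adult', 'Hospitals County', 'Humans']
```

Unlike strip, this also leaves alone any stray quote characters that are not the delimiters.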
Note that this only works because I've assumed you don't have any escaped quotes in the string, i.e. you never have something like:
'foo\'bar'
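which is a perfectly valid way to express a string in many programming contexts. If you do have that situation, you'll need a more robust parser, for example pyparsing: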
>>> import pyparsing as pp
>>> [x[0][0].strip("'") for x in pp.sglQuotedString.scanString(s)]
['Adult', 'Adverse Drug Reaction Reporting Systems/*classification', '*Drug-Related Side Effects and Adverse Reactions', 'Hospital Bed Capacity 300 to 499', 'Hospitals County', 'Humans', 'Indiana', 'Pharmacy Service Hospital/*statistics & numerical data']
>>> s2 = r"'foo\'bar' 'baz'"
>>> [x[0][0].strip("'") for x in pp.sglQuotedString.scanString(s2)]
["foo\\'bar", 'baz']
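If adding a dependency is not an option, the regex itself can be taught about backslash escapes. This is only a rough sketch, assuming backslash is the sole escape character in your data; `ESCAPED_SGL` and `split_quoted` are hypothetical names I made up for the example, not part of any library:

```python
import re

# Assumption: a backslash escapes the character that follows it, and
# nothing else in the input acts as an escape.
# Inside the quotes, accept either an ordinary character (not a quote or
# backslash) or a backslash followed by any single character.
ESCAPED_SGL = re.compile(r"'((?:[^'\\]|\\.)*)'")

def split_quoted(text):
    """Return the contents of each single-quoted chunk, with escapes removed."""
    # Drop the backslash from each escape pair, keeping the escaped character.
    return [re.sub(r"\\(.)", r"\1", chunk) for chunk in ESCAPED_SGL.findall(text)]

s2 = r"'foo\'bar' 'baz'"
result = split_quoted(s2)
print(result)  # ["foo'bar", 'baz']
```

Note that this version also unescapes the contents, whereas the pyparsing output above keeps the backslash. For anything more involved than this (nested quoting, multiple quote styles), a real parser is still the safer choice.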