2016-12-15

Split a Python string by single quotes

I have a string like this:

text = ['Adult' 'Adverse Drug Reaction Reporting Systems/*classification' '*Drug-Related Side Effects and Adverse Reactions' 'Hospital Bed Capacity 300 to 499' 'Hospitals County' 'Humans' 'Indiana' 'Pharmacy Service Hospital/*statistics & numerical data'] 

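I need to split this string so that each category (delimited by single quotation marks) is stored as a separate element of an array. For example: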

text = Adult, Adverse Drug Reaction Reporting Systems... 

I have tried the split function, but I am not sure how to go about it.

Answers

You can do something like this with a regular expression, assuming there are no constraints beyond the ones you have listed:

>>> s = "'Adult' 'Adverse Drug Reaction Reporting Systems/*classification' '*Drug-Related Side Effects and Adverse Reactions' 'Hospital Bed Capacity 300 to 499' 'Hospitals County' 'Humans' 'Indiana' 'Pharmacy Service Hospital/*statistics & numerical data'" 
>>> import re 
>>> regex = re.compile(r"'[^']*'") 
>>> regex.findall(s) 
["'Adult'", "'Adverse Drug Reaction Reporting Systems/*classification'", "'*Drug-Related Side Effects and Adverse Reactions'", "'Hospital Bed Capacity 300 to 499'", "'Hospitals County'", "'Humans'", "'Indiana'", "'Pharmacy Service Hospital/*statistics & numerical data'"] 

My regex leaves the ' characters on the matched strings. You can easily remove them with str.strip("'"):

>>> [x.strip("'") for x in regex.findall(s)] 
['Adult', 'Adverse Drug Reaction Reporting Systems/*classification', '*Drug-Related Side Effects and Adverse Reactions', 'Hospital Bed Capacity 300 to 499', 'Hospitals County', 'Humans', 'Indiana', 'Pharmacy Service Hospital/*statistics & numerical data'] 
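As a possible variation on the snippet above, the stripping step can be folded into the pattern with a capture group. This is only a sketch: the shortened sample string and the name `categories` are illustrative, and it relies on the same assumption that the strings contain no escaped quotes.

```python
import re

# Shortened, illustrative sample in the same format as the question's data.
s = "'Adult' 'Hospitals County' 'Pharmacy Service Hospital/*statistics & numerical data'"

# The parentheses form a capture group. When a pattern has exactly one
# group, re.findall returns the group contents instead of the whole match,
# so the surrounding quotes are dropped without a separate strip step.
categories = re.findall(r"'([^']*)'", s)
print(categories)
```

If the result is needed as one comma-separated string, as in the question's example, `", ".join(categories)` would produce it.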

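Note that this only works because I am assuming you do not have any escaped quotes in the strings, i.e. you never have something like:

'foo\'bar', which is a perfectly valid way to express a string in many programming contexts. If you do have that case, you will need a more robust parser, e.g. pyparsing: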

>>> import pyparsing as pp 
>>> [x[0][0].strip("'") for x in pp.sglQuotedString.scanString(s)] 
['Adult', 'Adverse Drug Reaction Reporting Systems/*classification', '*Drug-Related Side Effects and Adverse Reactions', 'Hospital Bed Capacity 300 to 499', 'Hospitals County', 'Humans', 'Indiana', 'Pharmacy Service Hospital/*statistics & numerical data'] 
>>> s2 = r"'foo\'bar' 'baz'" 
>>> [x[0][0].strip("'") for x in pp.sglQuotedString.scanString(s2)] 
["foo\\'bar", 'baz']
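If installing pyparsing is not an option, a rough sketch of an escape-aware regular expression is shown below. It is not part of the original answer and assumes that a backslash always escapes the character that follows it. The names `naive`, `escape_aware`, `robust`, and `unescaped` are illustrative.

```python
import re

s2 = r"'foo\'bar' 'baz'"

# The simple pattern treats the escaped quote as a closing quote,
# so it mis-splits this input.
naive = re.findall(r"'([^']*)'", s2)

# Escape-aware pattern: inside the quotes, accept either a character that
# is neither a quote nor a backslash, or a backslash followed by any
# single character.
escape_aware = re.compile(r"'((?:[^'\\]|\\.)*)'")
robust = escape_aware.findall(s2)

# Optionally turn the escaped quotes back into plain quotes.
unescaped = [item.replace("\\'", "'") for item in robust]

print(naive)
print(robust)
print(unescaped)
```

Here `robust` keeps the backslash, matching the pyparsing result above, while `unescaped` removes it. For anything more complicated than backslash-escaped quotes, a real parser is likely the safer choice.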