2016-12-16 159 views
-2

Create a list of unique elements from an array

I have a one-dimensional array like this (in which values are repeated):

Administration Oral ,Aged ,Area Under Curve ,Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use ,Circadian Rhythm/physiology ,Cross-Over Studies ,Delayed-Action Preparations ,Dose-Response Relationship Drug ,Drug Administration Schedule ,Female ,Humans ,Mandelic Acids/adverse effects/blood/*pharmacokinetics/therapeutic use ,Metabolic Clearance Rate ,Middle Aged ,Urinary Incontinence/drug therapy ,Xerostomia/chemically induced , 

Adult ,Anti-Ulcer Agents/metabolism ,Antihypertensive Agents/metabolism ,Benzhydryl Compounds/administration & dosage/blood/*pharmacology ,Caffeine/*metabolism ,Central Nervous System Stimulants/metabolism ,Cresols/administration & dosage/blood/*pharmacology ,Cross-Over Studies ,Cytochromes/*pharmacology ,Debrisoquin/*metabolism ,Drug Interactions ,Humans ,Male ,Muscarinic Antagonists/pharmacology ,Omeprazole/*metabolism ,*Phenylpropanolamine ,Polymorphism Genetic ,Tolterodine Tartrate ,Urinary Bladder Diseases/drug therapy , 
... 
... 

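I need a list of all the unique categories, where categories are separated by commas. For example, "Administration Oral" would be one category.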

+1

Show us what you have done so far (your code) and where your difficulty or error is. –

+0

Python uses lists, not arrays, as its main data type. –

+0

Not sure, but I think the numarray module allows arrays, including multidimensional arrays – glant

Answers

2

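I need all the unique categories

Take any list and apply set() to it. Note: this discards the original order.

where categories are separated by commas

So split(",") the string.

For example: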

s = '''Administration Oral ,Aged ,Area Under Curve ,Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use ,Circadian Rhythm/physiology ,Cross-Over Studies ,Delayed-Action Preparations ,Dose-Response Relationship Drug'''.strip() 

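# Strip each item before building the set, so that 'Aged ' and 'Aged' are not
# treated as different categories; skip empty items left by a trailing comma.
for x in sorted({item.strip() for item in s.split(",") if item.strip()}):
    print(x)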

Output:

Administration Oral 
Aged 
Area Under Curve 
Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use 
Circadian Rhythm/physiology 
Cross-Over Studies 
Delayed-Action Preparations 
Dose-Response Relationship Drug 
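
Because set() discards the original order, an order-preserving variant may sometimes be preferable. The following is a minimal sketch, assuming Python 3.7 or later (where dict keeps insertion order); the helper name unique_in_order and the short sample string are hypothetical, not part of the original answer.

```python
def unique_in_order(text, sep=","):
    """Return the distinct, non-empty items of `text` in first-seen order."""
    # Strip whitespace first so 'Aged ' and 'Aged' count as the same category.
    items = (part.strip() for part in text.split(sep))
    # dict.fromkeys keeps only the first occurrence of each key, in order.
    return list(dict.fromkeys(item for item in items if item))


s = "Humans ,Aged ,Cross-Over Studies ,Humans ,Aged , "
print(unique_in_order(s))
# ['Humans', 'Aged', 'Cross-Over Studies']
```

If sorted output is wanted instead, wrapping the result in sorted() gives the same list as the set-based version.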
0

Here is an example:

categories = """Administration Oral ,Aged ,Area Under Curve ,Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use ,Circadian Rhythm/physiology ,Cross-Over Studies ,Delayed-Action Preparations ,Dose-Response Relationship Drug ,Drug Administration Schedule ,Female ,Humans ,Mandelic Acids/adverse effects/blood/*pharmacokinetics/therapeutic use ,Metabolic Clearance Rate ,Middle Aged ,Urinary Incontinence/drug therapy ,Xerostomia/chemically induced ,Adult ,Anti-Ulcer Agents/metabolism ,Antihypertensive Agents/metabolism ,Benzhydryl Compounds/administration & dosage/blood/*pharmacology ,Caffeine/*metabolism ,Central Nervous System Stimulants/metabolism ,Cresols/administration & dosage/blood/*pharmacology ,Cross-Over Studies ,Cytochromes/*pharmacology ,Debrisoquin/*metabolism ,Drug Interactions ,Humans ,Male ,Muscarinic Antagonists/pharmacology ,Omeprazole/*metabolism ,*Phenylpropanolamine ,Polymorphism Genetic ,Tolterodine Tartrate ,Urinary Bladder Diseases/drug therapy ,""" 

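# Strip the whitespace around each category before deduplicating.
category_list = [x.strip() for x in categories.split(',')]
# filter(None, ...) drops the empty string left by the trailing comma;
# list() is needed on Python 3, where filter() returns a lazy iterator.
unique_categories = list(filter(None, set(category_list)))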
>>> unique_categories 
['Urinary Incontinence/drug therapy', 'Debrisoquin/*metabolism', 'Cresols/administration & dosage/blood/*pharmacology', 'Cholinergic Antagonists/adverse effects/*pharmacokinetics/therapeutic use', 'Urinary Bladder Diseases/drug therapy', '*Phenylpropanolamine', 'Drug Administration Schedule', 'Tolterodine Tartrate', 'Middle Aged', 'Dose-Response Relationship Drug', 'Polymorphism Genetic', 'Adult', 'Anti-Ulcer Agents/metabolism', 'Caffeine/*metabolism', 'Mandelic Acids/adverse effects/blood/*pharmacokinetics/therapeutic use', 'Area Under Curve', 'Metabolic Clearance Rate', 'Muscarinic Antagonists/pharmacology', 'Drug Interactions', 'Delayed-Action Preparations', 'Circadian Rhythm/physiology', 'Male', 'Xerostomia/chemically induced', 'Administration Oral', 'Cross-Over Studies', 'Benzhydryl Compounds/administration & dosage/blood/*pharmacology', 'Cytochromes/*pharmacology', 'Humans', 'Central Nervous System Stimulants/metabolism', 'Omeprazole/*metabolism', 'Female', 'Antihypertensive Agents/metabolism', 'Aged']
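
The question describes an array in which each element is itself a comma-separated string. As a rough sketch of collecting the unique categories across all elements, assuming the data is held as a Python list of strings: the names records and unique_categories_from are hypothetical, and the sample is shortened from the data in the question.

```python
def unique_categories_from(records):
    """Collect the distinct categories found across all comma-separated records."""
    seen = set()
    for record in records:
        for part in record.split(","):
            name = part.strip()
            if name:  # skip the empty string left by a trailing comma
                seen.add(name)
    # Sort so the result is reproducible; a set has no defined order.
    return sorted(seen)


records = [
    "Administration Oral ,Aged ,Cross-Over Studies ,Humans ,",
    "Adult ,Cross-Over Studies ,Drug Interactions ,Humans ,Male ,",
]
print(unique_categories_from(records))
# ['Administration Oral', 'Adult', 'Aged', 'Cross-Over Studies', 'Drug Interactions', 'Humans', 'Male']
```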