2017-03-25 50 views
0

Python dictionary and function not working

These are the first 20 lines of my text file; the full file has about 50K lines like these.

prov_type|prov_type_desc 
0|FAMILY PRACTICE/CLINIC 
1|FAMILY PRACTICE 
2|ALLERGIST 
3|DERMATOLOGIST 
4|INTERNIST 
5|NEUROLOGIST 
6|NEUROSURGEON 
7|OB/GYN 
8|OPTHAMOLOGIST 
9|ORTHOPEDIST 
10|OTOLARYNGOLOGIST 
11|PATHOLOGIST 
12|PEDIATRICIAN 
13|PLASTIC SURGEON 
14|COLON AND RECTAL SURGERY 
15|PSYCHIATRIST 
16|RADIOLOGIST 
17|SURGEON 
18|THORACIC SURGEON 
19|UROLOGIST 
20|ANESTHESIOLOGIST 

I read it like this:

import pandas as pd
ovations = pd.read_csv("Ovations.txt", sep='|', dtype=object)
ovations.rename(columns={'prov_type_desc': 'specialty'}, inplace=True)

I wrote a dictionary to map the specialties; here it is.

options = {'FAMILYPRACTICESELF-REFFERAL' : 'FAMILY PRACTICE', 
'FAMILYPRACTICESPECIALIST' : 'FAMILY PRACTICE', 
'FAMILYPRACTICE/CLINIC' : 'FAMILY PRACTICE', 
'GENERALPRACTICE' : 'FAMILY PRACTICE', 
'ALLERGY' : 'ALLERGIST', 
'ALLERGYANDIMMUNOLOGY' : 'ALLERGIST', 
'ALLERGY&IMMUNOLOGY' : 'ALLERGIST', 
'ALLERGY/IMMUNOLOGY' : 'ALLERGIST', 
'CARDIOLOGY' : 'CARDIOLOGIST', 
'CARDIOLOGYGROUP' : 'CARDIOLOGIST', 
'CARDIOVASCULARDISEASE' : 'CARDIOLOGIST', 
'COLON&RECTALSURGERY' : 'COLON AND RECTAL SURGERY', 
'COLON/RECTALSURGERY' : 'COLON AND RECTAL SURGERY', 
'COLORECTALSURGERY' : 'COLON AND RECTAL SURGERY', 
'DERMATOLOGYGROUP' : 'DERMATOLOGIST', 
'DERMATOLOGY' : 'DERMATOLOGIST', 
'ENDOCRINOLOGY,DIABETES,ANDMETABOLISM' : 'ENDOCRINOLOGIST', 
'ENDOCRINOLOGY' : 'ENDOCRINOLOGIST', 
'ENDODONDIST' : 'ENDODONTICS', 
'GASTROENTEROLOGY' : 'GASTROENTEROLOGIST', 
'GASTROENTEROLOGYGROUP' : 'GASTROENTEROLOGIST', 
'GENETICCOUNSELOR' : 'GENETIC TESTING/COUNSELING CENTER', 
'GENETICS,CLINICAL(MD)' : 'GENETIC TESTING/COUNSELING CENTER', 
'GENETICS,CLINICALMOLECULAR' : 'GENETIC TESTING/COUNSELING CENTER', 
'HEMATOLOGYONCOLOGY' : 'HEMATOLOGY/ONCOLOGY', 
'HEMATOLOGIST' : 'HEMATOLOGY/ONCOLOGY', 
'HEMATOLOGY' : 'HEMATOLOGY/ONCOLOGY', 
'HEMATOLOGYGROUP' : 'HEMATOLOGY/ONCOLOGY', 
'HEMATOLOGY-ONCOLOGY' : 'HEMATOLOGY/ONCOLOGY', 
'HEMATOLOGY-ONCOLOGYGROUP' : 'HEMATOLOGY/ONCOLOGY', 
'HOSPICE&PALLATIVEMED' : 'HOSPICE', 
'HOSPITALOP/LAB/XRAY' : 'HOSPITAL', 
'HOSPITALIST' : 'HOSPITAL', 
'INFECTIOUSDISEASEMEDICINE' : 'INFECTIOUS DISEASE', 
'INTERNALMED' : 'INTERNAL MEDICINE', 
'INTERNALMEDICINESPECIALIST' : 'INTERNAL MEDICINE', 
'INTERNIST' : 'INTERNAL MEDICINE', 
'INFECTIOUSDISEASESEPCIALIST' : 'INFECTIOUS DISEASE', 
'NEPHROLOGY' : 'NEPHROLOGIST', 
'NEUROLOGY' : 'NEUROLOGIST', 
'OBSTETRICS' : 'OBSTETRICS AND GYNECOLOGY', 
'OBSTETRICS&GYNECOLOGY' : 'OBSTETRICS AND GYNECOLOGY', 
'OBSTETRICS/GYNECOLOGY' : 'OBSTETRICS AND GYNECOLOGY', 
'OB/GYNGROUP' : 'OBSTETRICS AND GYNECOLOGY', 
'OBSTETRICSGYNECOLOGY' : 'OBSTETRICS AND GYNECOLOGY', 
'OBGYNECOLOGISTSPECIALTY' : 'OBSTETRICS AND GYNECOLOGY', 
'OB/GYN' : 'OBSTETRICS AND GYNECOLOGY', 
'OB/GYNSELFREFCAP' : 'OBSTETRICS AND GYNECOLOGY', 
'GYNECOLOGY' : 'OBSTETRICS AND GYNECOLOGY', 
'ONCOLOGY' : 'ONCOLOGIST', 
'GYNECOLOGICONCOLOGY' : 'ONCOLOGIST', 
'GYNECOLOGICALONCOLOGY' : 'ONCOLOGIST', 
'GYNECOLOGICAL/ONCOLOGY' : 'ONCOLOGIST', 
'OPHTHALMOLOGY' : 'OPTHAMOLOGIST', 
'OTOLARYNGOLOGY' : 'OTOLARYNGOLOGIST', 
'OTOLARYNGOLOGY(ENT)' : 'OTOLARYNGOLOGIST', 
'PATHOLOGY' : 'PATHOLOGIST', 
'PATHOLOGYSERVICES' : 'PATHOLOGIST', 
'PATHOLOGY,ANATOMIC' : 'PATHOLOGIST', 
'CYTOPATHOLOGY' : 'PATHOLOGIST', 
'PATHOLOGY,ANATOMICAL&CLINICAL' : 'PATHOLOGIST', 
'PATHOLOGY,BLOOD BANKING/TRANSFUSIONMED' : 'PATHOLOGIST', 
'PATHOLOGY,CLINICAL' : 'PATHOLOGIST', 
'PATHOLOGY,CYTOPATHOLOGY' : 'PATHOLOGIST', 
'PATHOLOGY,DERMATOPATHOLOGY' : 'PATHOLOGIST', 
'PATHOLOGY,HEMATOLOGY' : 'PATHOLOGIST', 
'PATHOLOGY,IMMUNOPATHOLOGY' : 'PATHOLOGIST', 
'PATHOLOGY,NEUROPATHOLOGY' : 'PATHOLOGIST', 
'DERMATOLOGY-DERMATOPATHOLOGY' : 'PATHOLOGIST', 
'DERMATOPATHOLOGY' : 'PATHOLOGIST', 
'PEDIATRICMEDICINE' : 'PEDIATRICIAN', 
'PEDIATRSELFREFCAP' : 'PEDIATRICIAN', 
'PEDIATRICSPECIALTYIALIST' : 'PEDIATRICIAN', 
'PEDIATRICS' : 'PEDIATRICIAN', 
'PEDIATRICSSPECIALTYIALIST' : 'PEDIATRICIAN', 
'PLASTICANDRECONSTRUCTIVESURGERY' : 'PLASTIC SURGEON', 
'PLASTICSURGERY' : 'PLASTIC SURGEON', 
'PLASTICSURGERYWITHINTHEHEAD&NECK' : 'PLASTIC SURGEON', 
'PSYCHIATRY' : 'PSYCHIATRIST'} 

To get the value for each key, I wrote this function:

def key_in_dic(p): 
    return next((options[x] for x in p if x in options), 'Other') 
ovations['specialty_adj'] = key_in_dic(list(ovations['specialty'])) 

It does not work as expected. What could be the problem?

Here is what I get: non-matching keys should return 'Other', but the result is ALLERGIST instead.

Thanks.

+1

Maybe add how it currently behaves, and highlight what should not match? – Dilettant

+1

Updated, please check. – subro

+0

Why not use 'options.get(x, 'Other')' to specify a default for specialties that are not in the dictionary? – Barmar

Answers

1

As Barmar already pointed out, you can use the dictionary's get method. I think the following should give you what you want:

ovations["specialty_adj"] = ovations["specialty"].apply(lambda x: options.get(x, "Other")) 
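
For reference, here is a minimal sketch of why the original function fills the whole column with a single value, and how the per-row lookup differs. The three-row frame is made up for illustration and `options` is trimmed to two entries; neither comes from the real Ovations.txt data.

```python
import pandas as pd

# Hypothetical sample standing in for the real data.
ovations = pd.DataFrame({"specialty": ["ALLERGY", "UROLOGIST", "PSYCHIATRY"]})
options = {"ALLERGY": "ALLERGIST", "PSYCHIATRY": "PSYCHIATRIST"}

def key_in_dic(p):
    # next() stops at the first element found in options,
    # so the function returns one string, not one value per row.
    return next((options[x] for x in p if x in options), "Other")

single = key_in_dic(list(ovations["specialty"]))

# Assigning a scalar to a column broadcasts it to every row.
ovations["broadcast"] = single

# Per-row lookup: each specialty is looked up on its own.
ovations["specialty_adj"] = ovations["specialty"].apply(
    lambda x: options.get(x, "Other")
)
print(ovations)
```

Since the lookup is a plain dictionary mapping, `ovations["specialty"].map(options).fillna("Other")` should give the same column, if you prefer to avoid the lambda.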
+0

Thanks, it works as expected. – subro

+0

One more request: I want to return the 'specialty' itself when there is no match. Can you suggest how to do that? – subro

+0

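Glad to help. Note that the spelling has to match exactly for the string comparison: 'options' contains 'FAMILYPRACTICE/CLINIC', but in the dataframe it is written 'FAMILY PRACTICE/CLINIC', with a space. Depending on what you want to achieve, you could try 'options.get(x.replace(" ", ""), "Other")' in the lambda expression. Also, you should accept this answer as the correct one. – ValD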
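
The follow-up request (keep the original specialty when nothing matches) and the spacing mismatch can be handled in one lookup. This is only a sketch: `normalize_specialty` is a hypothetical helper name, the sample rows are invented, and it assumes the keys in `options` contain no spaces. At least one key in the question's dictionary ('PATHOLOGY,BLOOD BANKING/TRANSFUSIONMED') does contain a space, so that key would need adjusting. It also assumes every value is a string; missing values would need separate handling.

```python
import pandas as pd

# Trimmed-down dictionary and invented sample rows, for illustration only.
options = {
    "FAMILYPRACTICE/CLINIC": "FAMILY PRACTICE",
    "OB/GYN": "OBSTETRICS AND GYNECOLOGY",
}
ovations = pd.DataFrame(
    {"specialty": ["FAMILY PRACTICE/CLINIC", "OB/GYN", "UROLOGIST"]}
)

def normalize_specialty(value):
    # Hypothetical helper: remove spaces before the lookup,
    # and fall back to the original value when there is no match.
    return options.get(value.replace(" ", ""), value)

ovations["specialty_adj"] = ovations["specialty"].apply(normalize_specialty)
print(ovations)
```

Passing `value` as the default to `get` is what returns the specialty unchanged when it is not in the dictionary.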

1

Use the dict.get() method to specify a default value for when the key is not found.

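def key_in_dict(p): 
    # dict.get() takes the default positionally; it accepts no keyword arguments.
    # This returns a generator; wrap the call in list() to get the values.
    return (options.get(x, 'Other') for x in p) 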
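
A generator expression is lazy: nothing is looked up until something iterates over it, and printing it only shows the generator object. A minimal sketch of returning the mapped values directly with a list comprehension instead, using a made-up input list and a trimmed `options`:

```python
# Trimmed-down dictionary and invented input, for illustration only.
options = {"ALLERGY": "ALLERGIST", "PSYCHIATRY": "PSYCHIATRIST"}

def key_in_dict(p):
    # List comprehension: one mapped value per input element,
    # with 'Other' for anything missing from options.
    return [options.get(x, "Other") for x in p]

specialties = ["ALLERGY", "UROLOGIST", "PSYCHIATRY"]
result = key_in_dict(specialties)
print(result)
```

The returned list has the same length as the input, so it can be assigned to a dataframe column of that length.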
+1

It gives me '<generator object key_in_dic. – subro