2013-08-02

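Not appending to list

I am trying to create a list of dictionaries, where each dictionary's key is a job and its value is a list of the abilities associated with that job.

Example: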

[{'clerk': ['math ability','writing ability',...etc]},{'salesman':['charisma','writing ability','etc']}] 

Here is the data I am working with:

O*NET-SOC Code | Element ID | Element Name | Scale ID | Data Value | N | Standard Error | Lower CI Bound | Upper CI Bound | Recommend Suppress | Not Relevant | Date | Domain Source
11-1011.00 | 1.A.1.a.1 | Oral Comprehension | IM | 4.5 | 8 | 0.19 | 4.13 | 4.87 | N | n/a | Jun-06 | Analyst
11-1011.00 | 1.A.1.a.1 | Oral Comprehension | LV | 4.75 | 8 | 0.25 | 4.26 | 5.24 | N | N | Jun-06 | Analyst
11-1011.00 | 1.A.1.a.2 | Written Comprehension | IM | 4.38 | 8 | 0.18 | 4.02 | 4.73 | N | n/a | Jun-06 | Analyst

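Here is what I have done so far:

First, I created a list of dictionaries, one for each row of the data above, where the keys are the column names and the values are the column values. Example: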

OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.19'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.13'), ('Date', '06/2006'), ('Data Value', '4.50'), ('Upper CI Bound', '4.87'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.25'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'LV'), ('Not Relevant', 'N'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.26'), ('Date', '06/2006'), ('Data Value', '4.75'), ('Upper CI Bound', '5.24'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.18'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Written Comprehension'), ('Lower CI Bound', '4.02'), ('Date', '06/2006'), ('Data Value', '4.38'), ('Upper CI Bound', '4.73'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.32'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'LV'), 
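The question does not say how the rows were read. As a minimal Python 3 sketch, assuming the data is tab-delimited text with a header row, csv.DictReader produces one dictionary per row keyed by column name. The `sample` string and the `read_rows` helper are hypothetical and include only a subset of the columns.

```python
import csv
import io

# Hypothetical tab-delimited sample with a subset of the columns.
sample = (
    "O*NET-SOC Code\tElement ID\tElement Name\tScale ID\tData Value\n"
    "11-1011.00\t1.A.1.a.1\tOral Comprehension\tIM\t4.5\n"
    "11-1011.00\t1.A.1.a.1\tOral Comprehension\tLV\t4.75\n"
    "11-1011.00\t1.A.1.a.2\tWritten Comprehension\tIM\t4.38\n"
)

def read_rows(text):
    """Return one dict per data row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text), delimiter='\t')
    return [dict(row) for row in reader]

abilities_m_l = read_rows(sample)
print(abilities_m_l[0]['Element Name'])
```

All values come back as strings, which matches the quoted numbers in the example above.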

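Then I tried to merge these dictionaries into fewer dictionaries, where each key is a job code and each value is the list of skills associated with that job.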

def add_abilites(abilites_m_l):
    jobs_list = []
    for ind, dict in enumerate(abilites_m_l):
        activities_list = []
        if abilities_m_l[ind-1]['O*NET-SOC Code'] == abilities_m_l[ind]['O*NET-SOC Code']:
            if abilities_m_l[ind]['Element Name'] != abilities_m_l[ind-1]['Element Name']:
                activities_list.append(abilities_m_l[ind]['Element Name'])
            else: pass
        else: list.append({abilities_m_l[ind]['O*NET-SOC Code']:activities_list})
    return jobs_list
a_l_with_abilities = add_abilites(abilities_m_l)
print a_l_with_abilities

I get the following output:

[{'11-1011.00': []}, {'11-1021.00': []}, {'11-2011.00': []}, {'11-2021.00': []}, {'11-2022.00': []}, {'11-2031.00': []}, {'11-3011.00': []}, {'11-3021.00': []}, {'11-3031.01': []}, {'11-3031.02': []}, {'11-3051.00': []}, {'11-3051.01': []}, {'11-3051.02': []}, {'11-3051.04': []}, {'11-3061.00': []}, {'11-3071.01': []}, {'11-3071.02': []}, {'11-3071.03': []}, {'11-3111.00': []}, {'11-3121.00': []}, {'11-3131.00': []}, {'11-9013.01': []}, {'11-9013.03': []}, {'11-9021.00': []}, {'11-9031.00': []}, {'11-9032.00': []}, {'11-9033.00': []}, {'11-9041.00': []}, {'11-..... 

In other words, my lists are not being filled.

+4

Don't name your list 'list'. – wflynny

+1

To elaborate on Bill's point: doing so shadows the built-in list type in that namespace. It is not just a matter of style. –

+0

Likewise, it is not good form to name your dictionary 'dict'. – lmjohns3
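To make the shadowing concern concrete, here is a small Python 3 sketch. The `shadowing_demo` function is hypothetical and not part of the asker's code; it only shows that once the name list is rebound, the built-in type can no longer be called through that name in that scope.

```python
def shadowing_demo():
    list = [1, 2, 3]      # rebinds the name 'list' inside this function
    try:
        list('abc')       # now calls the list object, not the built-in type
    except TypeError as exc:
        return str(exc)

message = shadowing_demo()
print(message)
```

The same thing happens with dict when it is used as a loop variable.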

Answers

1

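The core problem is that you reassign activities_list to an empty list for every dictionary in abilities_m_l. So when you detect a changed 'O*NET-SOC Code' value, you append the empty list you have just reassigned.

Here is a cleaner way to do it: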
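The effect is easy to reproduce on a few made-up rows. The Python 3 sketch below uses a hypothetical `buggy_group` function and invented sample data, not the asker's real file. It resets the list inside the loop in the same way, so every list that gets appended is empty.

```python
# Hypothetical sample rows; only the two relevant columns are included.
rows = [
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Written Comprehension'},
    {'O*NET-SOC Code': '11-1021.00', 'Element Name': 'Oral Comprehension'},
]

def buggy_group(rows):
    jobs = []
    for ind, row in enumerate(rows):
        activities = []  # reset on every row: anything collected earlier is lost
        if rows[ind - 1]['O*NET-SOC Code'] == row['O*NET-SOC Code']:
            if rows[ind - 1]['Element Name'] != row['Element Name']:
                activities.append(row['Element Name'])
        else:
            # activities was emptied at the top of this iteration, so it is always []
            jobs.append({row['O*NET-SOC Code']: activities})
    return jobs

result = buggy_group(rows)
print(result)  # [{'11-1011.00': []}, {'11-1021.00': []}]
```

Note also that on the first row, ind - 1 is -1, so the comparison is made against the last row of the data.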

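from collections import OrderedDict

def add_abilities(abilities_m_l):
    # Map each job code to an OrderedDict that serves as an ordered set of activities.
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # Fetch the activities seen so far for this code, creating the entry on first sight.
        activities_so_far = jobs_dict.setdefault(o_code, OrderedDict())
        activities_so_far[activity] = True  # only the key matters; True is a placeholder
    # Python 2: keys() returns a list here.
    return [{o_code: activities.keys()} for o_code, activities in jobs_dict.iteritems()]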

Or, if you are on Python 3, where keys, values, and items return iterable views rather than lists:

return [{o_code: list(activities.keys())} for o_code, activities in jobs_dict.items()] 

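Or, if you do not need to preserve the order of the activities, use a set for them. A set is the natural fit here, but unfortunately Python has no built-in OrderedSet, which is why the code above approximates one with an OrderedDict whose values are all True.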

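from collections import OrderedDict

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # Pass set() (a new empty set), not the set type itself, as the default.
        activities_so_far = jobs_dict.setdefault(o_code, set())
        activities_so_far.add(activity)
    # The order of activities within each list is not guaranteed.
    return [{o_code: list(activities)} for o_code, activities in jobs_dict.iteritems()]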

The point is to let Python's dictionaries collect the information that shares a key, while keeping the activities for each code unique.
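The same idea can be sketched for Python 3.7 or later, where a plain dict preserves insertion order and can stand in for the OrderedDict. This is an illustrative variant rather than the answer's code; the `group_abilities` name and the sample rows are hypothetical.

```python
def group_abilities(rows):
    """Group ability names by job code, keeping first-seen order and dropping duplicates."""
    jobs = {}
    for row in rows:
        code = row['O*NET-SOC Code']
        # The inner dict acts as an ordered set: keys are abilities, values are unused.
        jobs.setdefault(code, {})[row['Element Name']] = None
    return [{code: list(abilities)} for code, abilities in jobs.items()]

# Hypothetical rows: the IM and LV lines repeat the same ability for one job.
rows = [
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Written Comprehension'},
    {'O*NET-SOC Code': '11-1021.00', 'Element Name': 'Oral Comprehension'},
]

grouped = group_abilities(rows)
print(grouped)
```

Because the grouping is keyed on the job code, it does not depend on rows for the same job being adjacent in the input.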

+0

Peter, very helpful! I tried passing the list of data_dicts to this function and got: File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/bls-397498141.629.py", line 229, in abilities_struct = add_abilities(abilities_m_l) File "/private/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup At Startup/bls-397498141.629.py", line 227, in add_abilities return [{o_code, activities.keys()} for o_code, activities in jobs_dict.iteritems()] TypeError: unhashable type: 'list' – goldisfine

+0

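I used the wrong syntax. It should be return [{o_code: activities.keys()} for o_code, activities in jobs_dict.iteritems()]. With a comma in place of the colon it looks like a set literal, and the list from activities.keys() cannot go in a set because it is an unhashable type. Also, I see my return statement was not indented far enough; I will edit to correct it. –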
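The difference between the two literals can be checked in isolation. In this Python 3 sketch the values are made up for illustration. With a colon the braces build a one-entry dict, while with a comma they build a set, which requires every element to be hashable.

```python
code = '11-1011.00'
abilities = ['Oral Comprehension', 'Written Comprehension']

as_dict = {code: abilities}     # colon: a dict, the list is only a value
error_message = None
try:
    as_set = {code, abilities}  # comma: a set literal, and a list cannot be an element
except TypeError as exc:
    error_message = str(exc)

print(as_dict)
print(error_message)
```

The exact wording of the TypeError message varies between Python versions, but it reports that the list is unhashable.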