2012-10-05
0

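Add an array of linked document IDs to a CouchDB document with Python

I want to add a links property to each CouchDB document, based on the data in a CSV file. The value of the links property should be an array of dicts, each containing the CouchDB _id of the linked document and the linkType.

When I run the script I get an error on links (see the error message below). I don't know how to create the dict key links and add the link data if the key does not exist, or append to the links array if it does.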

An example of a document with links looks like this:

{
    _id: 'p_3',
    name: 'Smurfette',
    links: [
        {to_id: 'p_2', linkType: 'knows'},
        {to_id: 'o_56', linkType: 'follows'}
    ]
}

The Python script that processes the CSV file:

#!/usr/bin/python 
# coding: utf-8 

# Version 1 
# 
# csv fields: ID,fromType,fromID,toType,toID,LinkType,Directional 


import csv, sys, couchdb 


def csv2couchLinks(database, csvfile): 

    # CouchDB Database Connection etc 
    server = couchdb.Server() 
    #assumes that couchdb runs on http://localhost:5984 
    db = server[database] 
    #assumes that db is already created 

    # CSV file 
    data = csv.reader(open(csvfile, "rb")) # Read in the CSV file rb=read/binary 
    csv_links= csv.DictReader(open(csvfile, "rb")) 


    def makeLink(from_id, to_id, linkType):
        # get doc from db
        doc = db[from_id]

        # construct link object
        link = {'to_id':to_id, 'linkType':linkType}

        # add link reference to array at key 'links'
        if doc['links'] in doc:
            doc['links'].append(link)
        else:
            doc['links'] = [link]

        # update the record in the database
        db[doc.id] = doc


    # read each row in csv file
    for row in csv_links:

        # get entityTypes as lowercase and entityIDs
        fromType = row['fromType'].lower()
        fromID = row['fromID']
        toType = row['toType'].lower()
        toID = row['toID']

        linkType = row['LinkType']

        # concatenate 'entity type' and 'id' to make couch '_id'
        fromIDcouch = fromType[0]+'_'+fromID #eg 'p_2' <= person 2
        toIDcouch = toType[0]+'_'+toID

        makeLink(fromIDcouch, toIDcouch, linkType)
        makeLink(toIDcouch, fromIDcouch, linkType)


# Run csv2couchLinks() if this is not an imported module 
if __name__ == '__main__': 
    DATABASE = sys.argv[1] 
    CSVFILE = sys.argv[2] 
    csv2couchLinks(DATABASE,CSVFILE) 

Error message:

$ python LINKS_csv2couchdb_v1.py "qmhonour" "./tablesAsCsv/links.csv" 
Traceback (most recent call last): 
    File "LINKS_csv2couchdb_v1.py", line 65, in <module> 
    csv2couchLinks(DATABASE,CSVFILE) 
    File "LINKS_csv2couchdb_v1.py", line 57, in csv2couchLinks 
    makeLink(fromIDcouch, toIDcouch, linkType) 
    File "LINKS_csv2couchdb_v1.py", line 33, in makeLink 
    if doc['links'] in doc: 
KeyError: 'links' 

Answers

2

Another option is to condense the if block like this:

doc.setdefault('links', []).append(link) 

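The dict setdefault method checks whether links exists in the dictionary. If it does not, it creates the key and sets its value to an empty list (the default), then appends link to that list. If links already exists, it simply appends link to the existing list.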

def makeLink(from_id, to_id, linkType): 
    # get doc from db 
    doc = db[from_id] 

    # construct link object 
    link = {'to_id':to_id, 'linkType':linkType} 

    # add link reference to array at key 'links' 
    doc.setdefault('links', []).append(link) 

    # update the record in the database 
    db[doc.id] = doc 
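
To see the setdefault behaviour in isolation, here is a minimal sketch that uses a plain dict in place of a CouchDB document, so it runs without a server. The helper name add_link is hypothetical and not part of the original script; it assumes only that the document behaves like a dict, as the answer's code above does.

```python
# A plain dict stands in for a CouchDB document (assumption for illustration).
def add_link(doc, to_id, link_type):
    """Append a link to doc['links'], creating the list on first use."""
    link = {'to_id': to_id, 'linkType': link_type}
    # setdefault returns the existing list, or inserts [] and returns it
    doc.setdefault('links', []).append(link)
    return doc

doc = {'_id': 'p_3', 'name': 'Smurfette'}   # no 'links' key yet
add_link(doc, 'p_2', 'knows')               # creates the list, then appends
add_link(doc, 'o_56', 'follows')            # appends to the existing list
print(doc['links'])
```

After both calls, doc['links'] holds the two link dicts in insertion order, with no explicit if/else needed.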

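This is great! I also notice that you can do this for deeper nested structures where several levels may not exist yet. E.g. 'doc.setdefault('links', {}).setdefault(toType, []).append(link)' gives a structure like '{someKey: someValue, links: {someType: [link]}}' – johowie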


Yep, you got it :) It's a really useful feature and will definitely remove a few lines from your code. – RocketDonkey
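
The nested form mentioned in the comments above can be sketched the same way. This is an illustration on a plain dict; the helper name add_typed_link and the sample type names are assumptions, not taken from the original script.

```python
def add_typed_link(doc, to_type, link):
    """Group links by type: doc['links'][to_type] is a list of link dicts."""
    # First setdefault ensures doc['links'] is a dict,
    # second ensures doc['links'][to_type] is a list.
    doc.setdefault('links', {}).setdefault(to_type, []).append(link)
    return doc

doc = {'someKey': 'someValue'}
add_typed_link(doc, 'person', {'to_id': 'p_2', 'linkType': 'knows'})
add_typed_link(doc, 'person', {'to_id': 'p_7', 'linkType': 'knows'})
add_typed_link(doc, 'organisation', {'to_id': 'o_56', 'linkType': 'follows'})
print(doc)
```

Each setdefault call only creates its level when that level is missing, so repeated calls for the same type keep appending to one list.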

1

Replace:

if doc['links'] in doc: 

With:

if 'links' in doc: 
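
A short sketch of why the original test fails, again using a plain dict as a stand-in for the document (an assumption for illustration): the expression doc['links'] in doc has to look up doc['links'] first, and that lookup raises KeyError before the membership test ever runs. Testing the key name itself never touches the missing value.

```python
doc = {'_id': 'p_3', 'name': 'Smurfette'}   # no 'links' key
link = {'to_id': 'p_2', 'linkType': 'knows'}

# The original condition: the subscript is evaluated first and raises.
try:
    doc['links'] in doc
    raised = False
except KeyError:
    raised = True

# The corrected condition: a membership test on the key name.
has_links = 'links' in doc

if 'links' in doc:
    doc['links'].append(link)
else:
    doc['links'] = [link]

print(raised, has_links, doc['links'])
```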

This works for the code I posted. Thanks – johowie