How to gracefully handle a Python KeyError (Python csv library)

I have written a basic web scraper in Python using the lxml and json libraries. The code snippet below shows how I am currently writing to CSV:

import csv

# "ab" is the Python 2 idiom for csv; on Python 3 use open(filepath, "a", newline="").
with open(filepath, "ab") as f:
    write = csv.writer(f)
    try:
        # First attempt: write the row with every field present.
        write.writerow(["allhomes",
                        statenum,
                        statesubnum,
                        suburbnum,
                        listingnum,
                        listingsurlstr,
                        '',  # fill this in! should be 'description'
                        node["state"],
                        node["suburb"],
                        node["postcode"],
                        node["propertyType"],
                        node["bathrooms"],
                        node["bedrooms"],
                        node["parking"],
                        pricenode,
                        node["photoCount"],
                        node2["pricemin"],
                        node2["pricemax"],
                        node2["pricerange"]])
    except KeyError as e:
        try:
            # Second attempt: same row, with a blank in place of node["bathrooms"].
            write.writerow(["allhomes",
                            statenum,
                            statesubnum,
                            suburbnum,
                            listingnum,
                            listingsurlstr,
                            '',  # fill this in! should be 'description'
                            node["state"],
                            node["suburb"],
                            node["postcode"],
                            node["propertyType"],
                            '',
                            node["bedrooms"],
                            node["parking"],
                            pricenode,
                            node["photoCount"],
                            node2["pricemin"],
                            node2["pricemax"],
                            node2["pricerange"]])
        except KeyError as e:
            # Some other key is missing: record the error and give up on this row.
            errorcount += 1
            write.writerow(["Error: invalid dictionary field key: %s" % e.args,
                            statenum,
                            statesubnum,
                            suburbnum,
                            listingnum,
                            listingsurlstr])

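The problem is that if a node does not exist (most commonly the bathrooms node), I have to either retry with a blank value in place of the bathrooms node, or give up the whole row of data. My current approach is to try again and write the row without the bathrooms node, but this is cumbersome (and it does not fix KeyErrors raised by other nodes).

How can I write out a single node as blank when it is missing or contains no data, without sacrificing the whole entry?

Many thanks.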

Source

2016-06-27 doubleknavery

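Web scraping of any size almost always results in messy data, it seems. Is there a way you can avoid having to match hard-coded keys somewhere in your code? – Jeff

Haha, that's true. You're almost certainly right - I just haven't been able to find a good, repeatable way to do it – doubleknavery

Is `node` a dictionary? If so, you can use [get](https://docs.python.org/3.5/library/stdtypes.html#dict.get) – user3220892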
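To illustrate the `dict.get` suggestion, here is a minimal sketch, assuming `node` is a plain dict. The sample data and the `fields` list are made up for illustration and are not from the original scraper; `dict.get(key, default)` returns the default instead of raising KeyError when the key is missing.

```python
# Hypothetical sample data: this listing has no "bathrooms" key.
node = {"state": "ACT", "suburb": "Acton", "bedrooms": 3}

# Field names to write, in column order (illustrative subset).
fields = ["state", "suburb", "bathrooms", "bedrooms"]

# dict.get returns the default ('' here) instead of raising KeyError.
row = [node.get(name, "") for name in fields]
print(row)
```

Only the missing field is left blank, so the rest of the row is kept.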

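If you do have to use keys like this, one approach I have used for web scraping in the past is to create a wrapper that handles the error and then returns a value.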

def get_node(name, node):
    # Return node[name], or the placeholder 'na' if the key is missing.
    try:
        val = node[name]
    except KeyError:
        val = 'na'
    return val

write.writerow(['allhomes',
                get_node('bathrooms', node),
                ...
                ])
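As a sketch that goes slightly beyond the snippet above, the same wrapper can be applied to every field by looping over lists of key names. The sample dictionaries, the `node_fields` and `node2_fields` lists, the `default` parameter, and the in-memory buffer are all assumptions made for illustration (Python 3); in the real scraper the row would go to the CSV file.

```python
import csv
import io


def get_node(name, node, default="na"):
    """Return node[name], or default when the key is missing."""
    try:
        return node[name]
    except KeyError:
        return default


# Hypothetical sample data standing in for the scraped dictionaries.
node = {"state": "ACT", "suburb": "Acton", "bedrooms": 3}
node2 = {"pricemin": 400000}

# Key names in column order (illustrative subsets of the real columns).
node_fields = ["state", "suburb", "bathrooms", "bedrooms"]
node2_fields = ["pricemin", "pricemax"]

# Every lookup goes through the wrapper, so no single key can abort the row.
row = (["allhomes"]
       + [get_node(name, node) for name in node_fields]
       + [get_node(name, node2) for name in node2_fields])

buffer = io.StringIO()  # in-memory stand-in for the CSV file
csv.writer(buffer).writerow(row)
output = buffer.getvalue()
print(output)
```

Keeping the key names in lists means a new column is one more list entry, with no extra try/except block.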

Source

2016-06-27 03:21:46 Jeff

This works great. Thanks for your input, Jeff. – doubleknavery
