2016-09-23 17 views

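I want to loop through all the rows in an XML list and write them to a CSV. I need each element value, if it exists, to be written pipe-delimited into the row, and otherwise a null value should be shown. I am able to create the header row, and I can write the first row of data by using variables (which is obviously not the right way, but I am very new to Python!). Any help would be appreciated! By the way, feel free to point out anything specific I could do more efficiently or more Pythonically. In short: how do I avoid declaring a variable inside a loop in Python when I have an except clause?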

import xml.etree.ElementTree as ET 
import sys 
import requests 
from requests_ntlm import HttpNtlmAuth 
import csv 

csv.register_dialect(
    'mydialect', 
    delimiter = '|', 
    quotechar = '"', 
    doublequote = True, 
    skipinitialspace = True, 
    lineterminator = '\n', 
    quoting = csv.QUOTE_MINIMAL) 

url="http://sharepoint/projects/urp/_vti_bin/owssvr.dll?Cmd=Display&List={8e2de4cf-79a0-4267-8b84-889a5b890b28}&XMLDATA=TRUE" 
#url="http://sharepoint/projects/urp/Lists/HITS%20Estimation%20LOE/AllItems.aspx" 
password = "#######" 
Username = "YYYY\\XXXXX" 
server_url="http://sharepoint/" 

r=requests.get(url, auth=HttpNtlmAuth(Username,password)) 
data=r.content 

tree = ET.fromstring(data) # load the string into a native XML structure 

namespaces = {'s': 'uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882','dt': 'uuid:C2F41010-65B3-11d1-A29F-00AA00C14882', 'rs': 'urn:schemas-microsoft-com:rowset', 'z': '#RowsetSchema'} 

header_results = tree.findall('./s:Schema/s:ElementType/s:AttributeType', namespaces) 
row_results = tree.findall('./rs:data/z:row', namespaces) 

with open('c:\output.csv','w') as f: 
    writer = csv.writer(f, dialect='mydialect') 

#This causes the column name to be pipe delimited across the top row of the csv 
    Header_Row="" 
    for header in header_results: 
     try: 
      Header_Row += header.attrib['name']+"|" 
     except KeyError: 
      Header_Row += "NULL|" 
    writer.writerow([Header_Row]) 

#This part needs help - I need each element value, if it exists, to be written, pipe delimited into the row, or else display a null value 
#Currently this only returns one row of data because I am declaring the variable in the loop... how do I accomplish this otherwise? 
    for result in row_results: 
     try: 
      urpid = result.attrib['ows_CnELookup_x003a_URPID'] 
     except KeyError: 
      urpid = "NULL" 
     try: 
      Attachments = result.attrib['ows_Attachments'] 
     except KeyError: 
      Attachments = "NULL" 
     try: 
      Title = result.attrib['ows_LinkTitle'] 
     except KeyError: 
      Title = "NULL" 
     try: 
      Area = result.attrib['ows_Area_x0020_Name'] 
     except KeyError: 
      Area = "NULL" 
     try: 
      Group = result.attrib['ows_Group'] 
     except KeyError: 
      Group = "NULL" 
     try: 
      HITS_Hours = result.attrib['ows_HITS_x0020_Hours'] 
     except KeyError: 
      HITS_Hours = "NULL" 
     try: 
      Consult_Hours = result.attrib['ows_Consultant_x0020_Hours'] 
     except KeyError: 
      Consult_Hours = "NULL" 
     try: 
      Complete = result.attrib['ows_C_x0026_E_x0020_Completed'] 
     except KeyError: 
      Complete = "NULL" 
     try: 
      Area_Order = result.attrib['ows_Area_x0020_Order'] 
     except KeyError: 
      Area_Order = "NULL" 
    SP_Row = urpid, Attachments, Title, Area, Group, HITS_Hours, Consult_Hours, Complete, Area_Order 
    writer.writerow(SP_Row) 

Answer


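Actually, if you indent the last two lines by one level, I think you will get what you want. The comment in your code mentions "declaring the variable in the loop", but Python variables are not declared. The only rule is that they must be defined before they are used, which is what you are doing.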
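To illustrate why the indentation matters, here is a minimal sketch with made-up values (the list and variable names are hypothetical, not from your script). A statement that sits outside the loop body runs once, after the loop has finished, so it only sees the value from the last iteration:

```python
values = ["a", "b", "c"]  # hypothetical stand-in for row_results

# Append is dedented: it runs once, after the loop ends.
written_outside = []
for value in values:
    row = value.upper()
written_outside.append(row)

# Append is inside the loop body: it runs once per item.
written_inside = []
for value in values:
    row = value.upper()
    written_inside.append(row)

print(written_outside)
print(written_inside)
```

The same applies to `writer.writerow(SP_Row)` in your script: at its current indentation it is called once, with whatever the variables held after the final row.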

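As for a more Pythonic way of doing things, try: ... except KeyError: blocks are not the usual approach here. If you need to get either a stored value or a default from a dictionary (say, one named d), use value = d.get(name, default) instead.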
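As a small sketch of that equivalence, assuming a plain dictionary that stands in for `result.attrib` (the keys and values below are made up for illustration):

```python
# Hypothetical row attributes; 'ows_Group' is deliberately missing.
attrib = {"ows_LinkTitle": "Estimate A", "ows_Attachments": "0"}

# Verbose form, as in the question.
try:
    group = attrib["ows_Group"]
except KeyError:
    group = "NULL"

# Same result in one expression.
group_via_get = attrib.get("ows_Group", "NULL")
title = attrib.get("ows_LinkTitle", "NULL")

print(group, group_via_get, title)
```

Since `Element.attrib` is an ordinary dictionary, the same `.get(key, default)` call works on it directly.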

Also, it looks to me like your header will end up with an extra | at the end. I would use this instead:

    Header_Row = [header.attrib.get('name', 'NULL') for header in header_results]
    writer.writerow(Header_Row)

In place of your loop over the result rows, I would use the following code:

    for result in row_results:
        SP_ROW = [result.attrib.get(key, 'NULL')
                  for key in ['ows_CnELookup_x003a_URPID', 'ows_Attachments',
                              'ows_LinkTitle', 'ows_Area_x0020_Name', 'ows_Group',
                              'ows_HITS_x0020_Hours', 'ows_Consultant_x0020_Hours',
                              'ows_C_x0026_E_x0020_Completed', 'ows_Area_x0020_Order']]
        writer.writerow(SP_ROW)

Your context manager will make sure the output file is closed, so that should be all you need.
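Putting the pieces together, here is a self-contained sketch you can run without the SharePoint server. The XML sample, the `rows_to_csv` helper, and the two-key list are all assumptions made up for illustration. Only the namespaces and the `./rs:data/z:row` path come from your script. It writes to an in-memory buffer rather than a file so the result is easy to inspect:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample standing in for the SharePoint response (assumption).
SAMPLE = """<root xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
  <rs:data>
    <z:row ows_LinkTitle="Estimate A" ows_Group="Finance" />
    <z:row ows_LinkTitle="Estimate B" />
  </rs:data>
</root>"""

namespaces = {"rs": "urn:schemas-microsoft-com:rowset", "z": "#RowsetSchema"}
keys = ["ows_LinkTitle", "ows_Group"]  # shortened list of attribute names


def rows_to_csv(xml_text, keys, out):
    """Write a header plus one pipe-delimited line per z:row element."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    writer.writerow(keys)
    for row in root.findall("./rs:data/z:row", namespaces):
        # Missing attributes fall back to the string 'NULL'.
        writer.writerow([row.attrib.get(key, "NULL") for key in keys])


buffer = io.StringIO()
rows_to_csv(SAMPLE, keys, buffer)
output = buffer.getvalue()
print(output)
```

When writing to a real file in Python 3, the csv module documentation recommends opening it with `newline=''`, for example `open(path, 'w', newline='')`.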


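Thank you! You were absolutely spot on! The only thing I had to change was the second closing square bracket, at the end after the '... ows_Area_x0020_Order' key. Thanks again! – user6868061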


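So one more question... if I wanted to use the result of the Header_Row variable as the keys to loop over instead of hard-coding them, how would I do that? I tried inserting the variable where the attribute names currently are, .............. [Header_Row] ..............., as the keys, and the error is TypeError: unhashable type: 'list' – user6868061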


Header_Row is already a list - have you tried `for key in Header_Row`? – cco
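A short sketch of what is probably going on, using hypothetical header names and row attributes. It also assumes the `name` values collected in Header_Row match the attribute names on each row, which is worth checking against the real data. Writing `[Header_Row]` wraps the list in another list, so the loop yields the whole list as a single key, and a list cannot be used as a dictionary key:

```python
Header_Row = ["ows_LinkTitle", "ows_Group"]  # hypothetical header names
attrib = {"ows_LinkTitle": "Estimate A"}     # hypothetical row attributes

# [Header_Row] is a list containing one item: the whole list.
# Using that list as a dictionary key raises TypeError.
try:
    [attrib.get(key, "NULL") for key in [Header_Row]]
    error = None
except TypeError as exc:
    error = str(exc)

# Iterating over the list itself yields one string key at a time.
SP_Row = [attrib.get(key, "NULL") for key in Header_Row]

print(error)
print(SP_Row)
```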
