2017-10-05 45 views

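Python: XML retrieved from a URL to CSV

I would like to write a Python script that dynamically reads XML data from a URL (e.g. http://www.wrh.noaa.gov/mesowest/getobextXml.php?sid=KCQT&num=72). The XML has the following format: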

<station id="KCQT" name="Los Angeles/USC Campus Downtown" elev="179" lat="34.02355" lon="-118.29122" provider="NWS/FAA"> 
<ob time="04 Oct 7:10 pm" utime="1507169400"> 
<variable var="T" description="Temp" unit="F" value="61"/> 
<variable var="TD" description="Dewp" unit="F" value="39"/> 
<variable var="RH" description="Relh" unit="%" value="45"/> 
</ob> 
<ob time="04 Oct 7:05 pm" utime="1507169100"> 
<variable var="T" description="Temp" unit="F" value="61"/> 
<variable var="TD" description="Dewp" unit="F" value="39"/> 
<variable var="RH" description="Relh" unit="%" value="45"/> 
</ob> 
<ob time="04 Oct 7:00 pm" utime="1507168800"> 
<variable var="T" description="Temp" unit="F" value="61"/> 
<variable var="TD" description="Dewp" unit="F" value="39"/> 
<variable var="RH" description="Relh" unit="%" value="45"/> 
</ob> 
<ob time="04 Oct 6:55 pm" utime="1507168500"> 
<variable var="T" description="Temp" unit="F" value="61"/> 
<variable var="TD" description="Dewp" unit="F" value="39"/> 
<variable var="RH" description="Relh" unit="%" value="45"/> 
</ob> 
</station> 

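I only want to retrieve the timestamp and the decimal temperature ("Temp") for every available observation (there are many more than the 4 shown here).

The output should be a CSV-formatted text file with one timestamp/temperature pair per line.

Below is my attempt at the code (it is terrible and does not work at all):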

import requests 

weatherXML = requests.get("http://www.wrh.noaa.gov/mesowest/getobextXml.php?sid=KCQT&num=72") 

import xml.etree.ElementTree as ET 
import csv 

tree = ET.parse(weatherXML) 
root = tree.getroot() 

# open file for writing 
Time_Temp = open('timestamp_temp.csv', 'w') 

#csv writer object 
csvwriter = csv.writer(Time_Temp) 
time_temp = [] 

count = 0 
for member in root.findall('ob'): 
    if count == 0: 
     temperature = member.find('T').var 
     time_temp.append(temperature) 
     csvwriter.writerow(time_temp) 
     count = count + 1 

    temperature = member.find('T').text 
    time_temp.append(temperature) 

Time_Temp.close() 

Please help.


I don't see the "year, month, day, minute, second and zone offset of the time" represented in the XML file. –


@BillBell Sorry, I have edited the requirement. The timestamp will now follow the format used in the XML file. Thanks. – WandaW


"Does not work"... what error did you get? It should have blown up just parsing the file. Use 'ET.fromstring(weatherXML.text)' instead. –
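
The comment above can be illustrated with a small sketch. `ET.parse` expects a filename or a file object, whereas `requests.get` returns a `Response` object, so the XML text has to be passed to `ET.fromstring` instead. The `SAMPLE` string below is a shortened copy of the XML from the question and only stands in for `weatherXML.text` (an assumption made so the sketch runs without network access):

```python
import io
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question; stands in for weatherXML.text.
SAMPLE = """<station id="KCQT">
<ob time="04 Oct 7:10 pm" utime="1507169400">
<variable var="T" description="Temp" unit="F" value="61"/>
</ob>
</station>"""

# ET.fromstring parses a string and returns the root element directly.
root = ET.fromstring(SAMPLE)

# ET.parse needs a filename or a file object, so a string must be wrapped first.
tree = ET.parse(io.StringIO(SAMPLE))
same_root = tree.getroot()

print(root.tag, same_root.tag)
```

Either route yields the same root element; `ET.fromstring` is simply the shorter one when the XML is already in memory.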

Answers

0

Assuming Python 3, this will work. I have noted the difference in case Python 2 is needed:

import xml.etree.ElementTree as ET 
import requests 
import csv 

weatherXML = requests.get("http://www.wrh.noaa.gov/mesowest/getobextXml.php?sid=KCQT&num=72") 
root = ET.fromstring(weatherXML.text) 

# Use this with Python 2 
# with open('timestamp_temp.csv','wb') as Time_Temp: 

with open('timestamp_temp.csv','w',newline='') as Time_Temp: 
    csvwriter = csv.writer(Time_Temp) 
    csvwriter.writerow(['Time','Temp']) 
    for member in root.iterfind('ob'):
        date = member.attrib['time']
        temp = member.find("variable[@var='T']").attrib['value']
        csvwriter.writerow([date,temp])

Output:

Time,Temp 
04 Oct 11:47 pm,65 
04 Oct 10:47 pm,66 
04 Oct 9:47 pm,68 
04 Oct 8:47 pm,68 
04 Oct 7:47 pm,68 
04 Oct 6:47 pm,70 
04 Oct 5:47 pm,74 
04 Oct 4:47 pm,75 
    . 
    . 
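
The `time` attribute carries no year or zone offset, a point raised in the comments on the question. Each `ob` element in the sample also has a `utime` attribute, which looks like a Unix epoch timestamp; that reading is an assumption based on the sample values, not something stated in the question. Under that assumption, a full timestamp could be derived roughly as follows. The helper name `ob_rows` and the shortened `SAMPLE` string are made up for illustration:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Shortened copy of the XML from the question.
SAMPLE = """<station id="KCQT">
<ob time="04 Oct 7:10 pm" utime="1507169400">
<variable var="T" description="Temp" unit="F" value="61"/>
</ob>
<ob time="04 Oct 7:05 pm" utime="1507169100">
<variable var="T" description="Temp" unit="F" value="61"/>
</ob>
</station>"""

def ob_rows(xml_text):
    """Return (ISO 8601 UTC timestamp, temperature) pairs, one per <ob>."""
    rows = []
    for ob in ET.fromstring(xml_text).iterfind('ob'):
        # Assumption: utime holds seconds since the Unix epoch.
        stamp = datetime.fromtimestamp(int(ob.attrib['utime']), tz=timezone.utc)
        temp = ob.find("variable[@var='T']").attrib['value']
        rows.append((stamp.isoformat(), temp))
    return rows

rows = ob_rows(SAMPLE)
for row in rows:
    print(row)
```

The resulting pairs can be passed to `csvwriter.writerow` in place of `[date,temp]` if an unambiguous timestamp is preferred over the display string.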
0

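You can iterate over the ob elements, read each ob element's time attribute, find its child variable element whose var is T, and take that element's value as the temperature. Then put both in a list and write it to the CSV file: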

import xml.etree.ElementTree as ET
import csv

# Parse a local copy of the XML saved from the URL
tree = ET.parse('getobextXml.php.xml')
root = tree.getroot()
# Open the file for writing (on Python 2, use open('timestamp_temp.csv', 'wb') instead)
with open('timestamp_temp.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(["Time", "Temp"])
    for ob in root.iter('ob'):
        timestamp = ob.get('time')  # the time attribute of the ob element
        temp = ob.find("./variable[@var='T']").get('value')  # value of the variable element whose var is T
        time_temp = [timestamp, temp]
        csvwriter.writerow(time_temp)

Afterwards, timestamp_temp.csv will contain the result:

Time,Temp 
04 Oct 8:47 pm,68 
04 Oct 7:47 pm,68 
04 Oct 6:47 pm,70 
04 Oct 5:47 pm,74 
04 Oct 4:47 pm,75 
04 Oct 3:47 pm,75 
04 Oct 2:47 pm,77 
04 Oct 1:47 pm,78 
04 Oct 12:47 pm,78 
04 Oct 11:47 am,76 
04 Oct 10:47 am,74 
04 Oct 9:47 am,72 
...
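
Both answers call `.find(...)` and immediately read from the result, which raises an `AttributeError` if an `ob` element has no `variable` with `var="T"`. Whether the real feed ever omits the temperature is not known here, so the following is only a defensive sketch. The function name `write_time_temp` is hypothetical, and the second `ob` in `SAMPLE` was altered on purpose to lack a temperature:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Based on the XML from the question; the second <ob> was deliberately
# altered to have no var="T" entry so the skip branch is exercised.
SAMPLE = """<station id="KCQT">
<ob time="04 Oct 7:10 pm" utime="1507169400">
<variable var="T" description="Temp" unit="F" value="61"/>
</ob>
<ob time="04 Oct 7:05 pm" utime="1507169100">
<variable var="RH" description="Relh" unit="%" value="45"/>
</ob>
</station>"""

def write_time_temp(xml_text, out):
    """Write Time,Temp rows to the file-like object `out`; return the row count."""
    writer = csv.writer(out)
    writer.writerow(['Time', 'Temp'])
    written = 0
    for ob in ET.fromstring(xml_text).iter('ob'):
        var = ob.find("./variable[@var='T']")
        if var is None:
            # No temperature in this observation: skip it instead of crashing.
            continue
        writer.writerow([ob.get('time'), var.get('value')])
        written += 1
    return written

buffer = io.StringIO()
count = write_time_temp(SAMPLE, buffer)
print(buffer.getvalue())
```

Because the function takes any file-like object, the same call works with a file opened via `open('timestamp_temp.csv', 'w', newline='')` on Python 3.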