2014-02-11 84 views
5

I am trying to geocode a few thousand cities using responses from Nominatim, parsing the XML from the web response.

import os 
import requests 
import xml.etree.ElementTree as ET 

txt = open('input.txt', 'r').readlines() 
for line in txt:
    lp, region, district, municipality, city = line.split('\t')
    baseUrl = 'http://nominatim.openstreetmap.org/search/gb/'+region+'/'+district+'/'+municipality+'/'+city+'/?format=xml'
    # eg. http://nominatim.openstreetmap.org/search/pl/podkarpackie/stalowowolski/Bojan%C3%B3w/Zapu%C5%9Bcie/?format=xml
    resp = requests.get(baseUrl)
    resp.encoding = 'UTF-8' # special diacritics
    msg = resp.text
    # parse response to get lat & long
    tree = ET.parse(msg)
    root = tree.getroot()
    print tree

But the result is:

Traceback (most recent call last): 
File "geo_miasta.py", line 17, in <module> 
    tree = ET.parse(msg) 
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse 
    tree.parse(source, parser) 
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 647, in parse 
    source = open(source, "rb")  
IOError: [Errno 2] No such file or directory: u'<?xml version="1.0" encoding="UTF-8" ?>\n<searchresults timestamp=\'Tue, 11 Feb 14 21:13:50 +0000\' attribution=\'Data \xa9 OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright\' querystring=\'\u015awierczyna, Drzewica, opoczy\u0144ski, \u0142\xf3dzkie, gb\' polygon=\'false\' more_url=\'http://nominatim.openstreetmap.org/search?format=xml&amp;exclude_place_ids=&amp;q=%C5%9Awierczyna%2C+Drzewica%2C+opoczy%C5%84ski%2C+%C5%82%C3%B3dzkie%2C+gb\'>\n</searchresults>' 

What is wrong?

Edit: Thanks to @rob, my solution is:

#! /usr/bin/env python2.7 
# -*- coding: utf-8 -*- 

import os 
import requests 
import xml.etree.ElementTree as ET 

txt = open('input.txt', 'r').read().split('\n') 

for line in txt: 
    lp, region, district, municipality, city = line.split('\t') 
    baseUrl = 'http://nominatim.openstreetmap.org/search/pl/'+region+'/'+district+'/'+municipality+'/'+city+'/?format=xml' 
    resp = requests.get(baseUrl) 
    msg = resp.content 
    tree = ET.fromstring(msg) 
    for place in tree.findall('place'):
        location = '{:5f}\t{:5f}'.format(
            float(place.get('lat')),
            float(place.get('lon')))

    f = open('result.txt', 'a') 
    f.write(location+'\t'+region+'\t'+district+'\t'+municipality+'\t'+city) 
    f.close() 
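
The loop above leaves a few loose ends: a blank line produced by `split('\n')` would break the five-way unpacking, place names with diacritics end up in the URL unescaped, and the rows written to `result.txt` have no trailing newline. A minimal Python 3 sketch of two helpers that address the first two points, assuming the same tab-separated `input.txt` layout; `parse_line` and `build_url` are hypothetical names, not part of the original post:

```python
from urllib.parse import quote

# Same endpoint prefix as in the solution above.
BASE = 'http://nominatim.openstreetmap.org/search/pl/'


def parse_line(line):
    """Split one tab-separated input line into five fields; None for a blank line."""
    line = line.rstrip('\r\n')
    if not line.strip():
        return None
    fields = line.split('\t')
    if len(fields) != 5:
        raise ValueError('expected 5 tab-separated fields, got %d' % len(fields))
    return fields


def build_url(region, district, municipality, city):
    """Percent-encode each path segment and assemble the search URL."""
    parts = [quote(part.strip(), safe='')
             for part in (region, district, municipality, city)]
    return BASE + '/'.join(parts) + '/?format=xml'
```

When writing results, appending `'\n'` to each row (and opening the file once with `with open(...)`) keeps one result per line.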

Answers

6

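You are using xml.etree.ElementTree.parse(), which takes a filename or a file object as its argument. However, you are not passing a file or file object; you are passing a unicode string.

Try xml.etree.ElementTree.fromstring(text) instead.

Like this: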

tree = ET.fromstring(msg) 
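
To make the difference concrete, here is a small Python 3 sketch that parses the same data both ways. The XML snippet is a made-up stand-in for a Nominatim response, not real output. Note that `fromstring()` returns the root `Element` directly, whereas `parse()` returns an `ElementTree` that needs `getroot()`:

```python
import io
import xml.etree.ElementTree as ET

# Made-up stand-in for a response body (an assumption for illustration).
xml_text = ("<searchresults>"
            "<place lat='50.5' lon='22.1' display_name='Example'/>"
            "</searchresults>")

# fromstring() parses XML already held in memory and returns the root Element.
root = ET.fromstring(xml_text)

# parse() expects a filename or file object, so in-memory data must be wrapped.
tree = ET.parse(io.BytesIO(xml_text.encode('utf-8')))
same_root = tree.getroot()

print(root.tag, same_root.find('place').get('lat'))
```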

Here is a complete example program:

import os 
import requests 
import xml.etree.ElementTree as ET 

baseUrl = 'http://nominatim.openstreetmap.org/search/pl/podkarpackie/stalowowolski/Bojan%C3%B3w/Zapu%C5%9Bcie/?format=xml'
resp = requests.get(baseUrl) 
msg = resp.content 
tree = ET.fromstring(msg) 
for place in tree.findall('place'): 
    print u'{:s}: {:+.2f}, {:+.2f}'.format(
    place.get('display_name'), 
    float(place.get('lon')), 
    float(place.get('lat'))).encode('utf-8') 
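
The parsing step of the program above can also be tried without network access. The following Python 3 sketch wraps it in a function; `parse_places` is a hypothetical helper name, and `sample` is invented data standing in for `resp.content`:

```python
import xml.etree.ElementTree as ET


def parse_places(xml_bytes):
    """Return (display_name, lon, lat) for every <place> element in a response body."""
    root = ET.fromstring(xml_bytes)
    return [
        (place.get('display_name'),
         float(place.get('lon')),
         float(place.get('lat')))
        for place in root.findall('place')
    ]


# Invented sample body; a real one would come from resp.content.
sample = (b'<?xml version="1.0" encoding="UTF-8" ?>'
          b'<searchresults>'
          b'<place display_name="Example" lat="50.65" lon="22.05"/>'
          b'</searchresults>')

for name, lon, lat in parse_places(sample):
    print('{:s}: {:+.2f}, {:+.2f}'.format(name, lon, lat))
```

A response with no `place` children, like the one in the question's traceback, simply yields an empty list.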
+0

Thanks, this moved the error on to the encoding: 'UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 115: ordinal not in range(128)' – m93

+0

@m93 - That is because you are using 'resp.text' instead of 'resp.content'. See my edit for a complete program that should get you started. –

+0

You're right. This example works. Thanks. – m93
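
As background to this exchange: in Python 2.7, handing a decoded `unicode` string that contains non-ASCII characters to the parser can trigger an implicit ASCII encode, which matches the `u'\xa9'` error quoted above. Raw bytes avoid this because the parser reads the encoding from the XML declaration. A short Python 3 sketch of the bytes-first approach, using an invented payload in place of `resp.content`:

```python
import xml.etree.ElementTree as ET

# Invented payload with non-ASCII characters, encoded as it would arrive on the wire.
payload = (
    '<?xml version="1.0" encoding="UTF-8" ?>\n'
    "<searchresults attribution='Data \u00a9 OpenStreetMap contributors'>"
    "<place display_name='Zapu\u015bcie'/>"
    "</searchresults>"
).encode('utf-8')

# The parser decodes the bytes itself, guided by the encoding declaration.
root = ET.fromstring(payload)
name = root.find('place').get('display_name')

# Encode explicitly only at the output boundary (file, socket, terminal).
encoded = name.encode('utf-8')
```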

0
import os,sys,time 
import xml.etree.ElementTree as ET 
from xml.etree.ElementTree import parse 
tree = ET.parse('D:\Reddy\BankLoanAcctService_transactionInq.xml') 
root=tree.getroot() 

for TrxnEffDt in root.iter('TrxnEffDt'): 
    new_TrxnEffDt= str(time.strftime("%y-%m-%d"))
    TrxnEffDt=str(new_TrxnEffDt)

filename2 ="D:\Reddy\BankLoanAcctService_transactionInq2.txt" 
r=open(filename2,'w') 
sys.stdout =r 
+0

Traceback (most recent call last): File "D:\Reddy\Python\new.py", line 4, in <module> tree = ET.parse('D:\Reddy\BankLoanAcctService_transactionInq.xml') File "C:\Python33\lib\xml\etree\ElementTree.py", line 1242, in parse tree.parse(source, parser) File "C:\Python33\lib\xml\etree\ElementTree.py", line 1730, in parse self._root = parser._parse(source) File "<string>", line None xml.etree.ElementTree.ParseError: syntax error: line 1, column 0 – user5493252

+0

This is the error message I am getting. Please help me. – user5493252

+0

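You should ask your own question rather than posting in an answer on this topic (to avoid confusion and to make sure you get a solution to your specific problem), and, if you think the questions are related, you should include a link to this one to help answerers. – Tiesselune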
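
For readers who hit the same `ParseError`: the reported position (line 1, column 0) means the parser stopped at the very first character, which typically happens when the input does not begin with XML markup. A small Python 3 sketch for inspecting where parsing failed; `try_parse` is a hypothetical helper, and the inputs are invented:

```python
import xml.etree.ElementTree as ET


def try_parse(data):
    """Return the root tag on success, or the (line, column) position on failure."""
    try:
        return ET.fromstring(data).tag
    except ET.ParseError as err:
        # ParseError carries the location where the parser gave up.
        return err.position


print(try_parse('<a/>'))
print(try_parse('this is not XML'))
```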