2015-04-01 58 views
1
import requests 
import xml.etree.ElementTree as ET 
import re 

gen_news_list = []
r_milligenel = requests.get('http://www.milliyet.com.tr/D/rss/rss/Rss_4.xml')
root_milligenel = ET.fromstring(r_milligenel.text)

for entry in root_milligenel:
    for channel in entry:
        for item in channel:
            title = re.search(".*title.*", item.tag)
            if title:
                gen_news_list.append(item.text)
            link = re.search(".*link.*", item.tag)
            if link:
                gen_news_list.append(item.text)
                r = requests.get(item.text)
                print(r.text)

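I have a list named gen_news_list, and I am trying to append titles, summaries, links, and so on to it. But when I try to request a link, I get this error (requests: no connection adapters were found, Python 3):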

Traceback (most recent call last): 
    File "/home/deniz/Masaüstü/Çalışmalar/Python/Bot/xmlcek.py", line 23, in <module> 
    r = requests.get(item.text) 
    File "/usr/lib/python3/dist-packages/requests/api.py", line 55, in get 
    return request('get', url, **kwargs) 
    File "/usr/lib/python3/dist-packages/requests/api.py", line 44, in request 
    return session.request(method=method, url=url, **kwargs) 
    File "/usr/lib/python3/dist-packages/requests/sessions.py", line 456, in request 
    resp = self.send(prep, **send_kwargs) 
    File "/usr/lib/python3/dist-packages/requests/sessions.py", line 553, in send 
    adapter = self.get_adapter(url=request.url) 
    File "/usr/lib/python3/dist-packages/requests/sessions.py", line 608, in get_adapter 
    raise InvalidSchema("No connection adapters were found for '%s'" % url) 
requests.exceptions.InvalidSchema: No connection adapters were found for ' 
http://www.milliyet.com.tr/tbmm-baskani-cicek-programlarini/siyaset/detay/2037301/default.htm 

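The first link is requested successfully, but the second one fails. Because of this error I cannot add the content to the list. Is this a problem with my loop? What is wrong with the code?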

+0

What is the content of 'item.text' just before the line 'r = requests.get(item.text)'? – halex 2015-04-01 09:10:52

+0

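Can you print the 'repr' of the URL that causes the error? I looked at other questions that produce the same error, and this one looks to me like it is caused by a URL that starts with a newline. – 2015-04-01 09:10:59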

+0

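item.text is the content of the XML tag, which in this part of the code is the link. It is the link I want to request ("http://www.milliyet.com.tr"). The first link works fine. – mehardxx 2015-04-01 09:18:34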

Answers

5

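If you add the line print(repr(item.text)) just before r = requests.get(item.text), you will see that, from the second link onward, item.text has a \n at the beginning and at the end, which is not allowed in a URL: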

'\nhttp://www.milliyet.com.tr/tbmm-baskani-cicek-programlarini/siyaset/detay/2037301/default.htm\n' 

I used repr because it shows newlines in the string as a literal \n in its output.
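To illustrate why the stray newline matters, here is a minimal sketch. The traceback shows the error being raised in get_adapter, where requests picks a connection adapter by checking which registered prefix (such as http:// or https://) the URL starts with. The function find_prefix and the PREFIXES tuple below are hypothetical, simplified stand-ins for that lookup, not the library's actual code.

```python
# Hypothetical, simplified model of a prefix-based adapter lookup.
PREFIXES = ("https://", "http://")


def find_prefix(url):
    """Return the first registered prefix the URL starts with, or None."""
    for prefix in PREFIXES:
        if url.lower().startswith(prefix):
            return prefix
    return None


# The value reported by repr(item.text) in this answer.
raw = "\nhttp://www.milliyet.com.tr/tbmm-baskani-cicek-programlarini/siyaset/detay/2037301/default.htm\n"

print(repr(raw))                 # repr makes the leading and trailing \n visible
print(find_prefix(raw))          # None: the string starts with "\n", not "http://"
print(find_prefix(raw.strip()))  # http://
```

With plain print(raw) the newlines would only show up as blank lines, which is easy to miss; that is also why the URL in the traceback appears on a line of its own.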

The solution to your problem is to call strip on item.text to remove these newlines:

r = requests.get(item.text.strip()) 
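
As a slightly fuller sketch of the same idea, the links can be stripped while the feed is parsed, so that every later use gets a clean URL. SAMPLE_RSS below is a made-up document that only mimics the shape of an RSS feed (a channel-level link without newlines and an item link wrapped in newlines), and extract_items is a hypothetical helper name; the real feed may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Made-up sample, assumed to resemble the feed: the item link is wrapped in newlines.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://www.milliyet.com.tr</link>
    <item>
      <title>First headline</title>
      <link>
http://www.milliyet.com.tr/tbmm-baskani-cicek-programlarini/siyaset/detay/2037301/default.htm
</link>
    </item>
  </channel>
</rss>"""


def extract_items(xml_text):
    """Return a list of (title, link) tuples with surrounding whitespace removed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        # findtext returns None when the child is missing, hence the "or ''".
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        items.append((title, link))
    return items


for title, link in extract_items(SAMPLE_RSS):
    print(title, repr(link))
```

Each cleaned link can then be passed to requests.get as in the line above.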
+0

It works! Thank you. But I still don't understand why. – mehardxx 2015-04-01 10:43:56
