2015-11-05 58 views

I am trying to read HTML from a website using urllib in Python 3.4 and am running into a problem. What user_agent string should I use with Python urllib?

I am trying to download a page with the conjugation of the Italian verb "essere". I have access to two sources: wordreference.com and verbix.com.

With this code, I can successfully get the HTML from wordreference.com:

import urllib.parse
import urllib.request

url = 'http://www.wordreference.com//conj//ItVerbs.aspx?v=essere'
user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'
values = {'name': 'John',
          'location': 'USA',
          'language': 'Python'}
headers = {'User-Agent': user_agent}

# Encode the form values as bytes; passing data makes this a POST request
data = urllib.parse.urlencode(values)
data = data.encode('utf-8')
req = urllib.request.Request(url, data, headers)
with urllib.request.urlopen(req) as response:
    verbHTMLStr = response.read()  # raw bytes of the page
    print(verbHTMLStr)

If I change the URL to access the Verbix.com site instead:

url = 'http://www.verbix.com//webverbix//Italian//essere.html' 

the HTML that comes back is the page for www.verbix.com/languages.

When copied into a browser's address bar, both URL strings return the expected page.

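It seems to me that the Verbix site wants to see something else as the user_agent, but I cannot figure out what it wants. I have tried many different user_agent strings, and they all return the same wrong page.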
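One thing that may be worth ruling out besides the user_agent string: the code above passes `data` to `Request`, and urllib sends any request that carries a body as a POST rather than a GET. Whether that is what makes Verbix return the /languages page is only a guess; nothing in this thread confirms it. The sketch below uses a hypothetical helper, `build_request`, to show the difference without touching the network:

```python
import urllib.parse
import urllib.request

# URL and User-Agent string copied from the question.
URL = 'http://www.verbix.com//webverbix//Italian//essere.html'
USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'


def build_request(url, form_values=None, user_agent=USER_AGENT):
    """Hypothetical helper (not from the original post): build a Request."""
    headers = {'User-Agent': user_agent}
    data = None
    if form_values is not None:
        # Supplying a body is what switches urllib from GET to POST.
        data = urllib.parse.urlencode(form_values).encode('utf-8')
    return urllib.request.Request(url, data=data, headers=headers)


post_req = build_request(URL, {'name': 'John'})  # same shape as the question's code
get_req = build_request(URL)                     # no body, like a browser page load

print(post_req.get_method())  # POST
print(get_req.get_method())   # GET
```

Passing `get_req` to `urllib.request.urlopen` would then issue a plain GET with the same User-Agent header, which is closer to what a browser's address bar does.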

Answer


The following works for me!

# Python 2 code: urllib.urlopen is not available in Python 3
import urllib

res = urllib.urlopen('http://www.verbix.com//webverbix//Italian//essere.html').read()
print res

It prints:

<!doctype html> 
<html lang="en"> 
<!-- #BeginTemplate "/Templates/verbtable_pure.dwt" --> 
<!-- DW6 --> 
<head> 
<title> 
Italian 
verb 
essere 
conjugated in all tenses.</title> 
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 
<meta name="keywords" content="Language,verb,Italian,essere,conjugation,conjugate"> 
<meta name="description" content="Italian verb essere conjugated in all tenses."> 
<meta name="author" content="Verbix"> 
<meta name="google" value="notranslate"> 
<link rel="stylesheet" href="/system/pure/pure-min.css"> 
<!--[if lte IE 8]> 
     <link rel="stylesheet" href="/combo/1.18.13?/css/layouts/side-menu-old-ie.css"> 
    <![endif]--> 
<!--[if gt IE 8]><!--> 
<link rel="stylesheet" href="/system/misc-pure/side-menu-verb.css"> 
<!--<![endif]--> 
<!--[if lt IE 9]> 
     <script src="http://cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7/html5shiv.js"></script> 
    <![endif]--> 
<!--[if lte IE 8]> 
    <link rel="stylesheet" href="/system/pure/grids-responsive-old-ie-min.css"> 
<![endif]--> 
<!--[if gt IE 8]><!--> 
<link rel="stylesheet" href="/system/pure/grids-responsive-min.css"> 
<!--<![endif]--> 
<meta name="viewport" content="width=device-width, initial-scale=1"> 
<script type="text/javascript"> 

    var _gaq = _gaq || []; 
    _gaq.push(['_setAccount', 'UA-61929-7']); 
    _gaq.push(['_trackPageview']); 

    (function() { 
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; 
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; 
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); 
    })(); 

</script> 
<!-- Begin Cookie Consent plugin by Silktide - http://silktide.com/cookieconsent --> 
<script type="text/javascript"> 
    window.cookieconsent_options = {"message":"We use cookies to personalize content and ads to users, providing features for social media and analyze our traffic. We will forward information about your use of our website to social media and advertising and research companies that we work with.","dismiss":"Got it!","learnMore":"More info","link":"http://www.verbix.com/webverbix/termsofuse.html","theme":"dark-top"}; 
</script> 
<script type="text/javascript" src="//s3.amazonaws.com/cc.silktide.com/cookieconsent.latest.min.js"></script> 
<!-- End Cookie Consent plugin --> 
<!-- #BeginEditable "Head" --><!-- #EndEditable --> 

</head> 


<body> 


<div id="layout"> 
    <!-- Menu toggle --> 
    <a href="#menu" id="menuLink" class="menu-link"> 
    <!-- Hamburger icon --> 
    <span></span> </a> 
    <div id="menu"> <a href="/"><img src="/system/html5/top_left.png"/> </a> 
    <div class="pure-menu"> <a class="pure-menu-heading" href="/languages">Online</a> 
     <ul class="pure-menu-list"> 
     <li class="pure-menu-item"><a href="/languages" class="pure-menu-link">Verb Conjugator</a></li> 
     <li class="pure-menu-item"><a href="/translate/" class="pure-menu-link">Verb Translation</a></li> 
     <li class="pure-menu-item"><a href="/find-verb/" class="pure-menu-link">Find Verb</a></li> 
     <li class="pure-menu-item"><a href="/games/" class="pure-menu-link">Games</a></li> 
     <li class="pure-menu-item"><a href="/maps/" class="pure-menu-link">Language Maps</a></li> 
     </ul> 
     <a class="pure-menu-heading" href="/windowsverbix/">Windows</a> 
     <ul class="pure-menu-list"> 
     <li class="pure-menu-item"><a href="/windowsverbix/" class="pure-menu-link">Verbix for Windows</a></li> 
     <li class="pure-menu-item"><a href="/download/" class="pure-menu-link">Download</a></li> 
     <li class="pure-menu-item"><a href="/store/" class="pure-menu-link">Store</a></li> 
     </ul> 
     <a class="pure-menu-heading" href="/wizard/">For Webmasters</a> 
     <ul class="pure-menu-list"> 
     <li class="pure-menu-item"><a href="/wizard/" class="pure-menu-link">Your Own Conjugator</a></li> 
     </ul> 
     <a class="pure-menu-heading" href="/webverbix/termsofuse.html">About ...</a> </div> 
    </div> 
    <div id="main"> 
    <div class="header"> 
     <h1> 
     Italian 
     : 
     essere 
     </h1> 
     <h2> 
     Italian 
     verb ' 
     essere 
     ' conjugated in all tenses</h2> 
    </div> 
    <div class="advertising"> 
     <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script> 
     <!-- MainLeftReactive --> 
     <ins class="adsbygoogle" 
    style="display:block" 
    data-ad-client="ca-pub-3716807887832772" 
    data-ad-slot="9886612560" 
    data-ad-format="auto"></ins> 
     <script> 
(adsbygoogle = window.adsbygoogle || []).push({}); 
</script> 
    </div> 
    <div class="verbcontent"> 
     <p><a href="http://www.verbix.com/languages/italian.shtml" rel="prev">Conjugate another 
     Italian 
     verb</a> 
     <!-- AddThis Button BEGIN --> 
     <script type="text/javascript">var addthis_pub="verbix";</script> 
     <script type="text/javascript">var addthis_config = {services_exclude: 'print',data_ga_property: 'UA-61929-7', data_track_clickback: true}</script> 
     <a href="http://www.addthis.com/bookmark.php?v=20" onMouseOver="return addthis_open(this, '', '[URL]', '[TITLE]')" onMouseOut="addthis_close()" onClick="return addthis_sendto()"><img src="http://s7.addthis.com/static/btn/lg-share-en.gif" width="125" height="16" alt="Bookmark and Share" style="border:0"/></a> 
     <script type="text/javascript" src="http://s7.addthis.com/js/200/addthis_widget.js"></script> 
     <!-- AddThis Button END --> 
     </p> 

     <div class="pure-g verbtable"> <!-- #BeginEditable "Full_width_text" --> 
     <div class="pure-u-1-1"> 
      <h2>Nominal Forms</h2> 
      <div class="pure-g"> 
      <div class="pure-u-1-3"> 
       <p><b>Infinito:<br>Participio presente:<br>Gerundio:<br>Participio passato:</b></p> 
      </div> 
      <div class="pure-u-1-3"> 
       <p><span class="normal">essere</span><br> 
<span class="normal">essente</span><br> 
<span class="normal">essendo</span><br> 
<span class="irregular">stato</span><br> 
</p> 
      </div> 
      <div class="pure-u-1-3"> 
       <p> 
       <span class="normal">avere stato</span><br> 


       <span class="normal">avendo stato</span><br> 

       <span class="normal">avente stato</span><br> 

       </p> 
      </div> 
      </div> 
     </div> 
     <div class="pure-u-1-1 pure-u-lg-1-2"> 
      <h2>Indicativo</h2> 
      <div class="pure-g"> 
      <div class="pure-u-1-2"> 
       <h3>Presente</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sono</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sei</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">&egrave;</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">siamo</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">siete</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">sono</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Passato prossimo</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">ho stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">hai stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">ha stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">abbiamo stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">avete stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">hanno stato</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Imperfetto</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">ero</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">eri</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">era</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">eravamo</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">eravate</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">erano</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Trapassato prossimo</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avevo stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avevi stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">aveva stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">avevamo stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">avevate stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">avevano stato</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Futuro</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sar&ograve;</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sarai</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">sar&agrave;</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">saremo</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">sarete</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">saranno</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Futuro anteriore</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avr&ograve; stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avrai stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">avr&agrave; stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">avremo stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">avrete stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">avranno stato</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Passato remoto</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">fui</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">fosti</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">fu</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">fummo</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">foste</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">furono</span><br> 

       </p> 
      </div> 
      <div class="pure-u-1-2"> 
       <h3>Trapassato remoto</h3> 
       <p> 
       <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">ebbi stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avesti stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">ebbe stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">avemmo stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">aveste stato</span><br> 
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">ebbero stato</span><br> 

       </p> 
      </div> 
      </div> 
     </div> 
     <div class="pure-u-1-1 pure-u-lg-1-2"> 
      <h2>Congiuntivo</h2> 
      <div class="pure-g"> 
      <div class="pure-u-1-2"> 
       <h3>Presente</h3> ........................... 

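This does, in fact, work when run with Python 2.7, but gives this error when run with 3.4: AttributeError: 'module' object has no attribute 'urlopen'. Clearly I need to learn how the two versions of the library differ. – johnz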


Oops - forgot to thank SIslam for the answer... Thanks! – johnz
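On the AttributeError from the comment above: in Python 3 the old `urllib.urlopen` lives at `urllib.request.urlopen`, and `read()` returns bytes rather than a str. A rough Python 3 counterpart of the answer's code might look like the sketch below. The helper name `fetch_html` is hypothetical, and the UTF-8 default is an assumption taken from the charset meta tag in the printed output.

```python
import urllib.request

# URL copied from the answer above.
VERBIX_URL = 'http://www.verbix.com//webverbix//Italian//essere.html'


def fetch_html(url, encoding='utf-8'):
    """Hypothetical helper: download a page and return it as text."""
    # Python 3 spelling of Python 2's urllib.urlopen(url).read()
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes in Python 3, not str
    # Decode explicitly; undecodable bytes are replaced instead of raising.
    return raw.decode(encoding, errors='replace')
```

Calling `print(fetch_html(VERBIX_URL))` is then the Python 3 equivalent of the two lines in the answer. Whether Verbix serves the same page to Python 3's default request headers is not something this thread confirms.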
