2013-07-07 180 views
-1

Getting the source code of URLs

I have the following code:

import urllib2 
from itertools import product 

with open('urllist.txt') as urllist: 
    urls=[line.strip() for line in urllist] 

for url in product(urls): 
    usock = urllib2.urlopen(url) 
    data = usock.read() 
    usock.close() 
    sourcecode=open('./sourcecode', 'w+') 
    sourcecode.write(data) 

When I run it, I get:

Traceback (most recent call last): 
    File "12.py", line 8, in <module> 
    usock = urllib2.urlopen(url) 
    File "/opt/python2.7.1/lib/python2.7/urllib2.py", line 126, in urlopen 
    return _opener.open(url, data, timeout) 
    File "/opt/python2.7.1/lib/python2.7/urllib2.py", line 383, in open 
    req.timeout = timeout 
AttributeError: 'tuple' object has no attribute 'timeout' 

Any idea how to fix this? Thanks a lot!

+4

So what are you trying to achieve with 'product'? –

+0

I want to fetch the source code for each URL in a list. – Tom

+0

What does 'url' look like? – matino

Answers

3

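Given a single iterable, itertools.product yields one-element tuples, not the items themselves, which is why urlopen ends up receiving a tuple: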

>>> from itertools import product 
>>> lis = ['a','b','c'] 
>>> for p in product(lis): 
...  print p 
...  
('a',) 
('b',) 
('c',) 
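
For contrast, here is a small sketch of what product is meant for: combining two or more iterables into their Cartesian product. The names letters, single, pairs and unwrapped are illustrative only and do not come from the question.

```python
from itertools import product

letters = ['a', 'b', 'c']

# One iterable: every result is a 1-tuple wrapping the original item.
single = list(product(letters))        # [('a',), ('b',), ('c',)]

# Two iterables: every combination of one item from each.
pairs = list(product(letters, [1, 2])) # [('a', 1), ('a', 2), ('b', 1), ...]

# If the 1-tuples are unavoidable, unpack them to get the items back.
unwrapped = [item for (item,) in product(letters)]
```

With only one list there is nothing to combine, so iterating over the list directly gives the same items without the tuple wrapper.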

Use a simple loop over the URLs instead:

for url in urls: 
    usock = urllib2.urlopen(url) 
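
Putting the pieces together, a possible complete version might look like the sketch below. It rests on a few assumptions that are not part of the question: the helper names read_urls, fetch and save_sources are made up for illustration, the import falls back to urllib.request so the same code also runs on Python 3, and the output file is opened once before the loop, because opening './sourcecode' with 'w+' inside the loop would overwrite it on every iteration.

```python
import contextlib

try:
    from urllib2 import urlopen         # Python 2, as used in the question
except ImportError:
    from urllib.request import urlopen  # assumed Python 3 equivalent


def read_urls(list_path):
    """Return the non-empty, stripped lines of the URL list file."""
    with open(list_path) as urllist:
        return [line.strip() for line in urllist if line.strip()]


def fetch(url):
    """Download url and return its body as bytes, closing the socket afterwards."""
    with contextlib.closing(urlopen(url)) as usock:
        return usock.read()


def save_sources(urls, out_path, fetcher=fetch):
    """Write the body of every URL to out_path, one after another."""
    # Open the file once, in binary mode, so later pages do not
    # overwrite the pages that were written before them.
    with open(out_path, 'wb') as sourcecode:
        for url in urls:  # each url is a plain string, not a tuple
            sourcecode.write(fetcher(url))
```

It could then be called as save_sources(read_urls('urllist.txt'), './sourcecode'). The fetcher parameter exists only so the download step can be swapped out, for example when trying the loop without network access.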
+0

Thanks! I've figured out another way to do it: just change 'for url in product(urls)' to 'for url in urls'. – Tom

+2

@Tom I already mentioned that in my answer. –