2017-04-26 95 views

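Scraping URLs into a Python list

I'm trying to store URLs in a list so I can use them again later. However, when I change the list index from 0 to 1, the interpreter raises "IndexError: list index out of range".

Why are the URLs only stored at index 0?

Thanks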

from urllib.parse import urlencode, urlparse, parse_qs

from lxml.html import fromstring
from requests import get

raw = get("https://www.google.com/search?q=StackOverflow").text
page = fromstring(raw)

for result in page.cssselect(".r a"):
    url = result.get("href")
    if url.startswith("/url?"):
        url = parse_qs(urlparse(url).query)['q']
        print(url[1])

This means there is only one item in the list 'url' – EvanL00
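To see why index 1 fails, here is a small sketch that does not touch the network. The href value is a made-up example shaped like the "/url?" links in the question, not real scraped output. parse_qs returns a dict that maps each parameter name to a list of its values, so a parameter that appears once yields a one-item list:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical href, shaped like the "/url?" links in the question
href = "/url?q=https://stackoverflow.com/&sa=U"

# parse_qs maps each query parameter to a LIST of values
params = parse_qs(urlparse(href).query)
q_values = params["q"]

print(params)         # every value in the dict is a list
print(len(q_values))  # "q" appears once in the query string
print(q_values[0])    # index 0 is the only valid index here
```

So each pass through the loop produces a fresh one-item list, and q_values[1] would raise IndexError.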

Answers


You need to create a list and collect every URL you find in it:

from urllib.parse import urlparse, parse_qs
from lxml.html import fromstring
from requests import get

raw = get("https://www.google.com/search?q=StackOverflow").text
page = fromstring(raw)

# an empty list to hold all the urls
urls = []

for result in page.cssselect(".r a"):
    url = result.get("href")
    if url.startswith("/url?"):
        # parse_qs returns a list of values for the 'q' parameter
        url = parse_qs(urlparse(url).query)['q']
        # add the value(s) to the list of urls
        urls.extend(url)

print(urls)
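The same collecting logic can be tried without a network request. This is only a sketch: collect_urls is a hypothetical helper name, and the sample hrefs are invented for illustration. It also skips anchors whose href is missing, since result.get("href") returns None in that case:

```python
from urllib.parse import urlparse, parse_qs


def collect_urls(hrefs):
    """Return the 'q' targets of every '/url?' href, in order."""
    urls = []
    for href in hrefs:
        # skip missing hrefs (None) and links that are not redirects
        if href and href.startswith("/url?"):
            # .get avoids a KeyError when the link has no 'q' parameter
            urls.extend(parse_qs(urlparse(href).query).get("q", []))
    return urls


# Invented sample data standing in for the scraped hrefs
sample = [
    "/url?q=https://a.example/&sa=U",
    "/search?q=x",
    None,
    "/url?q=https://b.example/",
]

urls = collect_urls(sample)
print(urls[1])  # index 1 now exists because urls holds every match
```

Once all matches are in one list, indexing past 0 works as long as at least that many links were found.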

Python is zero-indexed, so the first (and here the only) item is at index 0. You need to do

print(url[0])

or, better yet,

print(url)