Scraping URLs into an array with Python

2017-04-26 118 views 1 likes

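I'm trying to put URLs into an array so that I can reuse them later. However, when I change the array index from 0 to 1, my interpreter shows "IndexError: list index out of range".

Why are the URLs only stored at index 0?

Thanks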

from urllib.parse import urlencode, urlparse, parse_qs

from lxml.html import fromstring
from requests import get

raw = get("https://www.google.com/search?q=StackOverflow").text
page = fromstring(raw)

for result in page.cssselect(".r a"):
    url = result.get("href")
    if url.startswith("/url?"):
        url = parse_qs(urlparse(url).query)['q']
        print(url[1])  # raises IndexError: list index out of range


2017-04-26 John Jackson

This means the list 'url' contains only one item. – EvanL00
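To see why the list has only one item, it may help to look at what `parse_qs` returns for a single link. The sketch below runs offline; the `href` value is a made-up example in the shape of the "/url?" links from the question, not actual Google output.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical href, shaped like the "/url?" links in the question
href = "/url?q=https://stackoverflow.com/&sa=U"

# parse_qs returns a dict that maps each query key to a LIST of values
params = parse_qs(urlparse(href).query)
print(params)
print(len(params["q"]))  # how many values were found for "q"

# The values are lists because a key may repeat in a query string
repeated = parse_qs("q=first&q=second")
print(repeated["q"])
```

Each link carries a single `q` parameter, so inside the loop `url` is a fresh one-element list on every iteration, and only index 0 exists.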

Answers

You need to create a list that collects every URL you find:

from urllib.parse import urlencode, urlparse, parse_qs
from lxml.html import fromstring
from requests import get

raw = get("https://www.google.com/search?q=StackOverflow").text
page = fromstring(raw)

# an empty list of the urls
urls = []

for result in page.cssselect(".r a"):
    url = result.get("href")
    if url.startswith("/url?"):
        url = parse_qs(urlparse(url).query)['q']
        # add this url to the list of urls
        urls += url

print(urls)
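The same collecting pattern can be tried without a network request. This is only a sketch: the `hrefs` list is hypothetical sample data standing in for the values that `result.get("href")` would yield, and `.get("q", [])` is an added guard for links that lack a `q` parameter.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical hrefs standing in for result.get("href") values
hrefs = [
    "/url?q=https://stackoverflow.com/&sa=U",
    "/search?q=related",  # not a result link, so it is skipped
    "/url?q=https://en.wikipedia.org/wiki/Stack_Overflow&sa=U",
]

urls = []
for href in hrefs:
    if href.startswith("/url?"):
        # .get("q", []) avoids a KeyError when "q" is missing
        values = parse_qs(urlparse(href).query).get("q", [])
        urls.extend(values)  # same effect as `urls += values`

print(urls[0])
print(urls[1])
```

Note that `urls.append(values)` would instead store each one-element list as a nested item, so `extend` (or `+=`) is the closer match to the answer above.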


2017-04-26 02:13:12 Hazzles

Python is zero-indexed, so you need to do

print(url[0])

or, better yet,

print(url)
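As a small illustration of the indexing point, assuming `url` holds a single item as it does inside the question's loop (the value here is a made-up example):

```python
# Hypothetical one-element list, like `url` inside the question's loop
url = ["https://stackoverflow.com/"]

print(url[0])  # index 0 is the first (and only) item

try:
    print(url[1])  # there is no second item
except IndexError as exc:
    message = str(exc)
    print("IndexError:", message)
```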


2017-04-26 01:56:15 jprockbelly
