Finding a list of links in search results

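I am trying to scrape the search results from a library page. I need more than the book titles, though: I also want the script to open each search result and scrape the detail page for more information.
What I have so far is this: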

import bs4 as bs
import urllib.request, urllib.error, urllib.parse
from http.cookiejar import CookieJar
from bs4 import Comment

cj = CookieJar()
basisurl = 'http://mz-villigst.cidoli.de/index.asp?stichwort=hans'
# just took any example page similar to the one I have in mind

opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
p = opener.open(basisurl)
# parse the response so that `soup` exists before it is searched
soup = bs.BeautifulSoup(p.read(), 'html.parser')

for mednrs in soup.find_all(string=lambda text: isinstance(text, Comment)):
    # when I do [0:] it gives me the media numbers, and I can create the links like this:
    links = 'http://mz-villigst.cidoli.de/index.asp?MEDIENNR=' + mednrs[10:17]

My main question now is: how can I get this as a list (for example ['1', '2', ...]) that I can iterate over?

2017-08-16 holmix

I don't understand your current code. What is "comments"? –

Sorry, I meant mednrs, not ... – holmix

Create a list and append to it inside the loop:

links = [] 
for mednrs in soup.find_all(string=lambda text: isinstance(text, Comment)): 
    link = 'http://mz-villigst.cidoli.de/index.asp?MEDIENNR=' + mednrs[10:17] 
    links.append(link)

Or use a list comprehension:

links = ['http://mz-villigst.cidoli.de/index.asp?MEDIENNR=' + mednrs[10:17] 
     for mednrs in soup.find_all(string=lambda text: isinstance(text, Comment))]
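Once the links are in a list, the original goal of opening every search result becomes a plain loop over that list. The following is only a minimal sketch under some assumptions: `build_links` and `fetch_pages` are hypothetical helper names, the comment strings are taken to come from the `soup.find_all(...)` call above, and the `[10:17]` slice is copied from the question without checking it against the real page.

```python
import urllib.request
from http.cookiejar import CookieJar

DETAIL_URL = 'http://mz-villigst.cidoli.de/index.asp?MEDIENNR='


def build_links(comments):
    """Build one detail-page URL per comment string.

    Assumes, as in the question, that the slice [10:17] of each
    comment holds the media number.
    """
    return [DETAIL_URL + text[10:17] for text in comments]


def fetch_pages(links, opener):
    """Open every link with the given opener and return {url: raw bytes}."""
    pages = {}
    for link in links:
        with opener.open(link) as response:
            pages[link] = response.read()
    return pages


def make_opener():
    """Create one cookie-aware opener to reuse for all requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
```

With these in place, something like `pages = fetch_pages(build_links(comments), make_opener())` would give one raw response per link, and each value could then be handed to `bs.BeautifulSoup(..., 'html.parser')` to scrape the detail page.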

2017-08-16 11:28:45

Nice! Thanks! The first one works great! – holmix

@holmix: If this answered your question, you should mark it as "accepted". –
