2017-01-26

Data type mismatch in BeautifulSoup: TypeError: unhashable type: 'list'

I have some code that visits links and tries to find certain keywords in each link.

In the end, if a link contains one or more of the keywords, it is stored in a list.

However, when I run my code it fails with TypeError: unhashable type: 'list' on this line:

for a in soup.find_all('a', class_="result-title hdrlnk", text=re.compile(job_kw,re.IGNORECASE)): 

Here is the code:

jobs_by_city = [ 
'http://boston.website.org/search/widget', 
] 

job_kw = [['web site','user', 'account'],['permission', 'name']] 
job_kw = sum(job_kw, []) 

jobs = [] 

for job_in_city in jobs_by_city:
    a_job = requests.get(job_in_city)
    soup = BeautifulSoup(a_job.text, "lxml")
    for a in soup.find_all('a', class_="result-title hdrlnk", text=re.compile(job_kw,re.IGNORECASE)):
        print(a.get('href'))
        #jobs.append(a.get('href'))

What am I doing wrong here?


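Which version of Beautiful Soup? 're.compile' does not take a list as a pattern. I think you can pass the list as the 'text' argument. In BS v4, you can pass a list to the 'string' argument. – Himal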
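
The failure mentioned in the comment can be reproduced without Beautiful Soup or any network access. The following is a minimal sketch using only the standard library; the flattened keyword list is copied from the question, and the variable name `error` is introduced here purely for illustration. On the CPython versions I am aware of, the message is the same "unhashable type: 'list'" reported in the question, most likely because `re` looks the pattern up in an internal cache, but the exact wording may vary between versions.

```python
import re

# Flattened keyword list, as produced by sum(job_kw, []) in the question.
job_kw = ['web site', 'user', 'account', 'permission', 'name']

# re.compile expects a str (or bytes) pattern; passing a list raises TypeError.
try:
    re.compile(job_kw, re.IGNORECASE)
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```

Note that passing a list of plain strings to the 'string' argument, as the comment suggests, is not equivalent to a regular-expression search: as far as I know, plain strings in such a list are compared against the whole text of the tag, so "user" would not match a title such as "Power user wanted".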

Answer


re.compile does not accept a list as input. You have to iterate over the keywords:

from bs4 import BeautifulSoup 
import requests 
import re 

jobs_by_city = [ 
'http://boston.website.org/search/widget', 
] 

job_kws = [['web site','user', 'account'],['permission', 'name']] 
job_kws = sum(job_kws, []) 

jobs = [] 

for job_in_city in jobs_by_city:
    a_job = requests.get(job_in_city)
    soup = BeautifulSoup(a_job.text, "lxml")
    for job_kw in job_kws:
        for a in soup.find_all('a', class_="result-title hdrlnk", text=re.compile(job_kw,re.IGNORECASE)):
            print(a.get('href'))
            #jobs.append(a.get('href'))

The URL you specified does not provide what you are looking for :)
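
One side effect of looping over the keywords is that a link whose text contains several keywords is reported once per keyword, and the document is searched once per keyword. A possible alternative, offered here only as a sketch and not as part of the original answer, is to join the keywords into a single alternation pattern. The sample `titles` list below is hypothetical and stands in for the link texts, so the example runs without Beautiful Soup or a live URL.

```python
import re

job_kw = [['web site', 'user', 'account'], ['permission', 'name']]

# Flatten the nested list without relying on sum(list_of_lists, []).
flat_kw = [kw for group in job_kw for kw in group]

# Escape each keyword so it is matched literally, then join with "|"
# so that the single pattern matches any one of the keywords.
pattern = re.compile("|".join(re.escape(kw) for kw in flat_kw), re.IGNORECASE)

# Hypothetical link texts standing in for the <a> tags on the real page.
titles = ["Senior Web Site Designer", "Account manager", "Line cook"]
matches = [title for title in titles if pattern.search(title)]

print(matches)
```

The compiled `pattern` can then be passed to find_all in place of the per-keyword pattern, for example as text=pattern (or string=pattern in Beautiful Soup 4.4.0 and later, where 'string' is the newer name for the same argument), so that each matching link is visited only once.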