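Extracting href using bs4/python3? (again)
Sorry for reposting this question. Someone migrated it to a different site, and without the cookies I can't comment on or edit it there.
I'm new to Python and bs4, so please go easy on me.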
#!/usr/bin/python3
import bs4 as bs
import urllib.request
import time, datetime, os, requests, lxml.html
import re
from fake_useragent import UserAgent
url = "https://www.cvedetails.com/vulnerability-list.php"
ua = UserAgent()
header = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
snkr = requests.get(url,headers=header)
soup = bs.BeautifulSoup(snkr.content,'lxml')
for item in soup.find_all('tr', class_="srrowns"):
    print(item.td.next_sibling.next_sibling.a)
Prints:
<a href="/cve/CVE-2017-6712/" title="CVE-2017-6712 security vulnerability details">CVE-2017-6712</a>
<a href="/cve/CVE-2017-6708/" title="CVE-2017-6708 security vulnerability details">CVE-2017-6708</a>
<a href="/cve/CVE-2017-6707/" title="CVE-2017-6707 security vulnerability details">CVE-2017-6707</a>
<a href="/cve/CVE-2017-1269/" title="CVE-2017-1269 security vulnerability details">CVE-2017-1269</a>
<a href="/cve/CVE-2017-0711/" title="CVE-2017-0711 security vulnerability details">CVE-2017-0711</a>
<a href="/cve/CVE-2017-0706/" title="CVE-2017-0706 security vulnerability details">CVE-2017-0706</a>
Using the recommended string:
print(item.td.next_sibling.next_sibling.a.href)
Prints:
None
None
None
None
None
None
I can't figure out how to extract the /cve/CVE-2017-XXXX/ part. Maybe I'm going about this the wrong way. I don't need the title or the HTML, just the URI.
It's ['href'] (an attribute lookup), not .href. In BS the idea is .tag for a child tag and ['attribute'] for an attribute of the tag.
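To make the comment above concrete, here is a minimal sketch. It assumes rows shaped like the output in the question; SAMPLE_HTML and extract_hrefs are made-up names for illustration, and the markup is a hand-written stand-in for the real page so the snippet runs without a network request. It uses the built-in html.parser instead of lxml only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup imitating two rows of the listing page.
SAMPLE_HTML = """
<table>
<tr class="srrowns">
<td>1</td>
<td><a href="/cve/CVE-2017-6712/" title="CVE-2017-6712 security vulnerability details">CVE-2017-6712</a></td>
</tr>
<tr class="srrowns">
<td>2</td>
<td><a href="/cve/CVE-2017-6708/" title="CVE-2017-6708 security vulnerability details">CVE-2017-6708</a></td>
</tr>
</table>
"""


def extract_hrefs(html):
    """Return the href of the first link in every <tr class="srrowns"> row."""
    soup = BeautifulSoup(html, "html.parser")
    hrefs = []
    for row in soup.find_all("tr", class_="srrowns"):
        # href=True only matches <a> tags that actually carry an href attribute.
        link = row.find("a", href=True)
        if link is not None:
            # Attribute access uses dictionary syntax. link.href would instead
            # look for a child tag named <href>, which is why it gives None.
            hrefs.append(link["href"])
    return hrefs


hrefs = extract_hrefs(SAMPLE_HTML)
for href in hrefs:
    print(href)
# /cve/CVE-2017-6712/
# /cve/CVE-2017-6708/
```

In the original loop the same idea would be item.td.next_sibling.next_sibling.a['href']. Searching the row with find("a", href=True) is just a bit less sensitive to whitespace nodes between the cells; link.get("href") is another option if a missing attribute should return None instead of raising KeyError.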