2017-08-24 62 views
-1

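This is my Python code. I can extract the href attribute of each link, but not the text inside the tag. Should I use get(), contents, or some other method to get the tag's body? In short, I cannot extract the body of a tag using a web crawler in Python.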

import requests 
from bs4 import BeautifulSoup 

def web(): 
    url='https://www.phoenixmarketcity.com/mumbai/brands' 
    source = requests.get(url) 
    plain=source.text 
    soup = BeautifulSoup(plain,"html.parser") 
    for link in soup.findAll('a'):
        href = link.get('body')
        print(href)

web() 
+0

'link.getText()' – eLRuLL

Answer

0

I think this is what you are trying to do:

from bs4 import BeautifulSoup 
import requests 
def web(): 
    url='https://www.phoenixmarketcity.com/mumbai/brands' 
    source = requests.get(url) 
    plain=source.text 
    soup = BeautifulSoup(plain,"html.parser") 
    tags = soup('a') 
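    for link in tags:
        # get() reads an attribute of the tag; it returns None if the attribute is missing
        href = link.get('href')
        print(href)

# Call the function at module level, not from inside its own body
web()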
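
Note that get() only reads attributes, so link.get('body') returns None: an <a> tag has no attribute named body. If what you want is the text between the opening and closing tags, get_text() (the method the comment above refers to as getText()) is probably the better fit. Below is a minimal sketch. It assumes a small hypothetical HTML snippet (SAMPLE_HTML) in place of the real page, and link_texts is an illustrative helper name, not something from the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (assumption).
SAMPLE_HTML = """
<html><body>
  <a href="/mumbai/brands/alpha"> Alpha </a>
  <a href="/mumbai/brands/beta"><span>Beta</span> Store</a>
  <a name="no-href">Plain anchor</a>
</body></html>
"""

def link_texts(html):
    """Return (text, href) pairs for every <a> tag in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for link in soup.find_all("a"):
        # get_text() collects the text inside the tag, including nested tags;
        # " " joins the pieces and strip=True trims surrounding whitespace.
        text = link.get_text(" ", strip=True)
        # get() reads an attribute and returns None when it is absent.
        href = link.get("href")
        pairs.append((text, href))
    return pairs

for text, href in link_texts(SAMPLE_HTML):
    print(text, href)
```

To run this against the live page, you could pass requests.get(url).text to link_texts instead of SAMPLE_HTML.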