'NoneType' object has no attribute 'encode' (Web Scraping)

I get this error when I run the following code:

url = soup.find('div',attrs={"class":"entry-content"}).findAll('div', attrs={"class":None}) 


fobj = open('D:\Scrapping\parveen_urls.txt', 'w') 

for getting in url: 
    fobj.write(getting.string.encode('utf8'))

However, when I use find instead of findAll, I get one URL. How can I get all the URLs from the object with findAll?

Source

2016-02-05 M Talha Afzal

What if you use '.text' instead of '.string'? – alecxe

@alecxe Yes, it works, but can you tell me why? –

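You are using .string. If a tag has more than one child, .string is None (docs):

"If a tag's only child is another tag, and that tag has a .string, then the parent tag is considered to have the same .string as its child."

Use .get_text() instead.

Source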
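
To illustrate the difference, here is a minimal sketch. The markup is hypothetical (the second inner div mixes a <b> tag with plain text), and it assumes the bs4 package with the built-in "html.parser"; the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second inner div has two children (<b> and text).
html = (
    '<div class="entry-content">'
    '<div>http://teste.com</div>'
    '<div><b>http://</b>teste2.com</div>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")
container = soup.find("div", attrs={"class": "entry-content"})
single, mixed = container.find_all("div", attrs={"class": None})

print(repr(single.string))  # a single text child, so .string is that text
print(repr(mixed.string))   # more than one child, so .string is None
print(mixed.get_text())     # get_text() joins all nested text
```

Calling .encode() on the second div's .string is what would fail here, because None has no such method, while get_text() always returns a string.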

2016-02-05 16:28:59 alecxe

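Below I provide two examples and a possible solution:

Example 1 shows a working example.
Example 2 shows a non-working example that raises a 'NoneType' AttributeError like the one you reported.
The solution shows one possible fix.

Example 1: the HTML has the expected div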

from bs4 import BeautifulSoup

doc = ['<html><head><title>Page title</title></head>',
    '<body><div class="entry-content"><div>http://teste.com</div>',
    '<div>http://teste2.com</div></div></body>',
    '</html>']
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(''.join(doc), 'html.parser')
url = soup.find('div', attrs={"class": "entry-content"}).findAll('div', attrs={"class": None})
# Each inner div holds a single text node, so .string is not None here.
with open(r'.\parveen_urls.txt', 'w', encoding='utf-8') as fobj:
    for getting in url:
        fobj.write(getting.string)

Example 2: the HTML does not have the expected div

from bs4 import BeautifulSoup

doc = ['<html><head><title>Page title</title></head>',
    '<body><div class="entry"><div>http://teste.com</div>',
    '<div>http://teste2.com</div></div></body>',
    '</html>']
soup = BeautifulSoup(''.join(doc), 'html.parser')

# The error is raised on the next line: no div has class "entry-content",
# so find() returns None, and calling findAll on None raises
# AttributeError: 'NoneType' object has no attribute 'findAll'
url = soup.find('div', attrs={"class": "entry-content"}).findAll('div', attrs={"class": None})
with open(r'.\parveen_urls2.txt', 'w', encoding='utf-8') as fobj:
    for getting in url:
        fobj.write(getting.string)

Possible solution:

from bs4 import BeautifulSoup

doc = ['<html><head><title>Page title</title></head>',
    '<body><div class="entry"><div>http://teste.com</div>',
    '<div>http://teste2.com</div></div></body>',
    '</html>']
soup = BeautifulSoup(''.join(doc), 'html.parser')
url = soup.find('div', attrs={"class": "entry-content"})

# Handle documents that do not have the expected HTML structure.
if url is not None:
    url = url.findAll('div', attrs={"class": None})
    with open(r'.\parveen_urls2.txt', 'w', encoding='utf-8') as fobj:
        for getting in url:
            fobj.write(getting.string)
else:
    print("The html source does not comply with expected structure")
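
The two checks discussed in this thread (a missing container, and a .string that is None) can be combined. The following is only a sketch under assumptions: extract_urls is a hypothetical helper name, the markup is made up for illustration, and the output goes to a temporary directory instead of the original path.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def extract_urls(html):
    """Return the text of each class-less div inside div.entry-content."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", attrs={"class": "entry-content"})
    if container is None:
        # The expected structure is missing, so there is nothing to extract.
        return []
    divs = container.find_all("div", attrs={"class": None})
    # get_text() returns a string even when a div has several children.
    return [div.get_text(strip=True) for div in divs]


# Hypothetical inputs: one with the expected container, one without it.
good = ('<div class="entry-content"><div>http://teste.com</div>'
        '<div><b>http://</b>teste2.com</div></div>')
bad = '<div class="entry"><div>http://teste.com</div></div>'

urls = extract_urls(good)
missing = extract_urls(bad)

# Write one URL per line; the file is opened as UTF-8 text, so no .encode().
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "parveen_urls.txt")
    with open(path, "w", encoding="utf-8") as fobj:
        fobj.write("\n".join(urls))
```

Returning an empty list for a missing container keeps the caller simple; printing a message, as in the solution above, would work just as well.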

Source

2016-02-05 15:43:11 jcfausto
