How do I get the second child element?

Please help me fix this script. How do I get the second child element?

import urllib.request 
import urllib.parse 
import re 

import requests 
import bs4 

beginIndex = 1000 
endIndex = 1010 
prefix = "http://www.inpic.ru" 

for i in range(beginIndex, endIndex): 
    req = requests.get(prefix + '/image/' + str(i)) 
    if req.status_code == requests.codes.ok: 
     print(i, '\t', req.status_code, '\t', req, end='\n') 
     soup = bs4.BeautifulSoup(req.content) 
     #print(soup.prettify()) 
     name = soup.find("td", {"class": "post_title"}).contents[1].contents 
     author = soup.find("td", {"class": "post_title"}).contents[2].contents[1].contents 
     #name = replace(name, '/', '_') 
     print(name, '\t', author)

Error message:

Traceback (most recent call last):
  File "C:\VINT\OPENSERVER\OpenServer\domains\localhost\python\parse_html\1\q.py", line 19, in <module>
    author = soup.find("td", {"class": "post_title"}).contents[2].contents[1].contents
  File "C:\Python33\lib\site-packages\bs4\element.py", line 675, in __getattr__
    self.__class__.__name__, attr))
AttributeError: 'NavigableString' object has no attribute 'contents'

The problem is that I cannot list the contents of the element with class "date_author". I need to use only 'contents' (NOT nextSibling, etc.).

Source

2014-02-18 Sergey

Why can't you use 'next_sibling' or something similar? – Birei

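I just plugged 'http://www.inpic.ru/image/1010' in and ran it through bs4, and 'soup.find("td", {"class": "post_title"}).contents[2]' is just the string literal '\n'. You need to rethink your parsing strategy. I don't know why you are restricting yourself to contents, but it is a fragile approach. Consider using 'select' or chaining 'find' calls, at the very least. – roippi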
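To illustrate the comment above, here is a minimal sketch of why indexing into 'contents' is fragile and what 'select' or chained 'find' calls might look like. The HTML string is a made-up stand-in, not the real inpic.ru markup, and the variable names (HTML, cell, kinds, name, author) are assumptions for illustration only.

```python
import bs4

# Hypothetical markup, loosely modelled on the question; the real page may differ.
HTML = """
<table><tr>
<td class="post_title">
<a href="/image/1010">Some title</a>
<div class="date_author">2014-02-18 <a href="/user/1">someone</a></div>
</td>
</tr></table>
"""

soup = bs4.BeautifulSoup(HTML, "html.parser")
cell = soup.find("td", {"class": "post_title"})

# Whitespace between tags appears in .contents as NavigableString entries,
# so positional indexes do not line up with the child elements.
kinds = [type(node).__name__ for node in cell.contents]
print(kinds)

# Chained find / CSS selection address elements by tag and class instead of position.
name = cell.find("a").get_text(strip=True)
author = cell.select_one(".date_author a").get_text(strip=True)
print(name, '\t', author)
```

With this sample markup, cell.contents[2] is the newline between the link and the div, which matches the AttributeError in the question.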

Use

soup.find("td", {"class": "post_title"}).contents[1].string

because soup.find("td", {"class": "post_title"}).contents[1] is a NavigableString.
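If only 'contents' may be used, one possible approach is to filter out the text nodes before indexing, so that index 1 really is the second child element. The sketch below assumes a made-up HTML fragment rather than the real page, and element_children is a hypothetical helper, not part of bs4.

```python
import bs4
from bs4.element import Tag

# Hypothetical markup; the structure of the real page is an assumption.
HTML = """
<td class="post_title">
<a href="/image/1010">Some title</a>
<div class="date_author">2014-02-18 <a href="/user/1">someone</a></div>
</td>
"""


def element_children(node):
    """Return only the Tag entries of node.contents, skipping text nodes."""
    return [child for child in node.contents if isinstance(child, Tag)]


soup = bs4.BeautifulSoup(HTML, "html.parser")
cell = soup.find("td", {"class": "post_title"})

children = element_children(cell)
name = children[0].string                  # text of the first child element
second = children[1]                       # the second child element
author = element_children(second)[0].string

print(name, '\t', author)
```

The isinstance check is what keeps a stray '\n' from being mistaken for an element. A Tag's .string is None when the tag has more than one child, so it is only reliable for simple elements like the links here.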

Source

2014-02-18 15:05:02 njzk2
