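I'm not an expert in lxml.cssselect (I had a quick go and couldn't even set up the element tree, so I haven't been able to reproduce your exact problem). However, I have had success with the equivalent lxml methods, which may be useful to you.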
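from lxml import html
import requests

# Download the page and parse it into an lxml element tree
url = 'http://abcnews.go.com/US/wireStory/man-jail-writing-racist-graffiti-refugees-homes-33488053'
page = requests.get(url)
tree = html.fromstring(page.text)

# Select every <p> whose itemprop attribute is "articleBody"
# (tree.cssselect needs the separate cssselect package installed)
p_elements = tree.cssselect('p[itemprop="articleBody"]')
print(p_elements)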
Output:
[<Element p at 0xa503ae8>,
<Element p at 0xa503db8>,
<Element p at 0xa503bd8>,
<Element p at 0xa54b1d8>,
<Element p at 0xa54b0e8>,
<Element p at 0xa54b138>,
<Element p at 0xa54b188>]
In general, when using lxml, I have found that selecting elements with XPath is far more flexible than selecting them with CSS selectors.
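As a rough illustration of that point, here is a minimal sketch of the same selection written with XPath. The inline SAMPLE markup is a made-up stand-in for the real page (so the snippet runs without a network request), and the variable names are only illustrative.

```python
from lxml import html

# Hypothetical stand-in markup; not the real article page.
SAMPLE = """
<html><body>
  <p itemprop="articleBody">First paragraph.</p>
  <p class="ad">Sponsored.</p>
  <p itemprop="articleBody">Second <b>paragraph</b>.</p>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# XPath counterpart of the CSS selector p[itemprop="articleBody"].
p_elements = tree.xpath('//p[@itemprop="articleBody"]')

# text_content() flattens an element and its descendants into one string.
paragraphs = [p.text_content().strip() for p in p_elements]

# A predicate on the text itself, which standard CSS selectors cannot express.
with_second = tree.xpath('//p[@itemprop="articleBody"][contains(., "Second")]')

print(paragraphs)
print(len(with_second))
```

The last query is the kind of thing I mean by flexibility: XPath predicates can test text, position, or ancestors in a single expression.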
Would you mind posting your code? – gtlambert