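Finding elements with Selenium in Python
I have been trying to collect a list of live channels and their viewer counts from YouTube Gaming. I am using Selenium with Python to force the site to scroll down the page so that it loads more than 11 channels. For reference, this is the web page I am working on.
I have found where the data I want lives, but I am struggling to get Selenium to reach it. The part I am having trouble with looks like this: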
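For context, the scrolling step mentioned above could look roughly like the sketch below. It is not the code actually used in this question: the helper name `scroll_until_stable`, the pause length, and the round limit are all made up for illustration. It also assumes the page grows the document height when it loads more channels, which may not hold for every layout.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object with Selenium's execute_script(script) method.
    Returns the last page height that was observed.
    """
    height_js = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(height_js)
    for _ in range(max_rounds):
        # Jump to the current bottom of the document to trigger lazy loading.
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        time.sleep(pause)  # crude wait for new items; tune for the real page
        new_height = driver.execute_script(height_js)
        if new_height == last_height:
            break  # nothing new was loaded, so stop scrolling
        last_height = new_height
    return last_height
```

With a real browser session this would be called as `scroll_until_stable(driver)` before looking up any elements.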
<div class="style-scope ytg-gaming-video-renderer" id="video-metadata"><span class="title ellipsis-2 style-scope ytg-gaming-video-renderer"><ytg-nav-endpoint class="style-scope ytg-gaming-video-renderer x-scope ytg-nav-endpoint-2"><a href="/watch?v=FFKSD1HHrdA" tabindex="0" class="style-scope ytg-nav-endpoint" target="_blank">
Live met Bo3
</a></ytg-nav-endpoint></span>
<div class="channel-info small layout horizontal center style-scope ytg-gaming-video-renderer">
<ytg-owner-badges class="style-scope ytg-gaming-video-renderer x-scope ytg-owner-badges-0">
<template class="style-scope ytg-owner-badges" is="dom-repeat"></template>
</ytg-owner-badges>
<ytg-formatted-string class="style-scope ytg-gaming-video-renderer">
<ytg-nav-endpoint class="style-scope ytg-formatted-string x-scope ytg-nav-endpoint-2"><a href="/channel/UCD8Q9V5wgo8o0XGfUqsRrDQ" tabindex="0" class="style-scope ytg-nav-endpoint" target="_blank">Rico Eeman</a>
</ytg-nav-endpoint>
</ytg-formatted-string>
</div><span class="ellipsis-1 small style-scope ytg-gaming-video-renderer" id="video-viewership-info" hidden=""></span>
<div id="metadata-badges" class="small style-scope ytg-gaming-video-renderer">
<ytg-live-badge-renderer class="style-scope ytg-gaming-video-renderer x-scope ytg-live-badge-renderer-1">
<template class="style-scope ytg-live-badge-renderer" is="dom-if"></template>
<span aria-label="" class="text layout horizontal center style-scope ytg-live-badge-renderer">4 watching</span>
<template class="style-scope ytg-live-badge-renderer" is="dom-if"></template>
</ytg-live-badge-renderer>
</div>
</div>
Currently, I am trying:
#This part works fine. I can use the unique ID
meta_data = driver.find_element_by_id('video-metadata')
#This part is also fine. Once again, it has an ID.
viewers = meta_data.find_element_by_id('metadata-badges')
print(viewers.text)
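However, I keep having trouble with the channel name (in this example 'Rico Eeman', which sits under the first nested div tag). Because the div has a compound class name, I cannot find the element by class name, and the following XPath attempts do not work: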
name = meta_data.find_element_by_xpath('/div[@class="channel-info small layout horizontal center style-scope ytg-gaming-video-renderer"]/ytg-formatted-string')
name = meta_data.find_element_by_xpath('/div[1]')
Both raise an element-not-found error. I am not sure what to do here. Does anyone have a working solution?
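One likely cause is that both XPath attempts start with `/`. In XPath this is an absolute path evaluated from the document root, even when the call is made on `meta_data`, so `/div` never reaches the nested div. A path starting with `./` is evaluated relative to the element instead. A CSS selector such as `div.channel-info a` also avoids the compound class name, because it matches on a single class token.

The sketch below shows both options against the markup posted above. The function names are hypothetical. It uses the `find_element(by, value)` form, since the `find_element_by_*` helpers were removed in Selenium 4.3.

```python
# "css selector" and "xpath" are the string values of Selenium's
# By.CSS_SELECTOR and By.XPATH constants; they are written out here only so
# that the sketch needs no imports.
CSS_SELECTOR = "css selector"
XPATH = "xpath"


def channel_name_css(meta_data):
    """Find the channel link by matching one class token on the nested div."""
    link = meta_data.find_element(CSS_SELECTOR, "div.channel-info a")
    return link.text


def channel_name_xpath(meta_data):
    """Same lookup with XPath; the leading '.' makes it start at meta_data."""
    link = meta_data.find_element(
        XPATH, './div[contains(@class, "channel-info")]//a'
    )
    return link.text
```

With a live driver, `meta_data` would be the `video-metadata` element from the earlier snippet, for example `channel_name_css(driver.find_element("id", "video-metadata"))`.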
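Thanks, it works perfectly with the CSS selector! However, if I print the XPath result with result.text, it prints an empty string. Edit: the class was not the problem, since it works with the CSS selector! :) – Pieter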
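One possible explanation for the empty string, which is not confirmed in this thread, is that `.text` only returns text Selenium considers visible. If the XPath matched a node that was not rendered at that moment, or a wrapper rather than the link itself, `.text` can come back empty. A minimal sketch of a fallback that reads the DOM `textContent` property instead is shown below; `element_text` is a hypothetical helper name.

```python
def element_text(element):
    """Return the element's visible text, falling back to textContent."""
    text = element.text
    if not text:
        # .text reports only text Selenium treats as displayed; textContent
        # is read from the DOM node regardless of how it is rendered.
        text = element.get_attribute("textContent") or ""
    return text.strip()
```

This would be used as `print(element_text(result))` in place of `print(result.text)`.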