2017-06-22

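I have just started my "learning Python" journey by playing a text-based browser game. I have bought a lot of different weapons that I can sell, but first I want to know what each one is worth. I can't figure out how to select specific data from a table with Selenium and Python when the cells have no ID, no distinguishing class, etc.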

My code:

from selenium import webdriver 
from bs4 import BeautifulSoup as bs 
browser = webdriver.Firefox() 
browser.get("http://www.mafiaway.nl/shop.php?p=sell") 
source = browser.page_source 
soup = bs(source, "html.parser") 
for td in soup.findAll("td"): 
    print(td.text) 

This gives me:

Mes 

Mes Inclusief training en gratis herregistraties 
Geld terug 
$5.700 >>> is what i want 
Minpunten 
Power -30 
Aantal 
Nog 8.758.118 >>> is what i want 
Verkopen 
+ like 50 other tables like this 

So basically my code prints out every cell of every table, but I only want the two values marked above.

The HTML:

<table align="center" width="100%"> 
    <form method="post"> 
    </form> 
    </table> 
    <table align="center" cellspacing="1" width="610"> 
    <tbody> 
    <tr> 
    <td class="subtitle" colspan="6"> 
     Mes 
    </td> 
    </tr> 
    <tr> 
    <td align="center" class="maintxt" rowspan="5" width="150"> 
     <img height="150" src="images/item-Knife.gif"/> 
    </td> 
    <td class="maintxt" colspan="5" width="450"> 
     <span class="tekstheader"> 
     Mes 
     </span> 
     <br/> 
     <br/> 
     <i> 
     Inclusief training en gratis herregistraties 
     </i> 
    </td> 
    </tr> 
    <tr> 
    <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/money.png"/> 
     <b> 
     Geld terug 
     </b> 
    </td> 
    <td class="maintxt" colspan="4" width="225"> 
     $5.700 
    </td> 
    </tr> 
    <tr> 
    <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/chart_pie_add.png"/> 
     <b> 
     Minpunten 
     </b> 
    </td> 
    <td class="maintxt" colspan="4" width="225"> 
     <span class="errorbold"> 
     <b> 
     Power -30 
     </b> 
     </span> 
    </td> 
    </tr> 
    <tr> 
    <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/group.png"/> 
     <b> 
     Aantal 
     </b> 
    </td> 
    <td class="maintxt" colspan="4" width="225"> 
     Nog 8.758.118 
    </td> 
    </tr> 
    <tr> 
    <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/basket.png"/> 
     <b> 
     Verkopen 
     </b> 
    </td> 
    <td align="center" class="maintxt" colspan="4" width="225"> 
     <input name="num" size="4" type="text"/> 
     <input name="koop" type="submit" value="Verkoop"/> 
     <input maxlength="20" name="id" type="hidden" value="1"/> 
    </td> 
    </tr> 
    </tbody> 
    </table> 
    <br/> 
    <form method="post"> 
    <table align="center" cellspacing="1" width="610"> 
    <tbody> 
    <tr> 
     <td class="subtitle" colspan="6"> 
     Walter P99 
     </td> 
    </tr> 
    <tr> 
     <td align="center" class="maintxt" rowspan="5" width="150"> 
     <img height="150" src="images/item-Walter_P99.gif"/> 
     </td> 
     <td class="maintxt" colspan="5" width="450"> 
     <span class="tekstheader"> 
     Walter P99 
     </span> 
     <br/> 
     <br/> 
     <i> 
     Inclusief training en gratis herregistraties 
     </i> 
     </td> 
    </tr> 
    <tr> 
     <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/money.png"/> 
     <b> 
     Geld terug 
     </b> 
     </td> 
     <td class="maintxt" colspan="4" width="225"> 
     $14.250 
     </td> 
    </tr> 
    <tr> 
     <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/chart_pie_add.png"/> 
     <b> 
     Minpunten 
     </b> 
     </td> 
     <td class="maintxt" colspan="4" width="225"> 
     <span class="errorbold"> 
     <b> 
     Power -75 
     </b> 
     </span> 
     </td> 
    </tr> 
    <tr> 
     <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/group.png"/> 
     <b> 
     Aantal 
     </b> 
     </td> 
     <td class="maintxt" colspan="4" width="225"> 
     Nog 37.251 
     </td> 
    </tr> 
    <tr> 
     <td class="maintxt" width="225"> 
     <img class="icon" src="images/icons/basket.png"/> 
     <b> 
     Verkopen 
     </b> 
     </td> 
     <td align="center" class="maintxt" colspan="4" width="225"> 
     <input name="num" size="4" type="text"/> 
     <input name="koop" type="submit" value="Verkoop"/> 
     <input maxlength="20" name="id" type="hidden" value="2"/> 
     </td> 
    </tr> 
    </tbody> 
    </table> 
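Please post the HTML for the whole table, and format it properly. Please also be more specific about what you want. –

OK, I want to extract all the $ amounts and the "Nog (number)" values. – patrick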

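Answers

I think this should work for you: it finds all td elements whose text contains $ for the amounts, and all those whose text contains Nog for the quantities.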

from selenium.webdriver.common.by import By

# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which Selenium 4 removed
number_elements = browser.find_elements(By.XPATH, "//td[contains(text(), '$')]")
nog_number_elements = browser.find_elements(By.XPATH, "//td[contains(text(), 'Nog')]")

for number_element in number_elements:
    print(number_element.text)
for nog_number_element in nog_number_elements:
    print(nog_number_element.text)

Note that this is the Selenium way of doing it; I did not use BeautifulSoup.
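For comparison, here is a rough sketch of how the same filtering might look with BeautifulSoup, which the question already imports. The `SAMPLE_HTML` string and the `wanted_cells` helper are made up for illustration; in the real script the HTML would presumably come from `browser.page_source`. It assumes the wanted cells are exactly those whose stripped text starts with `$` or `Nog`, as in the HTML posted in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down sample of the page; the real input would be
# browser.page_source from the question's script.
SAMPLE_HTML = """
<table>
  <tr><td class="subtitle" colspan="6"> Mes </td></tr>
  <tr>
    <td class="maintxt"><b> Geld terug </b></td>
    <td class="maintxt" colspan="4"> $5.700 </td>
  </tr>
  <tr>
    <td class="maintxt"><b> Aantal </b></td>
    <td class="maintxt" colspan="4"> Nog 8.758.118 </td>
  </tr>
</table>
"""


def wanted_cells(html):
    """Return the stripped text of every <td> that starts with '$' or 'Nog'."""
    soup = BeautifulSoup(html, "html.parser")
    texts = (td.get_text(strip=True) for td in soup.find_all("td"))
    # str.startswith accepts a tuple, so both prefixes are checked at once
    return [text for text in texts if text.startswith(("$", "Nog"))]


print(wanted_cells(SAMPLE_HTML))
```

Matching on the stripped cell text rather than on attributes is a deliberate choice here, since every cell in the posted HTML shares the same `maintxt` class.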

Thanks, this works. I guess I need to start learning how to use XPath :) – patrick

I would find all table cells that contain either a "$" or "Nog":

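from selenium.webdriver.common.by import By

# A single XPath expression matches both kinds of cell
cells = browser.find_elements(By.XPATH, "//td[contains(text(), '$') or contains(text(), 'Nog')]")

for cell in cells:
    print(cell.text)  # print() call so this also runs on Python 3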
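Since the goal is to compare values, it may help to turn the matched text into numbers. The following is only a sketch: `parse_amount` is a hypothetical helper, and it assumes that `.` is always a thousands separator, as it appears to be in the sample output (`$5.700`, `Nog 8.758.118`).

```python
import re


def parse_amount(cell_text):
    """Turn text such as '$5.700' or 'Nog 8.758.118' into an int.

    Assumption: '.' is only ever a thousands separator, never a decimal point.
    """
    # Grab the first run of digits and dots, ignoring '$', 'Nog', and spaces
    match = re.search(r"\d[\d.]*", cell_text)
    if match is None:
        raise ValueError(f"no number found in {cell_text!r}")
    return int(match.group().replace(".", ""))


print(parse_amount("$5.700"))
print(parse_amount("Nog 8.758.118"))
```

With the Selenium results above, this could be applied as `parse_amount(cell.text)` for each matched cell.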