2016-11-19

BeautifulSoup: getting label and span value pairs

I am trying to get "Slutrengøring alm. (DKK 750,00)" from this HTML:

<div id="bookingpartoptionalitems" class="paddingLeft paddingRight"> 
<div class="title paddingTop">Valgfrie tilkøb:</div> 
<div class="dots dotsHeight alignment-line"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"> <input id="fvF3625F31BE0A4F0A8DCD3F59477CD535" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fvF3625F31BE0A4F0A8DCD3F59477CD535">Håndklæder (leje)</label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">112,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 
<div class="dots dotsHeight alignment-line"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"><input id="fvC7796D75FE6D429187EB9705D87B0289" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fvC7796D75FE6D429187EB9705D87B0289">Slutrengøring alm.</label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">750,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 
<div class="dots dotsHeight alignment-line"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"><input id="fv64F0EAE9857F4D219BB3EDE247ED6EA8" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fv64F0EAE9857F4D219BB3EDE247ED6EA8">Leje Sengelinnede </label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">112,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 
<div class="dots dotsHeight alignment-line last-item"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"><input id="fvF418ABD7452A45C2B22F98AE5348B13F" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fvF418ABD7452A45C2B22F98AE5348B13F">Internet</label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">149,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 

I tried bsObj.select("#bookingpartoptionalitems label"), which outputs:

[<label for="fvEC6D027BF92643FB915F1B3D40C2ADAC">Senget▒jspakke</label>, <label for="fv4C0AAC0318FC408C9D42A6EC152AE878">Barnestol</label>, <label for="fv1B2B8ADFBAA74CE094B55514FF02674F">Barneseng</label>, <label for="fvCA3BB2602AD44C07A1F38B430A73D699">Ekstra Fryser (100L) inkl. levering</label>, <label for="fv7F8D503E6BE84A78A54C92001C195DCA">Levering/afhentning tilk▒bte varer</label>, <label for="fv62D7E7BCC1914FBB82802AF9A0D10B27">Tr▒kvogn</label>, <label for="fvF3D92DC8F8BC43F48525A9D032A6130F">Afbestillingsforsikring (ingen selvrisiko)</label>, <label for="fv3CED5B2C3ADC4309A3B7EEA11BBC924D">Kombiforsikring (ingen selvrisiko)</label>, <label for="fv5BC0B453EA5A42E19BFCAC87739CC515">Beach Bowl Key2Activity</label>] 

and bsObj.select("#bookingpartoptionalitems .value"), which outputs:

[<span class="value">105,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">300,00</span>, <span class="value">0,00</span>, <span class="value">140,00</span>, <span class="value">0,00</span>, <span class="value">125,00</span>, <span class="value">0,00</span>, <span class="value">243,00</span>, <span class="value">0,00</span>, <span class="value">360,00</span>, <span class="value">0,00</span>, <span class="value">119,00</span>, <span class="value">0,00</span>] 

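Is there a way to get the labels and values as pairs? The label attribute for="fvC7796D75FE6D429187EB9705D87B0289" appears to be generated dynamically, so I cannot rely on it.

I hope someone can help.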

Answers

1

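So you want to get all the label/value pairs? One way is to run the two queries you have already tried and merge the results, since I believe they come back in document order. Or you can do something like this: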

items = bsObj.find_all('div', class_='optional-items-namearea') 

for item in items: 
    print(item.label.get_text(), item.find('span', class_='value').get_text()) 

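This finds every element with the class "optional-items-namearea", then iterates over them and extracts the text of the label inside each one. For the value you need to use find, because it is nested inside another element.

For the sample data the output will be: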

Håndklæder (leje) 112,00 
Slutrengøring alm. 750,00 
Leje Sengelinnede 112,00 
Internet 149,00 
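
The first option mentioned above, merging the two queries, could be sketched roughly as follows. Note that each item carries both a unit price and a total price, so a bare .value query returns two spans per label (18 values for 9 labels in the question's output). This sketch assumes the unit price is the one wanted and narrows the selector to .BookingDataItemUnitPrice. The html string is a trimmed, illustrative copy of the question's markup, and "html.parser" is used only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Trimmed, illustrative copy of the markup from the question (two items only).
html = """
<div id="bookingpartoptionalitems">
  <div class="dots dotsHeight alignment-line">
    <div class="alignment-container optional-items-namearea">
      <span class="BookingDataItemName paddingRight">
        <label for="fvC7796D75FE6D429187EB9705D87B0289">Slutrengøring alm.</label>
        <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">750,00</span>)</span>
      </span>
    </div>
    <div class="alignment-container">
      <span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span>
    </div>
  </div>
  <div class="dots dotsHeight alignment-line last-item">
    <div class="alignment-container optional-items-namearea">
      <span class="BookingDataItemName paddingRight">
        <label for="fvF418ABD7452A45C2B22F98AE5348B13F">Internet</label>
        <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">149,00</span>)</span>
      </span>
    </div>
    <div class="alignment-container">
      <span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span>
    </div>
  </div>
</div>
"""

bsObj = BeautifulSoup(html, "html.parser")

labels = bsObj.select("#bookingpartoptionalitems label")
# Only the unit-price spans; the total-price ".value" spans are skipped.
prices = bsObj.select("#bookingpartoptionalitems .BookingDataItemUnitPrice .value")

# Both result lists are in document order, so zip lines them up item by item.
pairs = [(label.get_text(strip=True), price.get_text(strip=True))
         for label, price in zip(labels, prices)]

for name, price in pairs:
    print(name, price)
```

Pairing by position only works while every item has exactly one label and one unit price. Looking up both inside the same container, as in the loop above, is probably the more robust choice.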
1
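from bs4 import BeautifulSoup 

# html is assumed to hold the markup shown in the question
soup = BeautifulSoup(html, 'lxml') 
# A multi-class string matches the exact value of the class attribute
divs = soup.find_all(class_="alignment-container optional-items-namearea") 

for div in divs: 
    # strip=True trims each text fragment and joins them without a separator
    pair = div.get_text(strip=True) 
    print(pair) 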

Output:

Håndklæder (leje)(DKK112,00) 
Slutrengøring alm.(DKK750,00) 
Leje Sengelinnede(DKK112,00) 
Internet(DKK149,00)
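
If the pairs are needed as data rather than as printed strings, one possible follow-up is to collect them in a dictionary keyed by label and convert the amounts to Decimal. This is only a sketch: parse_price is a hypothetical helper, the "." thousands separator is an assumption (the sample data has no amount of 1000 or more), and the html string is a cut-down copy of the question's markup. The CSS selector div.optional-items-namearea is used so the match does not depend on the exact order of the class names.

```python
from decimal import Decimal

from bs4 import BeautifulSoup

# Cut-down, illustrative copy of the name areas from the question.
html = """
<div id="bookingpartoptionalitems">
  <div class="alignment-container optional-items-namearea">
    <span class="BookingDataItemName paddingRight">
      <label for="fvF3625F31BE0A4F0A8DCD3F59477CD535">Håndklæder (leje)</label>
      <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">112,00</span>)</span>
    </span>
  </div>
  <div class="alignment-container optional-items-namearea">
    <span class="BookingDataItemName paddingRight">
      <label for="fvC7796D75FE6D429187EB9705D87B0289">Slutrengøring alm.</label>
      <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">750,00</span>)</span>
    </span>
  </div>
</div>
"""


def parse_price(text):
    """Hypothetical helper: turn '750,00' (decimal comma) into a Decimal.

    Assumes '.' is used as a thousands separator, e.g. '1.250,00'.
    """
    return Decimal(text.replace(".", "").replace(",", "."))


soup = BeautifulSoup(html, "html.parser")

prices = {}
for area in soup.select("div.optional-items-namearea"):
    name = area.label.get_text(strip=True)
    unit = area.select_one(".BookingDataItemUnitPrice")
    currency = unit.select_one(".currency").get_text(strip=True)
    amount = parse_price(unit.select_one(".value").get_text(strip=True))
    prices[name] = (currency, amount)

# Look up a single item by its visible label instead of the generated "for" id.
currency, amount = prices["Slutrengøring alm."]
print(currency, amount)
```

Keying on the label text sidesteps the dynamically generated for attribute, at the cost of assuming the label text is unique on the page.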