2016-06-21 55 views

Extracting data from tags with the same attributes using Beautiful Soup

This is the part of the HTML source code I am interested in:

<div class="mreinfwpr" id="mhd"> 
    <p class="mreinfp">Hours of Operation <a href="javascript:void(0);" class="" id="vhall" onclick="houroperate('all')">(View all)</a><a href="javascript:void(0);" class="dn" id="swless" onclick="houroperate('less')">(Show less)</a></p> 
    <ul id="hroprt" class="alstdul"> 
     <li class="mreinfli"> 
           <span class="mreinflispn1">Today</span><span class="mreinflispn2"><span>11:30 am - 11:30 pm</span> 
          </span><span class="mreinflispn3">Closed Now</span> </li> 
    </ul> 
    <!-- View All Work Timings Vertically --> 
    <ul class="alstdul dn" id="statHr"> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Monday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Tuesday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Wednesday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Thursday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Friday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Saturday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
       <li class="mreinfli"> 
       <span class="mreinflispn1"> Sunday </span><span class="mreinflispn2">11:30 am - 11:30 pm</span> 
      </li> 
     </ul> 

</div> 

       <div class="mreinfwpr"> 
    <p class="mreinfp">Also Listed in</p> 
    <ul class="alstdul"> 


         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Pubs-<near>-Indira-Nagar-2nd-Stage/ct-1000027567" title="Pubs in Indira-Nagar-2nd-Stage, Bangalore">Pubs</a> 

           <!-- <li class="spc"></li> --> 

         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Pizza-Outlets-<near>-Indira-Nagar-2nd-Stage/ct-50105" title="Pizza Outlets in Indira-Nagar-2nd-Stage, Bangalore">Pizza Outlets</a> 

           <!-- <li class="spc"></li> --> 



         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Restaurants-<near>-Indira-Nagar-2nd-Stage/ct-304085" title="Restaurants in Indira-Nagar-2nd-Stage, Bangalore">Restaurants</a> 

           <!-- <li class="spc"></li> --> 

         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Lounge-Bars-<near>-Indira-Nagar-2nd-Stage/ct-597637" title="Lounge Bars in Indira-Nagar-2nd-Stage, Bangalore">Lounge Bars</a> 

           <!-- <li class="spc"></li> --> 



         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Microbrewery-Pubs-<near>-Indira-Nagar-2nd-Stage/ct-1041785821" title="Microbrewery Pubs in Indira-Nagar-2nd-Stage, Bangalore">Microbrewery Pubs</a> 

           <!-- <li class="spc"></li> --> 

         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Nightlife-Restaurants-<near>-Indira-Nagar-2nd-Stage/ct-1041746883" title="Nightlife Restaurants in Indira-Nagar-2nd-Stage, Bangalore">Nightlife Restaurants</a> 

           <!-- <li class="spc"></li> --> 



         <li> 
        <a onclick="_ct('alsocat', 'dtpg', '17592186044416');" href="http://www.justdial.com/Bangalore/Foodie-Delight-<near>-Indira-Nagar-2nd-Stage/ct-1041818989" title="Foodie Delight in Indira-Nagar-2nd-Stage, Bangalore">Foodie Delight</a> 

           <!-- <li class="spc"></li> --> 


           <!-- <li class="spc"></li> --> 




           <!-- <li class="spc"></li> --> 


           <!-- <li class="spc"></li> --> 

            <li> 
       <a href="javascript:void(0);" onclick="_ct('morlstdin', 'dtpg'); 
         openDiv('alsp');">more...</a> 
      </li> 
      </ul> 
</div> 
     <div class="mreinfwpr"> 
    <p class="mreinfp">Services</p> 
         <span class="srihd">General</span> 
      <ul class="alstdul"> 
                   <!-- <tr > --> 
              <li><img class="srimg" src="http://www.justdial.com/public/images/icon/bar.png" width="20" height="20" /><span class="sritxt">Bar             </span></li> 
               <!-- <td class="spc"></td> --> 
            <li><img class="srimg" src="http://www.justdial.com/public/images/icon/checkmarkNew.png" width="20" height="20" /><span class="sritxt">Outdoor Seating             </span></li> 
               <!-- </tr> --> 
                  <!-- <tr > --> 
              <li><img class="srimg" src="http://www.justdial.com/public/images/icon/checkmarkNew.png" width="20" height="20" /><span class="sritxt">Alcohol             </span></li> 
               <!-- <td class="spc"></td> --> 
            <li><img class="srimg" src="http://www.justdial.com/public/images/icon/checkmarkNew.png" width="20" height="20" /><span class="sritxt">AC             </span></li> 
               <!-- </tr> --> 
                  <!-- <tr class="reset" > --> 
              <li><img class="srimg" src="http://www.justdial.com/public/images/icon/checkmarkNew.png" width="20" height="20" /><span class="sritxt">WiFi             </span></li> 
               <!-- <td class="spc"></td> --> 
            <li><img class="srimg" src="http://www.justdial.com/public/images/icon/checkmarkNew.png" width="20" height="20" /><span class="sritxt">Dinein             </span></li> 
               <!-- </tr> --> 
            </ul> 
     </div> 
     <div class="mreinfwpr"> 
    <p class="mreinfp">Modes of Payment</p> 
    <ul class="alstdul"> 

            <li>Cash</td> 
           <!-- <td class="spc"></td> --> 
           <li>Master Card</td> 
           </li> 

            <li>Visa Card</td> 
           <!-- <td class="spc"></td> --> 
           <li>Debit Cards</td> 
           </li> 

            <li>Credit Card</td> 
           <!-- <td class="spc"></td> --> 
       </div> 
      <div class="mreinfwpr"> 
    <p class="mreinfp">Year Established</p> 
    <ul class="alstdul"> 
     <li> 2010</li> 
    </ul> 
</div> 

I want to extract the data listed under the Modes of Payment and Year Established sections. Here it is:

Modes of Payment 

Cash 
Master Card 
Visa Card 
Debit Cards 
Credit Card 

Year Established 

2010 

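I tried this command in Beautiful Soup:

modes_of_payment = bSoup.select('div[class=mreinfwpr] ul[class=alstdul]')

but I end up with every div.mreinfwpr >> ul.alstdul element, because all of the sections share the same class names.

How do I get only the required data?

Thanks in advance!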

Answers


Once you have found the desired p element, go to its next sibling:

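from bs4 import BeautifulSoup 

data = """ 
your HTML string 
""" 

# html5lib is a separate install; it repairs broken markup much like a browser would
soup = BeautifulSoup(data, "html5lib") 

# Match the heading paragraphs by their exact text
# (the argument is named `string` since Beautiful Soup 4.4.0; older releases call it `text`)
for p in soup.find_all("p", string=["Modes of Payment", "Year Established"]): 
    print(p.get_text()) 

    # The <ul> that follows each heading holds that section's values
    for item in p.find_next_sibling("ul").find_all("li"): 
        print(item.get_text(strip=True)) 

    print("----") 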
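
For anyone who wants to try the next-sibling approach without the full page, here is a minimal self-contained sketch. The `SAMPLE_HTML` string is a trimmed, well-formed stand-in for the source in the question, and `extract_sections` is a hypothetical helper name, not part of Beautiful Soup. It uses the built-in `html.parser`, which should be enough for clean markup like this sample; the real page has unclosed tags, so a more lenient parser such as html5lib is probably the safer choice there.

```python
from bs4 import BeautifulSoup

# Trimmed, well-formed stand-in for the page source in the question (assumption).
SAMPLE_HTML = """
<div class="mreinfwpr">
  <p class="mreinfp">Services</p>
  <ul class="alstdul"><li>Bar</li><li>WiFi</li></ul>
</div>
<div class="mreinfwpr">
  <p class="mreinfp">Modes of Payment</p>
  <ul class="alstdul"><li>Cash</li><li>Master Card</li><li>Visa Card</li></ul>
</div>
<div class="mreinfwpr">
  <p class="mreinfp">Year Established</p>
  <ul class="alstdul"><li> 2010</li></ul>
</div>
"""


def extract_sections(html, headings):
    """Map each requested heading to the list items that follow it."""
    soup = BeautifulSoup(html, "html.parser")
    sections = {}
    for p in soup.find_all("p", class_="mreinfp"):
        title = p.get_text(strip=True)
        if title not in headings:
            continue
        # The values live in the <ul> that is the heading's next sibling.
        ul = p.find_next_sibling("ul")
        if ul is None:
            continue
        sections[title] = [li.get_text(strip=True) for li in ul.find_all("li")]
    return sections


result = extract_sections(SAMPLE_HTML, {"Modes of Payment", "Year Established"})
print(result)
```

Returning a dictionary instead of printing makes it easier to feed the values into whatever comes next, for example a CSV row per listing.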

Thanks a lot! Works perfectly! :) – joshirohit66


Quick question: what if the p tag were not there? How would one go about it then? – joshirohit66


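@joshirohit66 Well, beautifulsoup4 is very flexible; it has all sorts of ways to move up, down, and sideways through the tree, and to locate elements by just about anything you can imagine :) – alecxe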
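
As one possible illustration of the case without p headings, the sketch below picks the lists out either by their position among the `div.mreinfwpr` blocks or by what their items look like. Everything here is an assumption made for the example: the `SAMPLE_HTML` string is an invented, heading-free variant of the markup in the question, `items` is a hypothetical helper, and the rules ("contains Cash", "every item is a four-digit number") are guesses about the data, not something the page guarantees.

```python
import re

from bs4 import BeautifulSoup

# Invented variant of the question's markup with the <p> headings removed.
SAMPLE_HTML = """
<div class="mreinfwpr">
  <ul class="alstdul"><li>Bar</li><li>WiFi</li></ul>
</div>
<div class="mreinfwpr">
  <ul class="alstdul"><li>Cash</li><li>Visa Card</li></ul>
</div>
<div class="mreinfwpr">
  <ul class="alstdul"><li> 2010</li></ul>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")


def items(ul):
    """Return the stripped text of every <li> inside the given <ul>."""
    return [li.get_text(strip=True) for li in ul.find_all("li")]


# Same idea as the selector in the question, restricted to direct children.
all_lists = [items(ul) for ul in soup.select("div.mreinfwpr > ul.alstdul")]

# Option 1: rely on position (fragile if the page layout changes).
payment_by_position = all_lists[1]

# Option 2: rely on content that is characteristic of each section.
year_pattern = re.compile(r"\d{4}")
payment_lists = [lst for lst in all_lists if "Cash" in lst]
year_lists = [
    lst for lst in all_lists
    if lst and all(year_pattern.fullmatch(text) for text in lst)
]

print(payment_by_position)
print(payment_lists)
print(year_lists)
```

Position is the simplest rule but breaks as soon as a section is added or missing; matching on content is more robust to reordering, at the cost of depending on what the values happen to be.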