2017-07-28 78 views

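Mechanize and Nokogiri: trying to search for items in a div

I'm setting up a scraper to fetch product information. For this I'm using Mechanize, and therefore Nokogiri. I have a URL (http://www.megamamute.com.br/brother%205652) that returns only one product, but I can't work out the right regular expression to get this item's price. What I want is inside the div called x-product: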

HTML

<div class="pager top" id="PagerTop_66064345"></div><div id="ResultItems_66064345" class="prateleira vitrine"><div class="prateleira vitrine n1colunas"><ul><li layout="45e718bf-51b0-49c4-8882-725649af0594" class="informatica--teclado-notebook-tablet-pen-drive-|-megamamute last"> 

<input type="hidden" class="x-id" value="55492" /> 

<div class="x-product"> 

    <div class="x-selos"> 
     <p class="flag desconto-10--off-no-boleto">Desconto 10% off no boleto</p> 

     <p class="flag Informática" style="display:none;">Informática</p> 
    </div> 

    <div class="x-get-skuId x-hide"><div class="buy-button-normal" id="55492" name="55492"><a class="buy-button-normal-a55492" href="https://www.megamamute.com.br/checkout/cart/add?sku=55492&qty=1&seller=1&sc=1&price=224900&cv=254ca7d1b9d7fb34e47ca55ceec1b2c0_geral:0F62E16B17B76A6FE17EC7C23A655D8B&sc=1" title="Comprar">Comprar</a><input type="hidden" value="cart" class="buy-button-normal-go-to-cart-55492" /></div></div> 

    <div class="x-departamento"> 
     Multifuncional Laser Monocromática 
    </div> 

    <div class="x-image"> 
     <a class="x-productImage" title="Impressora Multifuncional Brother DCP-L5652DN Laser Mono" href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"> 
      <img src="http://megamamute.vteximg.com.br/arquivos/ids/6658677-500-500/55492_original.jpg" width="500" height="500" alt="55492_original" id="" /> 
     </a> 
    </div> 

    <h2 class="product-name"> 
     <a title="Impressora Multifuncional Brother DCP-L5652DN Laser Mono" href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"> 
      Impressora Multifuncional Brother DCP-L5652DN Laser Mono 
     </a> 
    </h2> 

    <div data-trustvox-product-code="55492"></div> 

       <div class="x-price"> 
      <a title="Impressora Multifuncional Brother DCP-L5652DN Laser Mono" href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"> 

             <span class="oldPrice"> 
         R$ 2.899,00 
        </span> 

        <span class="x-bestPrice"> 
         R$ 2.249,00 
        </span> 

       <span class="x-installment"> 
               10X de <strong>R$ 224,90</strong> sem juros 
             </em> 
      </a> 

     </div> 

     <!--<div class="x-opiniao">--> 
     <!-- <span class="rating-produto avaliacao0">0</span> <span class="navaliacao">(0)</span>--> 
     <!--</div>--> 



     <div class="x-info-product"> 
      <ul> 
       <li class="x-info"><a href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"></a></li> 
       <li class="x-favorite"><a href="#"></a></li> 
       <li class="x-move"><a href="#"></a></li> 
       <li class="x-add"><a href="#"></a></li> 
      </ul> 

     </div> 

     <div class="x-hover"> 
      <div class="x-buy"> <a class="x-productImage" title="Impressora Multifuncional Brother DCP-L5652DN Laser Mono" href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"> Comprar </a></div> 
      <a class="x-hoverHref" title="Impressora Multifuncional Brother DCP-L5652DN Laser Mono" href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"></a> 
      <ul> 
       <li class="x-info"><a href="http://www.megamamute.com.br/impressora-multifuncional-brother-dcp-l5652dn-laser-mono/p"></a></li> 
       <li class="x-favorite"><a href="#"></a></li> 
       <li class="x-move"><a href="#"></a></li> 
       <li class="x-add"><a href="#"></a></li> 
      </ul> 

     </div> 


    <div class="x-brand"><p class="texto brand brother">brother</p></div> 

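I would also like to go further and work out how to handle several products, that is, a page with several "x-product" divs. I can't figure out how to collect them all into an array and read the information inside each one.

Below is my code snippet, which is very simple.

Ruby snippet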

require 'mechanize' 
mechanize = Mechanize.new 
agent= mechanize.get('http://www.megamamute.com.br/brother%205652') 
match = agent.search("/xproduct/") 
puts match.html 

Thank you very much.

Answer


You're close. You can use a CSS selector and add the class of the element that contains the price:

require 'mechanize' 
mechanize = Mechanize.new 
agent= mechanize.get('http://www.megamamute.com.br/brother%205652') 
match = agent.search(".x-product .x-bestPrice") 
puts match.text.strip 

#=> "R$ 2.249,00" 

Perfect! Thank you! – fzuid