Limit the rows extracted by a web crawler


So, I have this awesome web crawler code. It fetches the requested data from the site and outputs it together with the associated links. (Good boy.)

Now the question is how to limit the extracted data to, say, 5 rows. I tried adding "LIMIT 5" (as we usually do in SQL queries from PHP), but it didn't work.

My code is as follows:

<div class="news-entry"> 
      <div class="newsblock"> 
       <div style="clear:both"></div> 
        <h2> 
         <a rel="nofollow" target="_blank" href="http://www.usmle-forums.com/usmle-step-3-forum/"> 
          USMLE-Forums :: STEP-3   
         </a> 
        </h2> 
       <ul> 
        <?php 
         function get_datafour($url) { 
         $ch = curl_init(); 
         curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); 
         curl_setopt($ch, CURLOPT_URL,$url); 
         $result=curl_exec($ch); 
         curl_close($ch); 
         return $result; 
         } 
         $returned_content = get_datafour('http://www.usmle-forums.com/usmle-step-3-forum/'); 
         $first_step = explode('<tbody id="threadbits_forum_30">' , $returned_content); 
         $second_step = explode('</tbody>', $first_step[1]); 
         $third_step = explode('<tr>', $second_step[0]); 
         // print_r($third_step); 
         foreach ($third_step as $element) { 
         $child_first = explode('<td class="alt1"' , $element); 
         $child_second = explode('</td>' , $child_first[1]); 
         $child_third = explode('<a href=' , $child_second[0]); 
         $child_fourth = explode('</a>' , $child_third[1]); 
         $final = "<a href=".$child_fourth[0]."</a><br>"; 
        ?> 
        <li target="_blank" class="itemtitle"> 
         <span class="item_new"></span><?php echo $final?> 
        </li> 
        <?php 
         } 
        ?>  
       </ul>   
       <div style="clear:both"></div> 
      </div> 
     </div>

Any suggestions would be appreciated, ideally a way to stop after the fifth result.


2017-02-10 harishk

Break out of the foreach loop after 5 iterations. –

And how do I do that? – harishk

See Manvir singh's answer. –

Break out of the foreach loop:

foreach ($third_step as $key => $element) {
    // Your logic here

    // Keys start at 0, so key 4 is the fifth element: stop after it.
    if ($key == 4) {
        break;
    }
}

Because the index starts at 0, we check for $key == 4, which is the fifth element. Hope that makes sense.
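Checking the key only works when the array has sequential keys starting at 0, which is the case for the result of explode(). Note also that the first element explode('<tr>', ...) returns is whatever precedes the first <tr>, so it may not be a real row. As a rough sketch of two alternatives: array_slice() takes the first N elements regardless of keys, and a counter can be limited to chunks that actually contain a link. The helper name limit_rows and the sample markup below are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper (an assumption, not from the original post): returns at
// most $limit chunks that contain a link, skipping chunks without one.
function limit_rows(array $rows, int $limit): array
{
    $kept = [];
    if ($limit <= 0) {
        return $kept;
    }
    foreach ($rows as $row) {
        // Skip chunks with no link, e.g. the text before the first <tr>.
        if (strpos($row, '<a href=') === false) {
            continue;
        }
        $kept[] = $row;
        // Stop as soon as $limit usable rows have been collected.
        if (count($kept) >= $limit) {
            break;
        }
    }
    return $kept;
}

// Made-up sample markup standing in for the scraped table body.
$sample = "\n<tr><a href=\"/t1\">One</a><tr><a href=\"/t2\">Two</a><tr><a href=\"/t3\">Three</a>";
$third_step = explode('<tr>', $sample);

// Variant 1: the first two elements, whatever their keys are.
$firstTwo = array_slice($third_step, 0, 2);

// Variant 2: the first two chunks that actually contain a link.
$firstTwoLinks = limit_rows($third_step, 2);
```

In the question's code this would amount to looping over array_slice($third_step, 0, 5) or limit_rows($third_step, 5) instead of $third_step. Which variant fits depends on whether the leading chunk should count toward the five.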


2017-02-10 05:31:00

Mate, in your answer, what is it that needs to be logged in here? – harishk

Login content? –

"//Your Login Here", yeah... you were right to point it out... – harishk
