2015-05-27 51 views

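PHP cURL code to skip lines of scraped output

I am using cURL to scrape an HTML page. It extracts the data between the <pre> tags perfectly. However, I want to skip the first five lines. Is there anything I can add to the code to do this? Here is my code: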

<?php

function curl_download($Url) {
    // Stop early if the cURL extension is not available.
    if (!function_exists('curl_init')) {
        die('cURL is not installed. Install and try again.');
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the page as a string
    $output = curl_exec($ch);

    // Keep everything from the opening <pre> tag up to (not including) </pre>.
    $start = strpos($output, '<pre>');
    $end = strpos($output, '</pre>', $start);
    $length = $end - $start;
    $output = substr($output, $start, $length);

    curl_close($ch);

    return $output;
}

print curl_download('http://athleticsnews.co.za/results/20140207BOLALeague3/140207F006.htm');

?>

This is what the HTML being pulled in looks like:

<pre> 
AllTrax Timing - Contractor License      4/22/2014 - 8:31 AM 
       Boland Athletics League 3 - 2/7/2014      
         Hosted by Maties AC        
        Coetzenburg, Stellenbosch       

Event 6 Girls 14-15 200 Meter Sprint 

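So I am trying to exclude the first four lines plus the blank line, and start scraping from the line that begins with "Event 6"...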

Can't you apply a regular expression to the cURL output? – Drakes

You can use explode to break up $output and take the part you want – Babar

Which five lines are you not getting? – Thamaraiselvam

Answers


You can use a regular expression to split the output into lines and then pick the lines you want:

$str = curl_download('http://.../140207F006.htm'); 
$re = "/([^\n\r]+)/m"; 
preg_match_all($re, $str, $matches); 
print_r($matches[1]); 

Result:

Array 
(
    [0] => AllTrax Timing - Contractor License      4/22/2014 - 8:31 AM 
    [1] =>      Boland Athletics League 3 - 2/7/2014      
    [2] =>        Hosted by Maties AC        
    [3] =>       Coetzenburg, Stellenbosch       
    [4] => 
    [5] => Event 6 Girls 14-15 200 Meter Sprint 
    [6] => ============================================================================ 
    [7] =>  Name      Age Team     Finals Wind Points 
    [8] => ============================================================================ 
    [9] => Finals                  
    [10] => 1 Shan Fourie     Bola      29.03 NWI 10 
) 
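
One detail of the pattern above is worth knowing: `[^\n\r]+` needs at least one character, so a completely empty line produces no match and the indexes shift. Entry [4] presumably survives here only because that line contains whitespace. The sketch below compares the pattern with `preg_split`, which keeps empty lines. The `$sample` string is a made-up stand-in for the scraped text, not the real page.

```php
<?php
// Hypothetical sample standing in for the scraped <pre> content.
$sample = "Header 1\nHeader 2\n\nEvent 6 Girls 14-15 200 Meter Sprint\n";

// The answer's pattern: one match per run of non-newline characters.
preg_match_all('/([^\n\r]+)/', $sample, $matches);
$byRegex = $matches[1];

// Alternative: split on each line break, which keeps empty lines.
$bySplit = preg_split('/\r\n|\r|\n/', $sample);

print_r($byRegex); // the empty third line is missing here
print_r($bySplit); // the empty line (and a trailing empty string) are kept
```

If the line count of the header matters, splitting may give more predictable indexes than matching.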

To print only the lines from index 5 onward (here [5] through [10], skipping the first five), you can do:

$matches = $matches[1]; 
$str = ""; 
for($i = 5; $i <= 10; $i++) { 
    $str .= $matches[$i] . PHP_EOL; // Preserve the new line 
} 
echo $str; 

Result:

Event 6 Girls 14-15 200 Meter Sprint 
============================================================================ 
    Name      Age Team     Finals Wind Points 
============================================================================ 
Finals                  
    1 Shan Fourie     Bola      29.03 NWI 10 
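
The loop above hard-codes the upper bound 10, which only fits this particular page. A minimal sketch of a more general version, assuming a hypothetical helper named `skip_lines` (not part of the original answer), uses `array_slice` to take everything from a given offset:

```php
<?php
/**
 * Hypothetical helper: drop the first $count lines of $text
 * and return the rest joined with PHP_EOL.
 */
function skip_lines(string $text, int $count): string
{
    // Split on any common line-break style, keeping empty lines.
    $lines = preg_split('/\r\n|\r|\n/', $text);

    // array_slice with only an offset returns everything from that index on.
    return implode(PHP_EOL, array_slice($lines, $count));
}

// Made-up sample: four header lines, one blank line, then the data.
$sample = "line 1\nline 2\nline 3\nline 4\n\nEvent 6 Girls 14-15 200 Meter Sprint\nFinals";
echo skip_lines($sample, 5);
```

With the question's `curl_download`, the returned string still begins with the `<pre>` tag, so the count may need to be one higher than the number of visible header lines.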

Demo: http://ideone.com/ijPiP6
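
Since the goal in the question is to start at the line beginning with "Event 6", another option is to cut at that marker instead of counting lines. This is only a sketch: `from_marker` is a hypothetical helper, and it assumes the marker text does not also occur earlier in the header.

```php
<?php
/**
 * Hypothetical helper: return $text starting at the first occurrence of
 * $marker, or the whole text unchanged when the marker is absent.
 */
function from_marker(string $text, string $marker): string
{
    $pos = strpos($text, $marker);

    // strpos returns false when nothing is found; 0 is a valid position.
    return $pos === false ? $text : substr($text, $pos);
}

// Made-up sample standing in for the scraped <pre> content.
$sample = "AllTrax Timing\nHosted by Maties AC\n\nEvent 6 Girls 14-15 200 Meter Sprint\nFinals";
echo from_marker($sample, 'Event ');
```

This keeps working if the number of header lines changes, at the cost of depending on the wording of the first data line.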

The regular expression works perfectly, but how do I display only [5] through [10]? I am not familiar with regular expressions. – Seef

Good question. Please see my updated answer. Enjoy. – Drakes

Perfect, but I am getting everything: first the output like your first result, and then the output from your second result on top of that. – Seef