2013-07-26 94 views

PHP cURL: iterating over search results

I'm developing a website that lets users search for product 'x' and displays the results in a table.

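I plan to scrape the search data from another website using PHP cURL. (The owner of the scraped site knows about it and allows it, so there are no legal issues.)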

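I already have PHP cURL code that logs in to the site and runs a search based on user input. What I don't know is how to go through the search results and output them one by one on my site.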

PHP cURL code:

$username = '********'; 
$password = '********'; 
$loginUrl = 'http://www.a-website.com/login.asp'; 

//init curl 
$ch = curl_init(); 

//Set the URL to work with 
curl_setopt($ch, CURLOPT_URL, $loginUrl); 

// ENABLE HTTP POST 
curl_setopt($ch, CURLOPT_POST, 1); 

//Set the post parameters (http_build_query URL-encodes the values) 
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array('username' => $username, 'password' => $password, 'submit1' => 'Login'))); 

//Handle cookies for the login 
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie stuff hure'); 

//Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL 
//not to print out the results of its query. 
//Instead, it will return the results as a string return value 
//from curl_exec() instead of the usual true/false. 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 

//execute the request (the login) 
$store = curl_exec($ch); 

/*   * *****************SEARCH HERE****************** */ 
curl_setopt($ch, CURLOPT_URL, 'http://www.a-website.com/Index.asp'); 
//execute the request 
$content = curl_exec($ch); 


//Set the post parameters (http_build_query URL-encodes $searchString) 
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array( 
     'search_txt_vs' => '', 'search_txt_UPC' => '', 'search_txt_Name' => $searchString, 
     'search_txt_Manufacturer' => '', 'submit' => 'Search'))); 
//execute the request (the search) 
$Search = curl_exec($ch); 

print CJSON::encode($Search); 
print $Search; 

//print the page that was fetched before the search 
print $content; 

Below is the HTML from the site I'm scraping (by the way, it uses an old-school table layout):

<td colspan="3" height="100%" valign="top"> 
    <table width="100%" border="0" cellpadding="2" cellspacing="0" bordercolor="#99CCCC" class="text"> 
     <tbody> 
      <tr bgcolor="#9999CC"> 
       <td align="right" class="calendar">Sort &gt;</td> 
       <td align="center"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=1">NDC</a> 
       </td> 
       <td align="left"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=2">Brand Name</a> 
       </td> 
       <td align="center" colspan="2"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=3">Strength</a> 
&nbsp;|&nbsp; <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=4">UD</a> 
       </td> 
       <td align="left"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=5">Stock</a> 
       </td> 
       <td align="center"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=6">Manufacturer</a> 
       </td> 
       <td align="center" bgcolor="cccccc"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=7">AWP</a> 
&nbsp;/&nbsp; <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=8">Your Price</a> 
       </td> 
      </tr> 
      <tr bgcolor="#9999CC"> 
       <td align="right" class="calendar">&nbsp;</td> 
       <td align="center"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=9">UPC</a> 
       </td> 
       <td align="left"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=10">Generic Alt/Name</a> 
       </td> 
       <td align="center" colspan="2"> <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=11">Size</a> 
&nbsp;|&nbsp; <a href="Index.asp?search_txt_UPC=&amp;search_txt_Name=novolin&amp;search_txt_Manufacturer=&amp;orderby=12">Form</a> 
       </td> 
       <td align="left" colspan="3" class="selected">Category</td> 
      </tr> 
      <tr bgcolor="eeeeee"> 
       <td align="center" valign="top" rowspan="2">1 
        <br> <a href="#" onclick="return openCart(19112,0.01021);"><span class="smallNorm_red">[add]</span></a> 

       </td> 
       <td align="center"><span class="smallNorm">00169347718</span> 
       </td> 
       <td align="left"><span class="smallNorm_red">NOVOLIN 70/ 30U/ML CRT 5X3 ML</span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 70-30 U/ML</span> 
       </td> 
       <td align="left"><span class="smallNorm">YES</span> 
       </td> 
       <td align="center"><span class="smallNorm">NOVO NORDISK PHARM</span> 
       </td> 
       <td align="center"><span class="smallNorm">$ 

    0.01&nbsp; 

    &nbsp;/&nbsp;$ 

    0.01 

    </span> 
       </td> 
      </tr> 
      <tr bgcolor="eeeeee"> 
       <td align="center"><span class="smallNorm">000000000000</span> 
       </td> 
       <td align="left"><span class="smallNorm"><a href="#" onclick="return openGeneric('50101');">HUM INSULIN NPH/REG INSULIN HM</a></span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 5X3ML </span> 
       </td> 
       <td align="left" colspan="3"><span class="smallNorm">&nbsp; 

    <a href="#" onclick="return openreturn(19112,0.01021);"><span class="smallNorm_red">[return]</span> 
        </a>INSULIN</span> 
       </td> 
      </tr> 
      <tr bgcolor="#99CCCC"> 
       <td align="center" valign="top" rowspan="2">2 
        <br> <a href="#" onclick="return openCart(19116,0.012);"><span class="smallNorm_red">[add]</span></a> 

       </td> 
       <td align="center"><span class="smallNorm">00169347418</span> 
       </td> 
       <td align="left"><span class="smallNorm_red">NOVOLIN N 100 UN/ML CRT 5X3 ML</span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 100 U/ML</span> 
       </td> 
       <td align="left"><span class="smallNorm">YES</span> 
       </td> 
       <td align="center"><span class="smallNorm">NNP</span> 
       </td> 
       <td align="center"><span class="smallNorm">$ 

     0.00&nbsp; 

    &nbsp;/&nbsp;$ 

    0.01 

    </span> 
       </td> 
      </tr> 
      <tr bgcolor="#99CCCC"> 
       <td align="center"><span class="smallNorm">000000000000</span> 
       </td> 
       <td align="left"><span class="smallNorm"><a href="#" onclick="return openGeneric('05331');">NPH HUMAN INSULIN ISOPHANE</a></span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 5X3ML </span> 
       </td> 
       <td align="left" colspan="3"><span class="smallNorm">&nbsp; 

    <a href="#" onclick="return openreturn(19116,0.012);"><span class="smallNorm_red">[return]</span> 
        </a>INSULIN</span> 
       </td> 
      </tr> 
      <tr bgcolor="eeeeee"> 
       <td align="center" valign="top" rowspan="2">3 
        <br> <a href="#" onclick="return openCart(45211,0.012);"><span class="smallNorm_red">[add]</span></a> 

       </td> 
       <td align="center"><span class="smallNorm">00169231721</span> 
       </td> 
       <td align="left"><span class="smallNorm_red">NOVOLIN INNO 70/30 PFS 5X3 ML</span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 70-30 U/ML</span> 
       </td> 
       <td align="left"><span class="smallNorm">YES</span> 
       </td> 
       <td align="center"><span class="smallNorm">NOVO NORDISK PHARM</span> 
       </td> 
       <td align="center"><span class="smallNorm">$ 


     0.00&nbsp; 

    &nbsp;/&nbsp;$ 

    0.01 

    </span> 
       </td> 
      </tr> 
      <tr bgcolor="eeeeee"> 
       <td align="center"><span class="smallNorm">000000000000</span> 
       </td> 
       <td align="left"><span class="smallNorm"><a href="#" onclick="return openGeneric('24486');">HUM INSULIN NPH/REG INSULIN HM</a></span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 5X3ML </span> 
       </td> 
       <td align="left" colspan="3"><span class="smallNorm">&nbsp; 

    <a href="#" onclick="return openreturn(45211,0.012);"><span class="smallNorm_red">[return]</span> 
        </a>INSULIN</span> 
       </td> 
      </tr> 
      <tr bgcolor="#99CCCC"> 
       <td align="center" valign="top" rowspan="2">4 
        <br> <a href="#" onclick="return openCart(19117,82.0884);"><span class="smallNorm_red">[add]</span></a> 

       </td> 
       <td align="center"><span class="smallNorm">00169183311</span> 
       </td> 
       <td align="left"><span class="smallNorm_red">NOVOLIN R 100 UN/ML VL 10 ML</span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 100 U/ML</span> 
       </td> 
       <td align="left"><span class="smallNorm">YES</span> 
       </td> 
       <td align="center"><span class="smallNorm">NOVO NORDISK PHARM</span> 
       </td> 
       <td align="center"><span class="smallNorm">$ 

    99.00&nbsp; 

    &nbsp;/&nbsp;$ 

    82.09 

    </span> 
       </td> 
      </tr> 
      <tr bgcolor="#99CCCC"> 
       <td align="center"><span class="smallNorm">000169183311</span> 
       </td> 
       <td align="left"><span class="smallNorm"><a href="#" onclick="return openGeneric('11642');">INSULIN REGULAR HUMAN</a></span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 10ML </span> 
       </td> 
       <td align="left" colspan="3"><span class="smallNorm">&nbsp; 

    <a href="#" onclick="return openreturn(19117,82.0884);"><span class="smallNorm_red">[return]</span> 
        </a>INSULIN</span> 
       </td> 
      </tr> 
      <tr bgcolor="eeeeee"> 
       <td align="center" valign="top" rowspan="2">5 
        <br> <a href="#" onclick="return openCart(19110,82.0884);"><span class="smallNorm_red">[add]</span></a> 

       </td> 
       <td align="center"><span class="smallNorm">00169183711</span> 
       </td> 
       <td align="left"><span class="smallNorm_red">NOVOLIN 70/ 30U/ML VL 10 ML</span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 70-30 U/ML</span> 
       </td> 
       <td align="left"><span class="smallNorm">YES</span> 
       </td> 
       <td align="center"><span class="smallNorm">NOVO NORDISK PHARM</span> 
       </td> 
       <td align="center"><span class="smallNorm">$ 

    99.00&nbsp; 

    &nbsp;/&nbsp;$ 

    82.09 

    </span> 
       </td> 
      </tr> 
      <tr bgcolor="eeeeee"> 
       <td align="center"><span class="smallNorm">000169183711</span> 
       </td> 
       <td align="left"><span class="smallNorm"><a href="#" onclick="return openGeneric('50001');">HUM INSULIN NPH/REG INSULIN HM</a></span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 10ML </span> 
       </td> 
       <td align="left" colspan="3"><span class="smallNorm">&nbsp; 

    <a href="#" onclick="return openreturn(19110,82.0884);"><span class="smallNorm_red">[return]</span> 
        </a>INSULIN</span> 
       </td> 
      </tr> 
      <tr bgcolor="#99CCCC"> 
       <td align="center" valign="top" rowspan="2">6 
        <br> <a href="#" onclick="return openCart(19114,82.0884);"><span class="smallNorm_red">[add]</span></a> 

       </td> 
       <td align="center"><span class="smallNorm">00169183411</span> 
       </td> 
       <td align="left"><span class="smallNorm_red">NOVOLIN N 100 UN/ML VL 10 ML</span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 100 U/ML</span> 
       </td> 
       <td align="left"><span class="smallNorm">YES</span> 
       </td> 
       <td align="center"><span class="smallNorm">NOVO NORDISK PHARM</span> 
       </td> 
       <td align="center"><span class="smallNorm">$ 

    99.00&nbsp; 

    &nbsp;/&nbsp;$ 

    82.09 

    </span> 
       </td> 
      </tr> 
      <tr bgcolor="#99CCCC"> 
       <td align="center"><span class="smallNorm">000000000000</span> 
       </td> 
       <td align="left"><span class="smallNorm"><a href="#" onclick="return openGeneric('11660');">NPH HUMAN INSULIN ISOPHANE</a></span> 
       </td> 
       <td align="center" colspan="2"><span class="smallNorm"> 10ML </span> 
       </td> 
       <td align="left" colspan="3"><span class="smallNorm">&nbsp; 

    <a href="#" onclick="return openreturn(19114,82.0884);"><span class="smallNorm_red">[return]</span> 
        </a>INSULIN</span> 
       </td> 
      </tr> 
     </tbody> 
    </table> 
</td> 

Answers

1

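You could try loading the string into a DOMDocument and using getElementsByTagName, then writing the elements into an array or something else you can work with. See here for details: http://php.net/manual/en/domdocument.getelementsbytagname.php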

Also, since what you get back is HTML, a similar question has been answered here: PHP parse HTML tags
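
As a rough illustration of the DOMDocument approach, the sketch below loads an HTML string and collects the text of every td, row by row, into a nested array. The $html sample and the function name tableRowsToArray are hypothetical stand-ins; in the question's code the input would presumably be the $Search string returned by curl_exec().

```php
<?php
// Hypothetical sample standing in for the $Search string from curl_exec().
$html = '<table><tbody>
  <tr><td><span class="smallNorm">00169347718</span></td>
      <td><span class="smallNorm_red">NOVOLIN 70/ 30U/ML CRT 5X3 ML</span></td></tr>
  <tr><td><span class="smallNorm">00169347418</span></td>
      <td><span class="smallNorm_red">NOVOLIN N 100 UN/ML CRT 5X3 ML</span></td></tr>
</tbody></table>';

// Hypothetical helper: returns one array of cell texts per table row.
function tableRowsToArray(string $html): array
{
    $doc = new DOMDocument();
    // Old table markup is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            // Collapse runs of whitespace (including non-breaking spaces) into one space.
            $cells[] = trim(preg_replace('/[\s\x{00A0}]+/u', ' ', $td->textContent));
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

$rows = tableRowsToArray($html);
print_r($rows);
```

Note that getElementsByTagName returns all descendants, so on a page with nested tables an outer tr would also pick up the cells of the tables inside it; narrowing the selection (for example with DOMXPath) may be needed on the full page.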


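Hey, first of all, thanks for your answer. But could you be more specific? Maybe an example. My main problem is that the output doesn't have any tag names to go by. It's just tables inside tables. So how do I tell the rest of the site apart from the actual items I want to show? – ecorvo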
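
One possible way to separate the product rows from the rest of the page, going only by the HTML posted in the question: each product occupies two tr rows, and only the first of them starts with a td carrying rowspan="2", which the header rows lack. The sketch below uses DOMXPath to select on that attribute. The function name parseProducts, the array keys, and the reduced $html sample are assumptions made for illustration, and the column positions are read off the sample markup, so they would break if the site changes its layout.

```php
<?php
// Reduced, hypothetical copy of the results table: one header row, one product (two rows).
$html = <<<'HTML'
<table class="text"><tbody>
 <tr bgcolor="#9999CC">
  <td class="calendar">Sort &gt;</td><td><a href="#">NDC</a></td><td><a href="#">Brand Name</a></td>
 </tr>
 <tr bgcolor="eeeeee">
  <td rowspan="2">1 <br> <a href="#"><span class="smallNorm_red">[add]</span></a></td>
  <td><span class="smallNorm">00169183311</span></td>
  <td><span class="smallNorm_red">NOVOLIN R 100 UN/ML VL 10 ML</span></td>
  <td colspan="2"><span class="smallNorm"> 100 U/ML</span></td>
  <td><span class="smallNorm">YES</span></td>
  <td><span class="smallNorm">NOVO NORDISK PHARM</span></td>
  <td><span class="smallNorm">$

    99.00&nbsp;

    &nbsp;/&nbsp;$

    82.09
  </span></td>
 </tr>
 <tr bgcolor="eeeeee">
  <td><span class="smallNorm">000169183311</span></td>
  <td><span class="smallNorm"><a href="#">INSULIN REGULAR HUMAN</a></span></td>
  <td colspan="2"><span class="smallNorm"> 10ML </span></td>
  <td colspan="3"><span class="smallNorm">&nbsp;INSULIN</span></td>
 </tr>
</tbody></table>
HTML;

// Hypothetical helper: returns one associative array per product.
function parseProducts(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep markup warnings off the page
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    $xpath = new DOMXPath($doc);

    // Text of the i-th node in a list, with whitespace and &nbsp; collapsed.
    $text = function (DOMNodeList $cells, int $i): string {
        $node = $cells->item($i);
        return $node ? trim(preg_replace('/[\s\x{00A0}]+/u', ' ', $node->textContent)) : '';
    };

    $products = [];
    // Only the first row of a product has a rowspan="2" cell; header rows do not.
    foreach ($xpath->query('//tr[td[@rowspan="2"]]') as $first) {
        $top = $xpath->query('td[not(@rowspan)]', $first);              // cells after the index cell
        $bottom = $xpath->query('following-sibling::tr[1]/td', $first); // second row of the pair
        preg_match_all('/\d+(?:\.\d+)?/', $text($top, 5), $prices);     // "AWP / Your Price" cell
        $products[] = [
            'ndc'          => $text($top, 0),
            'brand'        => $text($top, 1),
            'strength'     => $text($top, 2),
            'stock'        => $text($top, 3),
            'manufacturer' => $text($top, 4),
            'awp'          => $prices[0][0] ?? null,
            'price'        => $prices[0][1] ?? null,
            'upc'          => $text($bottom, 0),
            'generic'      => $text($bottom, 1),
            'size'         => $text($bottom, 2),
        ];
    }
    return $products;
}

$products = parseProducts($html);
foreach ($products as $p) {
    echo "{$p['ndc']} | {$p['brand']} | {$p['price']}\n";
}
```

With the products in an array like this, rendering them one by one on your own page is an ordinary foreach loop. If the real page contains other cells with rowspan="2", the XPath expression would need an additional condition, such as restricting it to the table with class="text".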