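How do I pull href links with XPath?
I want to pull links from a live web page, and I can't seem to find the answer with a simple Google search. It's probably simple, but XPath isn't my area of expertise.
I'm using C# and trying to pull the links, just writing them to the console for now while I figure out how to get them.
Here is my C# code: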
var document = webGet.Load("http://classifieds.castanet.net/cat/vehicles/cars/0_-_4_years_old/");
var browser = document.DocumentNode.SelectSingleNode("//a[starts-with(@href,'/details/')]");
if (browser != null)
{
    string htmlbody = browser.OuterHtml;
    Console.WriteLine(htmlbody);
}
And the HTML snippet:
<div class="last">…</div><a href="/cat/vehicles/cars/0_-_4_years_old/?p=13">13</a><a href="/cat/vehicles/cars/0_-_4_years_old/?p=2">»</a>
<select name="sortby" class="sortby" onchange="doSort(this);">
<option value="">Most Recent</option>
<option value="of" >Oldest First</option>
<option value="mw" >Most Views</option>
<option value="lw" >Fewest Views</option>
<option value="lp" >Lowest Price</option>
<option value="hp" >Highest Price</option>
</select><div style="clear:both"></div>
</div>
<br /><br /><br />
<a href="/details/2008_vw_gti/1454282/" class="prod_container" >
<h2>2008 VW GTi</h2>
<div style="float:left; width:122px; z-index:1000">
<div class="thumb"><img src="http://c.castanet.net/img/28/thumbs/1454282-1-1.jpg" border="0"/></div>
<div class="clear"></div>
mls
</div>
<div class="descr">
The most fun car I have owned. Dolphin Grey, 4 door, Dual Climate control, DRG Transmission with paddle shift. Leather...
</div>
<div class="pdate">
<p class="price">$19,000.00</p>
<p class="date">Kelowna<br />Posted: Oct 15, 2:54 PM<br />Views: 349</p>
</div>
<div style="clear:both" ></div>
<div class="seal"><img src="/images/bookmark.png" /></div>
</a>
<a href="/details/price_drop_gorgeous_rare_white_2009_honda_accord_ex-l_coupe/1447341/" class="prod_container" >
<h2>PRICE DROP!!! Gorgeous Rare White 2009 Honda Accord EX-L Coupe </h2>
<div style="float:left; width:122px; z-index:1000">
<div class="thumb"><img src="http://c.castanet.net/img/28/thumbs/1447341-1-1.jpg" border="0"/></div>
<div class="clear"></div>
sun2010
</div>
<div class="descr">
The link I'm trying to get is the "/details/2008_vw_gti/1454282/" part. Thanks.
I'm making a lot of assumptions here, since you've only provided a small part of what the site might contain, but try this: `//a[@class="prod_container"]/@href` – toniedzwiedz
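A minimal sketch of that suggestion, assuming the question uses HtmlAgilityPack (the `DocumentNode.SelectSingleNode` call suggests it, but the question never says so). As far as I know, HtmlAgilityPack's `SelectNodes` hands back the owning elements rather than attribute nodes when the XPath ends in `/@href`, so this version selects the `<a>` elements and reads `href` with `GetAttributeValue`. The `LinkExtractor` class, the `ExtractDetailLinks` helper, and the inline sample HTML are hypothetical names and data made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the NuGet package the question appears to use

public static class LinkExtractor
{
    // Hypothetical helper: returns the href of every <a class="prod_container">.
    public static List<string> ExtractDetailLinks(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var links = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = document.DocumentNode.SelectNodes("//a[@class='prod_container']");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // Read the attribute from the element instead of selecting @href in XPath.
            string href = anchor.GetAttributeValue("href", string.Empty);
            if (href.Length > 0)
            {
                links.Add(href);
            }
        }
        return links;
    }

    public static void Main()
    {
        // Trimmed-down sample modeled on the snippet in the question.
        const string sample =
            "<a href=\"/cat/vehicles/cars/0_-_4_years_old/?p=2\">»</a>" +
            "<a href=\"/details/2008_vw_gti/1454282/\" class=\"prod_container\"><h2>2008 VW GTi</h2></a>";

        foreach (string href in ExtractDetailLinks(sample))
        {
            Console.WriteLine(href);
        }
    }
}
```

For the live page, the `HtmlDocument` returned by `webGet.Load(...)` in the question could be queried the same way in place of `LoadHtml`. The question's own `starts-with(@href,'/details/')` predicate should work with `SelectNodes` as well, if matching on the class name turns out to be too broad or too narrow for the full page.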