2016-01-31 88 views

I want to extract the href of an HTML link. I am using the following code:

$URI = $webpage 
$HTML = Invoke-WebRequest -Uri $URI 
$price = ($HTML.ParsedHtml.getElementsByTagName("div") | Where-Object {
    $_.className -eq 'price'
}).innerText

It extracts information from this HTML:

<div class="price"> 
    <span class="accessibility">Purchase price</span> 
    <small>$</small>3.50 
</div> 

However, I can't seem to extract the URL from the following HTML using .ParsedHtml.getElementsByTagName():

</a> 
<div class="detail">  
    <span role="presentation" aria-hidden="true" class="brand">Steves Fresh</span> 
    <span class="item" role="presentation" aria-hidden="true"> 
     <a role="presentation" aria-hidden="true" class="product-url" href="http://testurlnodetailsshown.com.au"> 
     Steves&nbsp; 
     2 pack 
     </a> 

Answers


Select the <div> elements, then select the <a> element(s) from the results:

$HTML.ParsedHtml.getElementsByTagName('div') | 
    Where-Object { $_.className -eq 'detail' } | 
    ForEach-Object { $_.getElementsByTagName('a') } | 
    Where-Object { $_.className -eq 'product-url' } | 
    Select-Object -Expand href 

Thanks Ansgar, that worked. The web page has a lot of links. – Clappy101


If there is only one link with the 'product-url' class, you can use:

$html.Links |
    Where-Object { $_.class -eq 'product-url' } |
    Select-Object -ExpandProperty href
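
Note that ParsedHtml is not available in PowerShell 6 and later. If you cannot use it, a regex over the raw markup might be enough for a simple case like this one. The following is only a rough sketch, not a general HTML parser: $sample is a copy of the excerpt from the question standing in for the page content, and $pattern is a hypothetical pattern that assumes double-quoted attributes and a class attribute that is exactly "product-url".

```powershell
# Markup copied from the question; it is an excerpt, so it is not well-formed HTML.
# In a real script this text would come from the downloaded page content instead.
$sample = @'
<div class="detail">
    <span role="presentation" aria-hidden="true" class="brand">Steves Fresh</span>
    <span class="item" role="presentation" aria-hidden="true">
     <a role="presentation" aria-hidden="true" class="product-url" href="http://testurlnodetailsshown.com.au">
     Steves&nbsp;
     2 pack
     </a>
'@

# Hypothetical pattern (an assumption, not part of the answers above):
# the lookahead requires class="product-url" somewhere inside the <a ...> tag,
# then group 1 captures the value of the double-quoted href attribute.
$pattern = '<a\b(?=[^>]*\bclass="product-url")[^>]*\bhref="([^"]+)"'

# Collect the captured href of every matching <a> tag.
$hrefs = [regex]::Matches($sample, $pattern) |
    ForEach-Object { $_.Groups[1].Value }

$hrefs
```

The lookahead keeps the match independent of the order of the class and href attributes. Markup with single-quoted attributes or several class names would need a different pattern or a real HTML parser.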