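Scraping specific text from a website in a VB application

I'm trying to create a simple application that is basically used to compare things across a few websites. I've seen some ways to pull all of a page's text into the application, but is there any way to extract only, say, the title and description?

Take a book website as an example. Is there any way to search for a book title and then display all the different reviews, synopses, and prices, without any of the unwanted text?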
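A quick and simple solution is to use a WebBrowser control, which exposes the loaded page as an HtmlDocument through its .Document property. Once the page has finished loading, you can read the title and the meta tags from the head element:
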
Public Class Form1

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        ' Hide script error dialogs raised by the page, then start loading it.
        Me.WebBrowser1.ScriptErrorsSuppressed = True
        Me.WebBrowser1.Navigate(New Uri("http://stackoverflow.com/"))
    End Sub

    ' Raised when a document has finished loading.
    ' Note: this event can fire more than once for pages that contain frames.
    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        Dim document As HtmlDocument = Me.WebBrowser1.Document
        Dim title As String = Me.GetTitle(document)
        Dim description As String = Me.GetMeta(document, "description")
        Dim keywords As String = Me.GetMeta(document, "keywords")
        Dim author As String = Me.GetMeta(document, "author")
        ' Use the values here, for example by showing them on the form.
    End Sub

    ' Returns the text of the first <title> element in <head>, or "" if there is none.
    Private Function GetTitle(document As HtmlDocument) As String
        Dim head As HtmlElement = Me.GetHead(document)
        If head IsNot Nothing Then
            For Each el As HtmlElement In head.GetElementsByTagName("title")
                Return el.InnerText
            Next
        End If
        Return String.Empty
    End Function

    ' Returns the content of the first <meta> tag whose name matches (case-insensitive), or "".
    Private Function GetMeta(document As HtmlDocument, name As String) As String
        Dim head As HtmlElement = Me.GetHead(document)
        If head IsNot Nothing Then
            For Each el As HtmlElement In head.GetElementsByTagName("meta")
                If String.Equals(el.GetAttribute("name"), name, StringComparison.OrdinalIgnoreCase) Then
                    Return el.GetAttribute("content")
                End If
            Next
        End If
        Return String.Empty
    End Function

    ' Returns the first <head> element of the document, or Nothing if it has none.
    Private Function GetHead(document As HtmlDocument) As HtmlElement
        For Each el As HtmlElement In document.GetElementsByTagName("head")
            Return el
        Next
        Return Nothing
    End Function

End Class
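
The code above only reads the title and meta tags from the head. For the book example in the question (reviews, synopsis, price), the same HtmlDocument can be queried for elements in the page body. The following is a minimal sketch, not a definitive implementation: the module name HtmlScrapeHelpers and both helper functions are made up for illustration, and the ids and class names you pass in depend entirely on the markup of the site you are scraping. It also assumes that the WebBrowser control exposes an element's class attribute under the name "className", which is how the underlying Internet Explorer DOM names it.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

' Hypothetical helpers for pulling specific pieces of text out of a loaded page.
Public Module HtmlScrapeHelpers

    ' Returns the trimmed text of the element with the given id, or "" if it is missing.
    Public Function GetTextById(document As HtmlDocument, id As String) As String
        Dim el As HtmlElement = document.GetElementById(id)
        If el Is Nothing OrElse el.InnerText Is Nothing Then
            Return String.Empty
        End If
        Return el.InnerText.Trim()
    End Function

    ' Returns the trimmed text of every <tagName> element whose class list contains className.
    Public Function GetTextsByClass(document As HtmlDocument, tagName As String, className As String) As List(Of String)
        Dim results As New List(Of String)()
        For Each el As HtmlElement In document.GetElementsByTagName(tagName)
            ' Assumption: the class attribute is read as "className" in the WebBrowser DOM.
            Dim raw As String = If(el.GetAttribute("className"), String.Empty)
            Dim classes As String() = raw.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
            If Array.IndexOf(classes, className) >= 0 AndAlso el.InnerText IsNot Nothing Then
                results.Add(el.InnerText.Trim())
            End If
        Next
        Return results
    End Function

End Module
```

Inside the DocumentCompleted handler you could then write something like `Dim price As String = GetTextById(document, "price")` or `Dim reviews As List(Of String) = GetTextsByClass(document, "div", "review")`. The id "price" and the class "review" are placeholders; inspect the HTML of the target site to find the real ones, and expect to adjust them for each site you compare.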