2011-12-05

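How do I find query string values in an HTML document?

I want to extract query string values from an HTML document. It contains many anchor links that carry a query string parameter named id. I want to collect all of the ids into a comma-separated string. How can I solve this? The result I am after is: result = {1,2,3,4,5}

VB.NET code: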

Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load

    Dim str As String() = GetParagraphs(System.IO.File.ReadAllText(Server.MapPath("TextFile1.html")))

    Response.Write(str)

End Sub

Private Shared Function GetParagraphs(ByVal data As String) As String()

    Dim result As New List(Of String)
    Dim m As Match = Regex.Match(data, "http://mywebsite.com/mydetails.aspx?id")
    While (m.Success)
        result.Add(m.Value)
        m = m.NextMatch()
    End While
    Return result.ToArray()
End Function

TextFile.html

<a href="http://mywebsite.com/mydetails.aspx?id=1"
   target="_blank"></a>

<a href="http://mywebsite.com/mydetails.aspx?id=2"
   target="_blank"></a>

<a href="http://mywebsite.com/mydetails.aspx?id=3"
   target="_blank"></a>

<a href="http://mywebsite.com/mydetails.aspx?id=4"
   target="_blank"></a>

<a href="http://mywebsite.com/mydetails.aspx?id=5"
   target="_blank"></a>

Answer

You can modify your GetParagraphs method like this:

' Requires: Imports System.Collections.Generic and Imports System.Text.RegularExpressions
Private Shared Function GetParagraphs(ByVal data As String) As String()

    Dim result As New List(Of String)
    ' Define what we are looking for
    Const MY_MATCH As String = "http://mywebsite.com/mydetails.aspx?id="
    ' Escape the regex metacharacters ("?" and ".") so the URL is matched literally
    Dim m As Match = Regex.Match(data, Regex.Escape(MY_MATCH))
    While (m.Success)
        ' Jump to the end of the found string
        Dim wStartIndex As Integer = m.Index + MY_MATCH.Length
        ' Now find the end of the href string (the closing double quote)
        Dim wEndIndex As Integer = data.IndexOf("""", wStartIndex)
        ' If we found something
        If wEndIndex <> -1 Then
            ' Extract the value from the string
            result.Add(data.Substring(wStartIndex, wEndIndex - wStartIndex))
        End If
        m = m.NextMatch()
    End While
    Return result.ToArray()
End Function
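
The question asks for a single comma-separated string, whereas GetParagraphs returns an array, and passing an array to Response.Write would most likely print the array's type name instead of its contents. The following is a minimal console sketch of one way to finish the job, not the answerer's own code: ExtractIds and QueryStringIdDemo are hypothetical names, the capture group is an alternative to the IndexOf/Substring step, and the sample HTML is shortened from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module QueryStringIdDemo

    ' Hypothetical helper (not from the original answer): collects every value
    ' that follows the given URL prefix, using a regex capture group.
    Function ExtractIds(ByVal html As String, ByVal urlPrefix As String) As List(Of String)
        Dim ids As New List(Of String)
        ' Regex.Escape makes "?" and "." in the prefix literal. The group captures
        ' everything up to a closing quote, the next parameter (&) or a fragment (#).
        Dim pattern As String = Regex.Escape(urlPrefix) & "([^""&#]+)"
        For Each m As Match In Regex.Matches(html, pattern)
            ids.Add(m.Groups(1).Value)
        Next
        Return ids
    End Function

    Sub Main()
        ' Sample input shortened from the question's HTML file.
        Dim html As String = _
            "<a href=""http://mywebsite.com/mydetails.aspx?id=1"" target=""_blank""></a>" & _
            "<a href=""http://mywebsite.com/mydetails.aspx?id=2"" target=""_blank""></a>"
        Dim ids As List(Of String) = ExtractIds(html, "http://mywebsite.com/mydetails.aspx?id=")
        ' String.Join builds the comma-separated string; prints 1,2
        Console.WriteLine(String.Join(",", ids.ToArray()))
    End Sub

End Module
```

In the page from the question, the same String.Join call could wrap the array returned by GetParagraphs before it is handed to Response.Write. For HTML that is less regular than this sample, a real HTML parser would be more robust than a regular expression.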