2015-04-26 63 views
0

How do I make a proxy grabber in VB.NET for the http://nntime.com/ page?

I want to make a proxy grabber in VB.NET for the http://nntime.com/ page. Can anyone help?

Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    ' Close the form.
    Private Sub Button4_Click(sender As Object, e As EventArgs) Handles Button4.Click
        Me.Close()
    End Sub

    ' Clear the list of grabbed proxies.
    Private Sub Button3_Click(sender As Object, e As EventArgs) Handles Button3.Click
        ListBox1.Items.Clear()
    End Sub

    ' Save the grabbed proxies to a text file, one per line.
    Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
        Using save As New SaveFileDialog()
            save.FileName = "Grabbed Proxies"
            save.Filter = "Grabbed Proxies (*.txt)|*.txt|All Files (*.*)|*.*"
            save.CheckPathExists = True
            ' Do nothing if the user cancels the dialog.
            If save.ShowDialog(Me) <> Windows.Forms.DialogResult.OK Then Return
            Using sw As New IO.StreamWriter(save.FileName)
                For it As Integer = 0 To ListBox1.Items.Count - 1
                    sw.WriteLine(ListBox1.Items.Item(it))
                Next
            End Using
        End Using
    End Sub

    ' Download the page and list every ip:port pair found in its HTML.
    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim the_request As HttpWebRequest = DirectCast(WebRequest.Create("http://proxy-ip-list.com"), HttpWebRequest)
        Dim code As String
        ' Read the whole response body into a string.
        Using the_response As HttpWebResponse = DirectCast(the_request.GetResponse(), HttpWebResponse)
            Using stream_reader As New IO.StreamReader(the_response.GetResponseStream())
                code = stream_reader.ReadToEnd()
            End Using
        End Using
        ' Match ip:port pairs; a port can have up to five digits.
        Dim expression As New Regex("[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}")
        ' Add the matched proxies to the list box.
        For Each itemcode As Match In expression.Matches(code)
            ListBox1.Items.Add(itemcode.Value)
        Next
    End Sub
End Class

I have the code above, but it does not work for http://nntime.com/.

Thanks in advance :)


Where are you stuck? – Abhishek


It only gets the IPs. It cannot get the ports. – graham


Have you read [this](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)? –

Answer

0

Here is an example for hidemyass:

' Requires Imports System.Linq, System.Net and System.Text.RegularExpressions.
Dim html As String = New WebClient().DownloadString("http://proxylist.hidemyass.com/")
' Group 1 is the obfuscated IP cell (starting inside its <style> block), group 2 is the port.
For Each row As Match In Regex.Matches(html, "(?:<td class=""leftborder timestamp""(?s).+?<style>)((?s).+?)\s*<td>\s+(\d{2,5})</td>")
    Dim cell As String = row.Groups(1).Value
    ' Rewrite class="name" as style="rule" for every CSS rule in the cell.
    For Each rule As Match In Regex.Matches(cell, "\.([^\{]+)\{([^\}]+)\}")
        cell = cell.Replace(String.Format("class=""{0}""", rule.Groups(1).Value), String.Format("style=""{0}""", rule.Groups(2).Value))
    Next
    ' Drop hidden elements, skip the style block, and strip numeric class names.
    Dim visible As String = Regex.Replace(cell, "<(span|div) style=""display:none"">[\d\.]+</\1>", String.Empty).Remove(0, cell.IndexOf("/style>"))
    visible = Regex.Replace(visible, "class=""\d+""", String.Empty)
    ' The remaining digits and dots form the IP; append the port.
    ListBox1.Items.Add(String.Concat(Regex.Matches(visible, "[\d\.]+").Cast(Of Match)().Select(Function(m) m.Value)) & ":" & row.Groups(2).Value)
Next
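
To try the idea behind the code above without a live page, here is a minimal sketch that runs the same steps on a made-up HTML fragment. The module name, the `ExtractVisibleIp` helper, the sample markup, and the port `8080` are all hypothetical and only for illustration; real pages may be obfuscated differently.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module ObfuscatedCellDemo
    ' Hypothetical helper: rebuilds the visible IP from one obfuscated table cell.
    Function ExtractVisibleIp(cell As String) As String
        ' Rewrite class="name" as style="rule" for every CSS rule in the <style> block.
        For Each rule As Match In Regex.Matches(cell, "\.([^\{]+)\{([^\}]+)\}")
            cell = cell.Replace(String.Format("class=""{0}""", rule.Groups(1).Value),
                                String.Format("style=""{0}""", rule.Groups(2).Value))
        Next
        ' Skip the <style> block, then drop hidden elements and numeric class names.
        cell = cell.Substring(cell.IndexOf("</style>") + "</style>".Length)
        cell = Regex.Replace(cell, "<(span|div) style=""display:none"">[\d\.]+</\1>", String.Empty)
        cell = Regex.Replace(cell, "class=""\d+""", String.Empty)
        ' Whatever digits and dots remain are the visible address.
        Return String.Concat(Regex.Matches(cell, "[\d\.]+").Cast(Of Match)().Select(Function(m) m.Value))
    End Function

    Sub Main()
        ' Made-up sample cell: the parts styled display:none are decoys.
        Dim sample As String = "<style>.x1{display:none}.y2{display:inline}</style>" &
            "<span class=""x1"">77</span><span class=""y2"">10</span>" &
            "<span style=""display:none"">5.</span>.0.<div style=""display:none"">99</div>0.<span class=""42"">7</span>"
        ' Prints 10.0.0.7:8080
        Console.WriteLine(ExtractVisibleIp(sample) & ":" & "8080")
    End Sub
End Module
```

Keeping the extraction in a separate function makes it easy to check against saved copies of a page before wiring it to the list box.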

Thanks, Matthew Dotcom – graham