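I am trying to fetch specific content from a website and write it to a text file. I use one list box to hold the URLs I want to loop over and another to view the output. Now I want all of the data in the text file to be separated by the "~" symbol.
Example link used in my My.txt file: http://www.maxpreps.com/high-schools/abbeville-yellowjackets-(abbeville,al)/basketball/previous_seasons.htm
Data expected in the text file, in this format:
Abbeville High School Basketball Stats~Team: 11-12 team~Colors: maroon, gray, white......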
Imports System.IO
Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    ' Load the URLs to process, one per line, into ListBox2.
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' ReadAllLines splits on CR, LF, or CRLF, so no stray line-break characters end up in the URLs.
        Dim pqr As String() = File.ReadAllLines("C:\Documents and Settings\Santosh\Desktop\my.txt")
        ListBox2.Items.AddRange(pqr)
    End Sub

    Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
        For Each item As Object In ListBox2.Items
            ' Download the page; Using releases the response and the reader when done.
            Dim rsssource As String
            Dim request As HttpWebRequest = CType(WebRequest.Create(item.ToString()), HttpWebRequest)
            Using response As WebResponse = request.GetResponse()
                Using sr As New StreamReader(response.GetResponseStream())
                    rsssource = sr.ReadToEnd()
                End Using
            End Using
            ' .*? is non-greedy, so each pattern stops at the first closing tag.
            Dim r As New Regex("<h1 id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_Header"">.*?</h1>")
            Dim r1 As New Regex("<span id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_Mascot"">.*?</span>")
            Dim r3 As New Regex("<span id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_Colors"">.*?</span>")
            Dim r4 As New Regex("<span id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_GenderType"">.*?</span>")
            Dim r5 As New Regex("<span id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_AthleteDirectorGenericControl"">.*?</span>")
            Dim r6 As New Regex("<address>.*?</address>")
            Dim r7 As New Regex("<span id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_Phone"">.*?</span>")
            Dim r8 As New Regex("<span id=""ctl00_NavigationWithContentOverRelated_ContentOverRelated_Header_Fax"">.*?</span>")
            Dim matches As MatchCollection = r.Matches(rsssource)
            Dim matches1 As MatchCollection = r1.Matches(rsssource)
            Dim matches3 As MatchCollection = r3.Matches(rsssource)
            Dim matches4 As MatchCollection = r4.Matches(rsssource)
            Dim matches5 As MatchCollection = r5.Matches(rsssource)
            Dim matches6 As MatchCollection = r6.Matches(rsssource)
            Dim matches7 As MatchCollection = r7.Matches(rsssource)
            Dim matches8 As MatchCollection = r8.Matches(rsssource)
            ' Only the header match reaches the file, and the file is recreated on every pass.
            For Each itemcode As Match In matches
                Using W As New StreamWriter("C:\" & FileName.Text & ".txt")
                    W.Write(itemcode.Value.Split(""""c).GetValue(2))
                End Using
                'ListBox1.Items.Add(itemcode.Value.Split("""").GetValue(2))
            Next
            For Each itemcode As Match In matches1
                ListBox1.Items.Add(itemcode.Value.Split(""""c).GetValue(2))
            Next
        Next item
    End Sub
End Class
In the code above, the For Each loop is only used for matches; I want the text file to contain matches~matches1~matches2~matches4 .... – sam 2012-02-08 14:01:54
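One way to get the matches~matches1~... layout described in the comment is to collect each extracted field into a list and join the list with "~" once per page. The following is only a minimal sketch under stated assumptions: the `ExtractById` helper, the `JoinFieldsSketch` module, the short ids, and the sample markup are all hypothetical stand-ins, not the real MaxPreps page or the asker's form.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module JoinFieldsSketch
    ' Hypothetical helper: returns the inner text of the first element
    ' with the given id, or "" if no such element is found.
    Function ExtractById(ByVal html As String, ByVal id As String) As String
        Dim pattern As String = "<(\w+)[^>]*\bid=""" & Regex.Escape(id) & """[^>]*>(.*?)</\1>"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.Singleline)
        If Not m.Success Then Return ""
        ' Drop any nested tags and surrounding whitespace.
        Return Regex.Replace(m.Groups(2).Value, "<[^>]+>", "").Trim()
    End Function

    Sub Main()
        ' Made-up sample markup standing in for a downloaded page (assumption).
        Dim html As String = "<h1 id=""Header"">Sample High School</h1>" & _
                             "<span id=""Mascot"">Mascot: Tigers</span>" & _
                             "<span id=""Colors"">Colors: Red, White</span>"

        ' Collect the fields in the order they should appear in the output.
        Dim fields As New List(Of String)
        For Each id As String In New String() {"Header", "Mascot", "Colors"}
            fields.Add(ExtractById(html, id))
        Next

        ' Join once per page so the result is a single "~"-separated line.
        Console.WriteLine(String.Join("~", fields.ToArray()))
    End Sub
End Module
```

With the sample markup this prints `Sample High School~Mascot: Tigers~Colors: Red, White`. In the form code, the same joined string could be written with a single StreamWriter opened before the URL loop (one `WriteLine` per URL), so that earlier lines are not overwritten.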