2011-02-01 26 views
2

Need help iterating through regex matches in VB.NET

I just can't get my head around this. Please help! I have this expression:

(?<=Photo:)(.+?)(?=Stock)|(?<=Stock Code:)(.+?)(?=Make:)|(?<=Make:)(.+?)(?=Model:)|(?<=Model:)(.+?)(?=Year:)|(?<=Year:)(.+?)(?=Price:)|(?<=Price:)(.+?)(?=Description:)|(?<=Description:)(.+?)(?=Photo:)|(?<=Description:)(.+?)(?=Page:) 

And I have this sample data:

Photo:http://xxxx.xxx/images/DSC_0039.JPGStock Code:435Make:BMWModel:X5 3.0 I A/TYear:2002Price:169900.00Description:Neat,160000KM 
Photo:http://xxxx.xxx/images/206.JPGStock Code:453Make:Renault Model:Scenic 1.6 Year:2006Price:99900.00Description:Expression 76000km 
Photo:http://xxxx.xxx/images/DSC_0058.JPGStock Code:372Make:Renault Model:ScenicYear:2005Price:89900.00Description:Nice Family Car 
Photo:http://xxxx.xxx/images/j.JPGStock Code:399Make:NissanModel:Micra 1.4Year:2008Price:102900.00Description:Accenta ,neat 
Photo:http://xxxx.xxx/images/207.JPGStock Code:454Make:Renault Model:Scenic 1.6 Year:2001Price:49900.00Description:Expression 185000km 
Photo:http://xxxx.xxx/images/DSC_0040.JPG_dcef66ac215bd9e8c4e3535e458b280b.JPGStock Code:442Make:M/BenzModel:C270 CDIYear:2003Price:122900.00Description:A/T 154000 KM 
Photo:http://xxxx.xxx/images/DSC_0008.JPG_fa489cfd99436c6b9323cfa8e34ed460.JPGStock Code:480Make:Opel AstraModel:2.0 T SportYear:2007Price:154900.00Description:126000KM Black 
Photo:http://xxxx.xxx/images/DSC_0010.JPG_cfe5eb4763cbf568e73697e2cd8dd30e.JPGStock Code:462Make:SeatModel:1.4Year:2008Price:8590.00Description:54000km 
Photo:http://xxxx.xxx/stockimage.jpgStock Code:339Make:BMWModel:320iYear:2005Price:109900.00Description:Man. White 155000 km 
Photo:http://xxxx.xxx/images/192.JPGStock Code:192Make:MitsibushiModel:Colt 2000Year:2008Price:99900.00Description:Workhorse 
Photo:http://xxxx.xxx/images/HPIM1461.JPGStock Code:204Make:FordModel:BroncoYear:1989Price:59900.00Description:Neat 
Photo:http://xxxx.xxx/stockimage.jpgStock Code:445Make:M/BenzModel:Vito 2.2CRDI Year:2006Price:169900.00Description:Crewbus 140000km,White 
Photo:http://xxxx.xxx/images/Picture 384.jpgStock Code:180Make:FiatModel:SienaYear:2000Price:35900.00Description:Family Car 
Photo:http://xxxx.xxx/images/202.JPGStock Code:441Make:MazdaModel:6 2.0 Year:2005Price:99900.00Description:Origenal 104000 km 

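I need to iterate through the groups to get the matched content for each record, and then assign it to a property of a Vehicle class depending on which group it came from.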

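Here is my most successful attempt so far. It is only a test of extracting the data, which is why it does not populate the class yet. It sort of works (it collects the data correctly, but only for every 8th record):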

Dim pattern As String = "(?<=Photo:)(.+?)(?=Stock)|(?<=Stock Code:)(.+?)(?=Make:)|(?<=Make:)(.+?)(?=Model:)|(?<=Model:)(.+?)(?=Year:)|(?<=Year:)(.+?)(?=Price:)|(?<=Price:)(.+?)(?=Description:)|(?<=Description:)(.+?)(?=Photo:)|(?<=Description:)(.+?)(?=\r)" 
     Dim GroupCounter As Integer = 1 
     Dim GroupName As String = "" 

     For Each match As Match In Regex.Matches(html, pattern) 
      If GroupCounter = 1 Then 
       GroupName = "Photo:" 
      ElseIf GroupCounter = 2 Then 
       GroupName = "Stock Code:" 
      ElseIf GroupCounter = 3 Then 
       GroupName = "Make:" 
      ElseIf GroupCounter = 4 Then 
       GroupName = "Model:" 
      ElseIf GroupCounter = 5 Then 
       GroupName = "Year:" 
      ElseIf GroupCounter = 6 Then 
       GroupName = "Price:" 
      ElseIf GroupCounter = 7 Then 
       GroupName = "Desc:" 
      ElseIf GroupCounter = 8 Then 
       GroupName = "Last Desc:" 
      Else 
       GroupName = "Unknown:" 
      End If 


      If match.Groups.Item(GroupCounter).Success And GroupCounter > 0 Then 
       export = export & GroupName & match.Groups.Item(GroupCounter).Value & "|" 
      End If 
      GroupCounter += 1 
      If GroupCounter = 9 Then 
       GroupCounter = 1 
      End If 
     Next 

The output I see in Firebug is what I want, except that it only returns every 8th record:

{"d":"Photo:http://xxxx.xxx/images/DSC_0039.JPG|Stock Code:435|Make:BMW|Model:X5 3.0 I A/T|Year:2002|Price:169900.00|Desc:Neat,160000KM|Photo:http://xxxx.xxx/image.jpg|Stock Code:339|Make:BMW|Model:320i|Year:2005|Price:109900.00|Desc:Man. White 155000 km|Photo:http://xxxx.xxx/images/g.JPG|Stock Code:395|Make:V/wagen|Model:Citi 1.4i|Year:2003|Price:49900.00|Desc:A/C|Photo:http://xxxx.xxx/images/1 (2).JPG|Stock Code:402|Make:BMW|Model:530I|Year:2004|Price:169900.00|Desc:Nice Family Car,A/T|Photo:http://xxxx.xxx/images/DSC_0001 (2).JPG_9a8aa2faebf77bcd7f021dc9ef602552.JPG|Stock Code:471|Make:Mitsibushi|Model:Colt 2800 C/Cab 4x4|Year:2005|Price:109900.00|Desc:179000 km|Photo:http:/xxxx.xxx/images/DSC_0011.JPG_5343615443cf449ae70b684c45e0964a.JPG|Stock Code:474|Make:Audi|Model:A3|Year:2005|Price:165900.00|Desc:A3 3.2 QUATRO 6 SPEED|Photo:http://xxxx.xxx/images/HPIM1731.JPG|Stock Code:304|Make:Ford|Model:Laser |Year:1997|Price:35900.00|Desc:Tracer 1.6 Sedan|Photo:http://xxxx.xxx/images/002.JPG|Stock Code:70|Make:PEUGEOT|Model:307|Year:2006|Price:117900.00|Desc:2.0 XS"} 

Please help me. Many thanks, Jacques

Answers

1

The regular expression I would use in this case is

^Photo:(.*?)Stock Code:(.*?)Make:(.*?)Model:(.*?)Year:(.*?)Price:(.*?)Description:(.*?)$ 

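with RegexOptions.Multiline enabled. For each line, the match holds the relevant data in its capture groups. Unfortunately, my VB.NET is more than a little rusty, so I will give a short snippet in C#. Feel free to edit in a VB version.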

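// Requires: using System.Text.RegularExpressions;
String data = "Photo: ....."; 
String pattern = "^Photo:(.*?)Stock Code:(.*?)Make:(.*?)Model:(.*?)Year:(.*?)Price:(.*?)Description:(.*?)$"; 

MatchCollection matches = Regex.Matches(data, pattern, RegexOptions.Multiline); 
foreach (Match match in matches) 
{ 
    // Groups[0] is the whole line; Groups[1] to Groups[7] follow the field order in the pattern.
    YourObject item = new YourObject(); 
    item.Photo = match.Groups[1].Value; 
    item.StockCode = match.Groups[2].Value; 
    // .... (Make, Model, Year, Price and Description are Groups[3] to Groups[7])
} 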
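
A VB.NET rendering of the same idea might look like the sketch below. It assumes a Model:(.*?) group between Make and Year so that the model gets its own capture, and it uses two lines from the question's sample data joined with a line feed. The module name is made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module WholeRecordDemo
    Sub Main()
        ' Two records copied from the sample data in the question, separated by a line feed.
        Dim data As String =
            "Photo:http://xxxx.xxx/images/DSC_0039.JPGStock Code:435Make:BMWModel:X5 3.0 I A/TYear:2002Price:169900.00Description:Neat,160000KM" & vbLf &
            "Photo:http://xxxx.xxx/images/j.JPGStock Code:399Make:NissanModel:Micra 1.4Year:2008Price:102900.00Description:Accenta ,neat"

        ' One match per line; each field is its own numbered group.
        Dim pattern As String =
            "^Photo:(.*?)Stock Code:(.*?)Make:(.*?)Model:(.*?)Year:(.*?)Price:(.*?)Description:(.*?)$"

        For Each m As Match In Regex.Matches(data, pattern, RegexOptions.Multiline)
            ' Groups(0) is the whole line; Groups(1) to Groups(7) follow the pattern order.
            Console.WriteLine("Stock {0}: {1} {2} ({3})",
                              m.Groups(2).Value,
                              m.Groups(3).Value.Trim(),
                              m.Groups(4).Value.Trim(),
                              m.Groups(5).Value)
        Next
    End Sub
End Module
```

Because every match now covers a whole record, the group index always refers to the same field, so no running counter is needed. If the input uses \r\n line endings, the last group will likely include the trailing \r, so trimming the value is a reasonable precaution.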
1

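Your regex matches only one field at a time, when it should match a whole record. And there is no need to iterate over the groups by number and assign names to them when you can use named groups. I don't speak VB, so here is an example in C#: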

Regex r = new Regex(@"
    Photo:(?<Photo>.+?)
    Stock\s+Code:(?<StockCode>.+?)
    Make:(?<Make>.+?)
    Model:(?<Model>.+?)
    Year:(?<Year>.+?)
    Price:(?<Price>.+?)
    Description:(?<Description>[^\r\n]+)",
    RegexOptions.IgnorePatternWhitespace);

foreach (Match m in r.Matches(data))
{
    Console.WriteLine();
    foreach (string name in r.GetGroupNames())
    {
        Console.WriteLine("{0} = {1}", name, m.Groups[name]);
    }
}

Besides the names you specify, there will always be a group named "0", which represents the whole match.
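
For the VB.NET side of the question, the named-group approach could be sketched roughly as follows. The Vehicle class and its property names are assumptions (the question mentions a vehicle class but never shows it), as are the ParseVehicles helper and the module name. The pattern is built from concatenated strings rather than with IgnorePatternWhitespace, and each value is trimmed because some of the sample fields carry trailing spaces.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

' Hypothetical class: the question refers to a vehicle class but does not show it.
Public Class Vehicle
    Public Property Photo As String
    Public Property StockCode As String
    Public Property Make As String
    Public Property Model As String
    Public Property Year As String
    Public Property Price As String
    Public Property Description As String
End Class

Module NamedGroupDemo
    ' One match per record; each field is captured in a named group.
    Private ReadOnly RecordRegex As New Regex(
        "Photo:(?<Photo>.+?)" &
        "Stock\s+Code:(?<StockCode>.+?)" &
        "Make:(?<Make>.+?)" &
        "Model:(?<Model>.+?)" &
        "Year:(?<Year>.+?)" &
        "Price:(?<Price>.+?)" &
        "Description:(?<Description>[^\r\n]+)")

    ' Hypothetical helper: turns the raw text into a list of Vehicle objects.
    Function ParseVehicles(data As String) As List(Of Vehicle)
        Dim vehicles As New List(Of Vehicle)()
        For Each m As Match In RecordRegex.Matches(data)
            Dim v As New Vehicle()
            v.Photo = m.Groups("Photo").Value.Trim()
            v.StockCode = m.Groups("StockCode").Value.Trim()
            v.Make = m.Groups("Make").Value.Trim()
            v.Model = m.Groups("Model").Value.Trim()
            v.Year = m.Groups("Year").Value.Trim()
            v.Price = m.Groups("Price").Value.Trim()
            v.Description = m.Groups("Description").Value.Trim()
            vehicles.Add(v)
        Next
        Return vehicles
    End Function

    Sub Main()
        ' Two records copied from the sample data in the question.
        Dim data As String =
            "Photo:http://xxxx.xxx/images/DSC_0039.JPGStock Code:435Make:BMWModel:X5 3.0 I A/TYear:2002Price:169900.00Description:Neat,160000KM" & vbCrLf &
            "Photo:http://xxxx.xxx/images/206.JPGStock Code:453Make:Renault Model:Scenic 1.6 Year:2006Price:99900.00Description:Expression 76000km"

        For Each v As Vehicle In ParseVehicles(data)
            Console.WriteLine("{0} | {1} | {2} | {3}", v.StockCode, v.Make, v.Model, v.Price)
        Next
    End Sub
End Module
```

Looking groups up by name keeps the property assignments readable and means the code does not break if the fields in the pattern are later reordered.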

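On a side note, I notice you used (.+?)(?=\r) to match the final field. I assume you did that because the records are separated by \r\n and you don't want to include the \r in the match. But what if the producer of the data changes the format so that lines end with just \n, and fails to notify you? Suddenly your regex stops working, and you can't see why. If you use [^\r\n]+ like I did, you don't have to worry about that.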
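
To see the difference concretely, here is a small sketch that runs both ways of ending the Description field against CRLF-terminated and LF-terminated input. The two-line test strings are cut-down stand-ins for the real data, and the module and variable names are made up.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LineEndingDemo
    Sub Main()
        ' The same record tail, once with Windows line endings and once with Unix ones.
        Dim crlfData As String = "Description:Neat,160000KM" & vbCrLf & "Photo:next"
        Dim lfData As String = "Description:Neat,160000KM" & vbLf & "Photo:next"

        ' Variant from the question: requires a \r right after the field.
        Dim lookahead As String = "(?<=Description:)(.+?)(?=\r)"
        ' Variant from this answer: stops at either kind of line break.
        Dim negated As String = "(?<=Description:)[^\r\n]+"

        Console.WriteLine(Regex.IsMatch(crlfData, lookahead))   ' True
        Console.WriteLine(Regex.IsMatch(lfData, lookahead))     ' False: there is no \r to look ahead to
        Console.WriteLine(Regex.Match(crlfData, negated).Value) ' Neat,160000KM
        Console.WriteLine(Regex.Match(lfData, negated).Value)   ' Neat,160000KM
    End Sub
End Module
```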