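This problem is well suited to regular expressions. The following function returns the position of the character just before the first match of a simple regular expression pattern in a given string. If no match is found, the function returns the length of the string. The function can be combined with the LEFT function to extract the text that precedes the match. (LEFT is needed because, for simplicity, this function does not implement submatches.)
The following formula will extract the product names from your sample data: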
=LEFT(A1,regexmatch(A1," \(|\/| -| \*"))
Breaking down the match pattern " \(|\/| -| \*":
" \(" matches a space followed by a left parenthesis
[the backslash escapes the "(", a special character in regular expressions]
"|" signifies an alternative pattern to match
"\/" matches a forward slash (/)
" -" matches a space followed by a dash (-)
" \*" matches a space followed by an asterisk (*).
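To see each alternative in the pattern at work, here is a minimal sketch that runs it against a handful of strings. The sample strings and the name DemoPatternAlternatives are made up for illustration and are not taken from the question's data. The sketch uses late binding via CreateObject("VBScript.RegExp"), so it should run without setting any reference.

```vba
Sub DemoPatternAlternatives()
    'Hypothetical demo: the sample strings below are invented for illustration
    Dim re As Object
    Dim samples As Variant, s As Variant

    Set re = CreateObject("VBScript.RegExp")   'late binding, no reference needed
    re.Pattern = " \(|\/| -| \*"
    re.Global = False                          'only the first match matters

    samples = Array("Widget (large)", "Gadget/Blue", "Gizmo - 2 pack", _
                    "Doohickey *new", "Plain name")

    For Each s In samples
        If re.Test(s) Then
            'FirstIndex is zero-based, so it equals the number of characters
            'that precede the match
            Debug.Print Left(s, re.Execute(s)(0).FirstIndex)
        Else
            Debug.Print s                      'no delimiter found: keep the whole string
        End If
    Next s
End Sub
```

Run it from the VBA IDE and look in the Immediate window (Ctrl+G); each line should show the text that precedes the first delimiter, or the whole string where none of the alternatives match.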
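To learn more about regular expressions, see this regular expression tutorial, one of many available on the web.
For the function to work, you need to set a reference to Microsoft VBScript Regular Expressions 5.5. To do this, select Tools/References in the VBA IDE and check this item, which is far down the long list of references.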
Function regexMatch(text As String, rePattern As String) As Long
    'Response to SO post 16591260
    'Adapted from code at http://www.macrostash.com/2011/10/08/
    '   simple-regular-expression-tutorial-for-excel-vba/.
    'Returns the number of characters that precede the first match,
    '   or the length of the string if there is no match.
    Dim regEx As New VBScript_RegExp_55.RegExp
    Dim matches As Variant
    regEx.Pattern = rePattern
    regEx.IgnoreCase = True 'True to ignore case
    regEx.Global = False 'Return just the first match
    If regEx.Test(text) Then
        Set matches = regEx.Execute(text)
        regexMatch = matches(0).FirstIndex 'zero-based index of the match
    Else
        regexMatch = Len(text)
    End If
End Function
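If setting the reference is not an option (for example, when the workbook is shared with others), a late-bound variant is one possible alternative. This is only a sketch: the name regexMatchLateBound is hypothetical, and it assumes the VBScript.RegExp COM class is available on the machine, as it normally is on Windows.

```vba
Function regexMatchLateBound(text As String, rePattern As String) As Long
    'Hypothetical late-bound variant of regexMatch: creates the RegExp
    '   object at run time, so no Tools/References entry is required.
    Dim regEx As Object
    Dim matches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = rePattern
    regEx.IgnoreCase = True     'True to ignore case
    regEx.Global = False        'Return just the first match

    Set matches = regEx.Execute(text)
    If matches.Count > 0 Then
        'FirstIndex is zero-based: the number of characters before the match
        regexMatchLateBound = matches(0).FirstIndex
    Else
        regexMatchLateBound = Len(text)
    End If
End Function
```

It would be used on the worksheet in the same way: =LEFT(A1,regexMatchLateBound(A1," \(|\/| -| \*")). The trade-off is that late binding gives up IntelliSense and compile-time checking on the RegExp object.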
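The following subroutine applies the string extraction to every cell in a specified data column and writes the new strings to a specified result column. Although you could simply call the function for each cell in the data column, every call would incur the overhead of compiling the regular expression, which is the same for all cells. To avoid this overhead, the subroutine splits the match function into two parts: the pattern definition sits outside the loop over the data cells, and the pattern execution sits inside the loop.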
Sub SubRegexMatch()
    'Response to SO post 16591260
    'Extracts from string content of each data cell in a specified source
    '   column of the active worksheet the characters to the left of the first
    '   match of a regular expression, and writes the new string to corresponding
    '   rows in a specified result column.
    'Set the regular expression, source column, result column, and first
    '   data row in the "parameters" section
    'Regex match code was adapted from http://www.macrostash.com/2011/10/08/
    '   simple-regular-expression-tutorial-for-excel-vba/
    Dim regEx As New VBScript_RegExp_55.RegExp, _
        matches As Variant, _
        regexMatch As Long 'position of character *just before* match
    Dim srcCol As String, _
        resCol As String
    Dim srcRng As Range, _
        resRng As Range
    Dim firstRow As Long, _
        lastRow As Long
    Dim srcArr As Variant, _
        resArr() As String
    Dim i As Long

    'parameters
    regEx.Pattern = " \(|\/| -| \*" 'regular expression to be matched
    regEx.IgnoreCase = True
    regEx.Global = False 'return only the first match found
    srcCol = "A" 'source data column
    resCol = "B" 'result column
    firstRow = 2 'set to first row with data

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, srcCol).End(xlUp).Row
        Set srcRng = .Range(srcCol & firstRow & ":" & srcCol & lastRow)
        Set resRng = .Range(resCol & firstRow & ":" & resCol & lastRow)
        srcArr = srcRng
        ReDim resArr(1 To lastRow - firstRow + 1)
        For i = 1 To srcRng.Rows.Count
            If regEx.Test(srcArr(i, 1)) Then
                Set matches = regEx.Execute(srcArr(i, 1))
                regexMatch = matches(0).FirstIndex
            Else
                regexMatch = Len(srcArr(i, 1)) 'return length of original string if no match
            End If
            resArr(i) = Left(srcArr(i, 1), regexMatch)
        Next i
        resRng = WorksheetFunction.Transpose(resArr) 'assign result to worksheet
    End With
End Sub
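I have been trying various combinations of functions such as InStr, Mid, LEFT, etc., but I can't seem to get it to work. I can't get the loop defined correctly, nor work out how to find the character, extract to the adjacent cell, move on to the next cell, and so on. – Kinchit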