2015-12-02 94 views
3

我有一個工作表,其中包含約8000行已被過濾。我試圖從工作表列中刪除重複項來獲取值的集合。通過閱讀這篇文章,有兩種方法可以做到這一點。如果該值不在新集合中,則循環收集並複製到新集合。
或者將列中的數據複製到臨時電子表格中,過濾並將數據複製到另一列,然後將其添加到集合中。Excel VBA刪除重複vs篩選器

當處理大量數據時,複製過濾器具有最佳性能,但由於必須創建新工作表,因此該過濾器非常笨重。

我還沒有看到它,但是有沒有辦法在內存中做複製過濾器,而不是創建一個工作表來做到這一點?

重述:

Sub GetColumnValues(Ws As Worksheet, Column As Long, CollValues As Collection) 

Dim RowIndex As Long 

    For RowIndex = 1 To GetLastRow(Ws) 
     If CollValues.Count = 0 Then 
      CollValues.Add (Ws.Cells(RowIndex, Column).Value) 
     Else 
      If IsInCollection(CollValues, Ws.Cells(RowIndex, Column).Value) = False Then 
       CollValues.Add (Ws.Cells(RowIndex, Column).Value) 
      End If 
     End If 
    Next RowIndex 

End Sub 

過濾器和複製:

Sub GetColumnValues(Ws As Worksheet, Column As Long, CollValues As Collection) 

Dim rowLast As Long 
Dim c As Range 
Dim tmpWS As Worksheet 
Dim tmpWsName As String 

    tmpWsName = "TempWS" 

    Call DeleteWs(TsWb, tmpWsName) 

    Set tmpWS = TsWb.Sheets.Add() 
    tmpWS.Name = tmpWsName 

    rowLast = GetLastRow(Ws) 

    Ws.Range(Ws.Cells(1, Column), Ws.Cells(rowLast, Column)).Copy 
    tmpWS.Range("A1").PasteSpecial 

    rowLast = GetLastRow(tmpWS) 
    tmpWS.Range(tmpWS.Cells(1, 1), tmpWS.Cells(rowLast, 1)).AdvancedFilter _ 
     Action:=xlFilterCopy, _ 
     CopyToRange:=tmpWS.Range("B1"), _ 
     Unique:=True 

    rowLast = GetLastRow(tmpWS) 

    For Each c In tmpWS.Range(tmpWS.Cells(1, 2), tmpWS.Cells(rowLast, 2)) 
     If Len(c.value) > 0 Then 
      CollValues.Add (c.value) 
     End If 
    Next c 

    Call DeleteWs(TsWb, tmpWsName) 
End Sub 

回答

0

我不知道爲什麼它必須是一個集合,但要獲得快速的讓所有值的數組無(過濾列表),雙打,你可以做這樣的:(非常接近你的第一個例子)

Function GetColVal(Ws As Worksheet, Column As Long) As Variant 
    Dim runner As Variant, outputVal() As Variant, i As Long 
    ReDim outputVal(Ws.Range(Ws.Cells(1, Column), Ws.Cells(GetLastRow(Ws), Column)).SpecialCells(xlCellTypeVisible).Count) 
    For Each runner In Ws.Range(Ws.Cells(1, Column), Ws.Cells(GetLastRow(Ws), Column)).SpecialCells(xlCellTypeVisible) 
    If i = 0 Then 
     outputVal(0) = runner.Value: i = 1 
    Else 
     If IsError(Application.Match(runner.Value, outputVal, 0)) Then outputVal(i) = runner.Value: i = i + 1 
    End If 
    Next 
    ReDim Preserve outputVal(i - 1) 
    GetColVal= outputVal 
End Function 

Application.Match是在VBA最快的功能之一,而IsInCollection可以是非常慢......更好的運行For Each ...循環到集合中添加的一切不是檢查集合...

Dim a As Variant 
For Each a in GetColVal(Worksheets("SheetX"),7) 
    MyCollection.Add a 
Next 

應該比你的例子快得多...仍然我推薦不使用集合,特別是如果你只是使用值...如果可能的話,更好地使用GetColVal -array ...
variantVariable = GetColVal(Worksheets("SheetX"),7)然後使用變量變量你想幹什麼就幹什麼(你也可以粘貼在片直接某處)

一個簡單的輸出到工作表是這樣的:

Dim a As Variant 
a = GetColVal(Worksheets("Sheet1"),13) 'values from sheet1 column M 
'pasting in one row (starting at A1 in Sheet2) 
ThisWorkbook.Sheets("Sheet2").Range(Cells(1, 1), Cells(1, ubound(a) + 1)).value = a 
'pasting in one column (starting at C5 in Sheet4) 
ThisWorkbook.Sheets("Sheet4").Range(Cells(5, 3), Cells(ubound(a) + 5, 3)).value = Application.Transpose(a) 

編輯

要顯示不同的東西:

Function GetColumnValues(ws As Worksheet, Column As Long) As Range 
    With ws 
    Dim srcRng As Range, outRng As Range, runRng1 As Range, runRng2 As Range, dBool As Boolean 
    Set srcRng = .Range(.Cells(1, Column), .Cells(GetLastRow(ws), Column)).SpecialCells(xlCellTypeVisible) 
    For Each runRng1 In a 
     If outRng Is Nothing Then Set outRng = runRng1 
     For Each runRng2 In outRng 
     If Intersect(runRng1, runRng2) Is Nothing Then 
      If runRng2.Value = runRng1.Value Then dBool = True: Exit For 
     End If 
     Next 
     If dBool Then dBool = False Else Set outRng = Union(outRng, runRng1) 
    Next 
    End With 
    Set GetColumnValues = outRng 
End Function 

有了這個功能,你會得到一個範圍內的所有可以選擇或複製到另一個位置(與格式化和其他一切)的細胞。您仍然可以使用For Each ...將所有元素添加到集合中。我也沒有使用Match來避免「Len> 255」 - 錯誤

+0

收集的唯一原因是我從outputVal中刪除重複項後刪除了其他數據。使用集合而不是重新創建新數組更容易。我能想到的唯一方法是創建一個需要刪除的單獨索引數組,然後執行redim並一次複製到新數組。 雖然你已經給了我一些想法,但謝謝。 – SteveP65

+0

我仍在調查此問題。我發現的一個問題是,如果一個Variant數組中有一個字符串長於255個字符,則Application.Match會失敗,並顯示類型不匹配錯誤。如果我將輸出數組更改爲字符串數組,則可以處理整個數據範圍。 – SteveP65

+0

@ SteveP65我添加了另一個功能,你可能想嘗試(應該更快,但我無法確定)...顯然你會得到一個範圍,它可以保留一些你可能想要使用的好處:) –

0

是的,只是做陣列,然後檢查對數組,然後提供您的結果返回到工作表。我個人喜歡在內存中而不是通過應用程序IDE做事情。

它快得多(特別是數萬行),您不必擔心屏幕刷新,或讓用戶想知道如何快速移動所有內容。我通常處理內存中的所有內容,將其交回,然後激活我希望用戶看到的工作表。

dim set1Array() as String 
dim set2Array() as String 
dim set1Rows as Long 
dim set2Rows as Long 
dim lngX as Long 
dim lngY as Long 
dim blnDebug as Boolean; blnDebug = true ' flag for debugging 

' get count of rows so we know how big to make the arrays 
set1Rows = GetLastRow(Ws1) 
set2Rows = GetLastRow(Ws2) 

' set arrays to the proper size 
redim set1Rows(set1Rows - 1, 1)' 1 represents 2 columns since it's 0 based. the second column is a flag for duplicated. 
redim set2Rows(set2Rows - 1, 0)' 0 represents 1 column since it's 0 based 

' load the arrays with the sheet data 
for lngX = 1 to set1Rows 
    set1Rows(lngX - 1, 0) = Worksheets("Sheet1").range("A" & lngX).Text 
next lngX 

for lngX = 1 to set2Rows 
    set2Rows(lngX - 1, 0) = Worksheets("Sheet2").range("A" & lngX).Text 
next lngX 

' I like to do a debug callout here to see what I got to make sure that I am good to go with the dataset 
if blnDebug then 
    for lngX = 0 to Ubound(set1Rows) 
    debug.print "set1Rows(" & lngX & ") - col1: " & set1Rows(lngX, 0) 
    next lngX 

    for lngX = 0 to Ubound(set2Rows) 
    debug.print "set2Rows(" & lngX & ") - col1: " & set2Rows(lngX, 0) 
    next lngX 

end if 

' now do your comparison 

for lngX = 0 to Ubound(set1Rows) 
    for lngY = 0 to Ubound(set2Rows) 
    if set1Rows(lngX, 0) = set2Rows(lngY, 0) then 
     set1Rows(lngX, 1) = "1" 
    end if 
    next lngY 
next lngX 

' now your duplicates are flagged in the set1Rows array 

for lngX = 0 to Ubound(set1Rows) 
    if set1Rows(lngX, 1) = "1" then 
    ' code for duplicated 
    else 
    ' code for unique 
    end if 
next lngX 
+0

正在對一列數據進行比較。所以理想情況下,我想將列複製到數組中,在該數組中刪除任何重複的值。然後我會使用該數組進行額外的處理。 – SteveP65