2016-08-03 69 views
0

我是新的Excel VBA代碼,我需要幫助優化此代碼。它完全符合我的要求,但需要將近30秒才能運行,這對最終用戶來說是不可接受的。Optomizing Excel多維循環的VBA

目的是評估一個單詞與輸入是句子的頻率。在「Raw」表中,第一列是整個句子。第二個是句子中有多少單詞的計數。第三百是句子中的第一,第二,第三......字。一次最多可分析1000個句子。

然後,只有它們是唯一的,它纔會粘貼到「OneColumn」的第一列中。我嘗試粘貼所有內容,然後刪除重複內容,但運行時間大約爲45秒。

我當然願意用其他方式來分析一個單詞的使用頻率,但我無法弄清楚如何讓它在單元格內檢查而不打破它們。

我將不勝感激任何幫助。

Option Explicit 

Sub ListUniqueWords() 
Dim i As Integer 
Dim j As Integer 
Dim k As Integer 

Dim StartTime As Double 
Dim SecondsElapsed As Double 
    StartTime = Timer 

i = 2 
j = 3 
k = 2 

'i=row j=column k=paste into row 

    Do While i < 1001 
    j = 3 
      Do While j < 103 
          If Sheets("Raw").Cells(i, j).Value <> "" And Sheets("Raw").Cells(i, j).Value <> " " And Sheets("OneColumn").Range("A2:A2000").Find(Sheets("Raw").Cells(i, j), LookAt:=xlWhole) Is Nothing Then 
            Worksheets("Raw").Activate 
            Cells(i, j).Select 
            Selection.Copy 
            Worksheets("OneColumn").Activate 
            Cells(k, 1).Activate 
            ActiveCell.PasteSpecial Paste:=xlPasteValues 
            k = k + 1 
            j = j + 1 
           Else 
            j = j + 1 
           End If 
      Loop 
      i = i + 1 
    Loop 
SecondsElapsed = Round(Timer - StartTime, 2) 
    MsgBox "This code ran successfully in " & SecondsElapsed & " seconds", vbInformation 

End Sub 
+1

如果您的代碼正常工作,但您希望它得到改進,那麼您應該考慮刪除此處的問題並將其發佈到http://codereview.stackexchange.com/。 –

+0

我甚至不知道存在。非常感謝你John! –

+0

知道該網站存在是有用的,但現在你已經至少有一個答案,你應該把它留在這裏,因爲交叉發佈是不受歡迎的。 –

回答

0

我打算假定所有的句子都是單行,並且在單詞之間包含一個空格。將名爲「輸出」的工作表添加到工作簿中。在單元格A1中鍵入一個標題(例如「Word」),並在單元格B2中鍵入一個標題(例如「Count」)。下面的句子將輸出A列中的單詞和B列中單詞的計數,然後進行排序,最常用的是排在最前面。根據你有多少數據,這應該需要一兩秒鐘才能運行。

注意:您需要添加對「Microsoft Scripting Runtime」庫的引用。

Sub Example() 
Dim X As Variant, S As Variant, key As Variant 
Dim str As String 
Dim oDict As Scripting.Dictionary 
Dim i As Double, j As Double, k As Double 
Dim Anchor As Range 

Set oDict = New Scripting.Dictionary 

With ThisWorkbook 
    'Clear past output 
    With .Sheets("Output") 
     .Range("a2:" & .Cells(.Rows.Count, .Columns.Count).Address).ClearContents 
    End With 

'Fill array with text to search 
    With .Sheets("Raw") 
     X = .Range("a2:a" & .Range("a" & .Rows.Count).End(xlUp).Row).Value2 
    End With 
End With 

For i = LBound(X,1) To UBound(X,1) 
    S = Split(X(i,1), " ") 

    For j = LBound(S, 1) To UBound(S, 1) 
     If oDict.Exists(S(j)) Then 
      oDict.Item(S(j)) = oDict.Item(S(j)) + 1 
     Else 
      oDict.Add S(j), 1 
     End If 
    Next j 
Next i 

'Output results to sheet "Output" 
With ThisWorkbook.Sheets("Output") 
For Each key In oDict.Keys 
    Set Anchor = .Range("a" & Rows.Count).End(xlUp).Offset(1, 0) 
    Anchor = key 
    Anchor.Offset(0, 1) = oDict.Item(key) 
Next key 

.Range("a1:" & .Range("a" & .Rows.Count).End(xlUp).Offset(0, 1).Address).Sort .Range("b:b"), xlDescending 
End With 

End Sub 

編輯:

這是我完全的,純粹的代碼。請注意,工作簿和工作表參考不會根據您的目的進行更新。要使用RegExp,您需要添加對「Microsoft VBScript Regular Expressions 5.5」庫的引用。我使用「5.5」,但我相信任何會爲此工作。

Sub Example() 
Dim X As Variant, S As Variant, S2 As Variant, S3 As Variant, key As  Variant 
Dim oDict As Scripting.Dictionary 
Dim i As Double, j As Double, k As Double 
Dim Anchor As Range 
Dim oReg As New RegExp 
Dim str As String 
Dim st As Single 

Application.ScreenUpdating = False 


st = Timer 
Set oDict = New Scripting.Dictionary 

With ThisWorkbook 
'Clear past output 
    With .Sheets("Output") 
     .Range("a2:" & .Cells(.Rows.Count, .Columns.Count).Address).ClearContents 
    End With 

    'Fill array with text to search 
    With .Sheets("Input") 
     X = .Range("a2:a" & .Range("a" & .Rows.Count).End(xlUp).Row).Value2 
    End With 
End With 

With oReg 
    .Global = True 
    .IgnoreCase = True 
    .Pattern = "[^\w\s]" 
End With 

For i = LBound(X, 1) + 1 To UBound(X, 1) 
    'Get rid of none letter and white space 
      str = oReg.Replace(X(i, 1), "") 


    S = Split(str, " ") 

    For j = LBound(S, 1) To UBound(S, 1) 
     If oDict.Exists(LCase(S(j))) Then 
      oDict.Item(LCase(S(j))) = oDict.Item(LCase(S(j))) + 1 
     Else 
      oDict.Add LCase(S(j)), 1 
     End If 
    Next j 
Next i 

'Output results to sheet "Output" 
With ThisWorkbook.Sheets("Output") 
    For Each key In oDict.Keys 
     Set Anchor = .Range("a" & Rows.Count).End(xlUp).Offset(1, 0) 
     Anchor = key 
     Anchor.Offset(0, 1) = oDict.Item(key) 
    Next key 

     .Range("a1:" & .Range("a" & .Rows.Count).End(xlUp).Offset(0, 1).Address).Sort .Range("b:b"), xlDescending 
End With 

Debug.Print Timer - st 

Application.ScreenUpdating = True 
End Sub 
+0

這是無法運行。你正在拆分你從未定義的'str',這樣就會導致錯誤。而且,你總是隻索引第一個子字符串'S(1)'而不是使用你定義的'j'的循環變量,所以除了每個句子中的第一個單詞之外,你永遠不會計算任何內容。 – Mikegrann

+0

對我來說,它根本不會產生任何輸出。儘管我對許多這些腳本不太熟悉,但仍然可以查明具體問題。感謝Mikegrann指出了潛在的問題。 –

+0

@Comintern它確實進行了編譯,因爲str在頂部被聲明爲Variant。這絕對是一種糟糕的形式,但至少它可以在發佈答案之前通過編譯器檢查。至少,VBE編譯並運行它(當然沒有輸出),就像我一樣。 – Mikegrann

0

您的函數需要很長時間才能運行,因爲您在Excel表格中單元格操作。此方法不會將任何數據拉入RAM內存(快速)。只需將感興趣的列插入到數組或列表中即可。以與您的功能相同的方式在列表上進行操作。這將大大加速它的運作。 例如,

Dim Whole_Sentence_List As New Collection 
Dim Word_Count_List As New Collection 
Dim Array_of_Words_List As New Collection 

Array_of_Words_List的是,你可以把句子的單詞變成一個個而不是3,4,5 ...... 100列數組的集合。玩一段時間的集合,以瞭解它們的工作方式