2016-11-03 41 views
1

我知道這不是本網站的理想問題,但根據社區wiki(這裏:https://meta.stackexchange.com/questions/129598/which-computer-science-programming-stack-exchange-do-i-post-in)概述的準則,我覺得它符合算法。如果位置不合適,請註明移動位置,或者發表評論,我會適當刪除。通過最短前綴的組號碼

我有一個數字列表,我需要通過儘可能短的共同起始數字進行分組。

在所有的數字下面的例子可以通過12以後內容將通過公司A所擁有分組,12:

120 CompanyA 
121 CompanyA 
122 CompanyA 
123 CompanyA 
124 CompanyA 
125 CompanyA 
126 CompanyA 
127 CompanyA 
128 CompanyA 
129 CompanyA 

給我的數據更真實的樣本(數字是3和6之間位):

3734 CompanyA 
3735 CompanyA 
375 CompanyA 
3760 CompanyA 
3761 CompanyA 
3762 CompanyA 
3763 CompanyA 
3764 CompanyA 
3765 CompanyA 
3766 CompanyA 
3767 CompanyA 
3768 CompanyA 
3769 CompanyA 
3770 CompanyA 
3771 CompanyA 
3773 CompanyB 
3774 CompanyB 
3775 CompanyB 
3776 CompanyB 
3778 CompanyB 
33045 CompanyB 
361 CompanyB 

應該改爲:

3734 CompanyA 
3735 CompanyA 
375 CompanyA 
376 CompanyA 'All numbers from 3760 to 3769 have been condensed to 1 number 
3770 CompanyA 
3771 CompanyA 
3773 CompanyB 
3774 CompanyB 
3775 CompanyB 
3776 CompanyB 
3778 CompanyB 
33045 CompanyB 
361 CompanyB 

這是一個問題,必須滲透到多個行業,我想有一種算法,我可以適應VBA沒有太多的困難。然而,我正在爲邏輯而掙扎。

如果有人能指出我在這個正確的方向,這將不勝感激。如果有人能指出我正確的方向,我會很高興地適應和發佈VBA的答案,但不幸的是,我的搜索功能讓我失望。

+0

那我相信你會找到的東西,你可以複製和粘貼。很可能你需要自己創建一個解決方案。 – Tomalak

+0

是否有邏輯,多麼深的分組他們應該還是應該去申請?例如,爲什麼不「公司3734」和「3735公司」分組踏踏實實地「373公司」。另外,你怎麼知道一個數字代表多少個數字?難道是一直被認爲是填充爲9與0右側的長度可見的數字? – Blackhawk

+0

@Blackhawk數字可以達到8位,而第三位的4位數字爲包括一個它必須涵蓋所有排列,即它必須是明確表示,公司旗下擁有公司3730-3739分組踏踏實實地373如果這有意義? – User632716

回答

2

所以花了一點時間比我預料的,但在這裏!如果您之前沒有與Tries一起工作,我建議reading the Wikipedia article。基本上,樹中的每個級別代表數字的一個字符。當數字的末尾向下通過樹時,它是一片葉子,這是存儲值(公司名稱)的位置。無可否認,我在評論代碼方面做得很差,所以如果有什麼特別的你想知道,評論和我可以擴展它。

首先,創建一個clsTrieNode類,像這樣:

Option Explicit 

Public parent As clsTrieNode 
Public value As String 
Public count As Long 
Public digit As String 

'Arrays are not allowed to be public members of classes, sadly 
Private m_children(0 To 9) As clsTrieNode 

Public Property Get children(i As Byte) As clsTrieNode 
    Set children = m_children(i) 
End Property 

Public Property Set children(i As Byte, node As clsTrieNode) 
    Set m_children(i) = node 
End Property 

接下來,創建clsNumberTrie像這樣:

Option Explicit 

Private head As clsTrieNode 

Private Sub Class_Initialize() 
    Set head = New clsTrieNode 
End Sub 

Public Sub Add(key As String, value As String) 
    Dim temp As clsTrieNode 
    Set temp = head 
    Dim i As Long 
    Dim key_digit As Byte 
    For i = 1 To Len(key) 
     key_digit = Val(Mid(key, i, 1)) 
     If Not temp.children(key_digit) Is Nothing Then 
      Set temp = temp.children(key_digit) 
     Else 
      Set temp.children(key_digit) = New clsTrieNode 
      Set temp.children(key_digit).parent = temp 
      Set temp = temp.children(key_digit) 
      temp.digit = key_digit 'implicit string conversion 
     End If 
    Next 
    temp.value = value 
    mergeTrieUpwards temp.parent 
End Sub 

Private Sub mergeTrieUpwards(node As clsTrieNode) 
    If isMergeable(node) Then 
     node.value = node.children(0).value 
     Dim i As Byte 
     For i = 0 To 9 
      Set node.children(i) = Nothing 
     Next 
     mergeTrieUpwards node.parent 
    End If 
End Sub 

Private Function isMergeable(node As clsTrieNode) As Boolean 
    Dim i As Byte 
    'Firstly, node must be defined (e.g., not the parent of head) 
    If node Is Nothing Then 
     isMergeable = False 
     Exit Function 
    End If 

    For i = 0 To 9 
     'Secondly, all children must be defined 
     If node.children(i) Is Nothing Then 
      isMergeable = False 
      Exit Function 
     'Thirdly, all children must be leaves 
     ElseIf node.children(i).value = "" Then 
      isMergeable = False 
      Exit Function 
     End If 
    Next 
    isMergeable = True 
End Function 

Public Function toString() As String 
    Dim strKey As String 
    Dim strOutput As String 
    If Not head Is Nothing Then 
     strOutput = toStringRecurse("", head) 
    End If 
    toString = strOutput 
End Function 

Private Function toStringRecurse(prefix As String, node As clsTrieNode) As String 
    Dim strOutput As String 
    Dim i As Byte 
    If node.value <> "" Then 
     toStringRecurse = prefix & node.digit & " " & node.value & vbCrLf 
     Exit Function 
    Else 
     For i = 0 To 9 
      If Not node.children(i) Is Nothing Then 
       strOutput = strOutput & toStringRecurse(prefix & node.digit, node.children(i)) 
      End If 
     Next 
     toStringRecurse = strOutput 
    End If 
End Function 

最後,運行它針對你輸入數字,我有在以下一個名爲mdlMain的模塊。我推出了我自己的Split,因爲內置的Split不支持忽略連續的分隔符,並且您的輸入具有可變數量的空格。

Public Sub Main() 
    Dim input_data As String 
    input_data = "3734 CompanyA" & vbCrLf & _ 
       "3735 CompanyA" & vbCrLf & _ 
       "375 CompanyA" & vbCrLf & _ 
       "3760 CompanyA" & vbCrLf & _ 
       "3761 CompanyA" & vbCrLf & _ 
       "3762 CompanyA" & vbCrLf & _ 
       "3763 CompanyA" & vbCrLf & _ 
       "3764 CompanyA" & vbCrLf & _ 
       "3765 CompanyA" & vbCrLf & _ 
       "3766 CompanyA" & vbCrLf & _ 
       "3767 CompanyA" & vbCrLf & _ 
       "3768 CompanyA" & vbCrLf & _ 
       "3769 CompanyA" & vbCrLf & _ 
       "3770 CompanyA" & vbCrLf & _ 
       "3771 CompanyA" & vbCrLf & _ 
       "3773 CompanyB" & vbCrLf & _ 
       "3774 CompanyB" & vbCrLf & _ 
       "3775 CompanyB" & vbCrLf & _ 
       "3776 CompanyB" & vbCrLf & _ 
       "3778 CompanyB" & vbCrLf & _ 
       "33045 CompanyB" & vbCrLf & _ 
       "361 CompanyB" 

    Dim companyTrie As clsNumberTrie 
    Set companyTrie = New clsNumberTrie 

    Dim rows As Variant 
    Dim row As Variant 

    rows = SplitStr(input_data, vbCrLf) 

    Dim i As Long 
    For i = 0 To UBound(rows) 
     row = SplitStr(CStr(rows(i)), " ", True) 
     companyTrie.Add CStr(row(0)), CStr(row(1)) 
    Next 

    Debug.Print companyTrie.toString 

End Sub 

'This implementation of split has supports ignoring consecutive delimiters 
Public Function SplitStr(str As String, delim As String, Optional treatSuccessiveDelimitersAsOne = False) As Variant 
    'This is not an optimal implementation: 
    '1. Resizing an array is expensive because it requires copying the whole thing. 
    '2. String concatenation has the same problem; new memory is allocated to hold the result, and then both strings are copied to this new location. 
    'Thankfully, with the small strings in this example, it doesn't matter too much. 

    Dim i As Long 
    Dim outArr() As String 

    ReDim outArr(0 To 0) 

    'Iterate through the string 
    For i = 1 To Len(str) 
     'If the current character is the start of the delimiter... 
     If Mid(str, i, 1) = Mid(delim, 1, 1) Then 
      'Check and see if the whole delimiter is there... 
      If isSubstringDelim(str, i, delim) Then 
       'If so, jump i past the delimiter and add a new slot to the split array 
       i = i + Len(delim) 
       ReDim Preserve outArr(0 To (UBound(outArr) + 1)) 
       'Check to see if there are multiple delimiters in a row... 
       While isSubstringDelim(str, i, delim) 
        i = i + Len(delim) 
        'If treatSuccessiveDelimitersAsOne is False, we add a blank string to the split array each time we encounter a successive delimiter. 
        'If it's true, just consume the delimiters without creating a blank string 
        If Not treatSuccessiveDelimitersAsOne Then 
         ReDim Preserve outArr(0 To (UBound(outArr) + 1)) 
        End If 
       Wend 
      End If 
     End If 
     'Add the current character to the current slot of the split array 
     outArr(UBound(outArr)) = outArr(UBound(outArr)) + Mid(str, i, 1) 
    Next 

    SplitStr = outArr 
End Function 

Private Function isSubstringDelim(str, index, delim) As Boolean 
    Dim min As Long 
    If (Len(str) - index) < Len(delim) Then 
     isSubstringDelim = False 
     Exit Function 
    End If 
    For i = 1 To Len(delim) 
     If Not (Mid(str, i + index - 1, 1) = Mid(delim, i, 1)) Then 
      isSubstringDelim = False 
      Exit Function 
     End If 
    Next 
    isSubstringDelim = True 
End Function 

由於訪問節點的方式,結果以字母順序輸出。需要注意的是它支持遞歸分組,因此,如果您通過3358當初公司爲3351,但你也33591已通過33599的公司,它會挽起3359,然後再彙總335

33045 361 CompanyB CompanyB


3735 3734公司公司公司

375 376公司

3770公司公司3771 3773 CompanyB

CompanyB
CompanyB 3775 3776 3778 CompanyB
CompanyB

+0

這看起來像一個很好的迴應,我爲你把在 - 這就是它遠遠超出了我的預期的時間和精力表示感謝。沒有我有時間去通過這一點,我將在今晚的目標,更可能接受的答案。再次感謝你 – User632716

0

你可以測試字符串中字符的位置,所以如果你測試'37'並且它出現在第一個位置,你的刺就從37開始,你可以把它添加到你的列表中,移動它,你想要做的。

If InStr(yourString,"37") < 2 Then 
    'do whatever 
End If 

您可能需要玩弄確切的if語句和數字,這只是爲了向您展示大致的想法。

對不起,只是讀到底部,看到一些從37開始,但是是一個不同的公司。對於那些我會以同樣的方式在嵌套if中測試第三個字符,並將它們分開。