此最佳匹配可以在O解決(| S | + | L | + k)的時間,其中k是從L的S.所有串的匹配的總數有兩個主要步驟:
運行Aho-Corasick。這會給你所有來自S的字符串的匹配。這與上面提到的相同。
初始化一個數組A,長度爲整數| S | +1全部爲零。3月通過陣列,在位置i我設置A [i]爲A [i-1](如果它較大),則對於每個匹配,M,從位於第i個位置的S中的L開始,將A [i + | M |]設置爲A [i + | M |]和A [i] + | M |的最大值。
下面是Go中的一些代碼,完全是這樣。它使用我寫的一個包裝,它有一個方便的包裝來呼叫Aho-Corasick。
package main
import (
"fmt"
"github.com/runningwild/stringz"
)
func main() {
input := []byte("thenuke")
patterns := [][]byte{[]byte("hen"), []byte("thenu"), []byte("uke")}
// This runs Aho-Corasick on the input string and patterns, it returns a
// map, matches, such that matches[i] is a list of indices into input where
// patterns[i] matches the input string. The running time of this is
// O(|input| + |patterns| + k) and uses O(|patterns| + k) auxillary storage,
// where k is the total number of matches found.
find := stringz.FindSet(patterns)
matches := find.In(input)
// We want to invert the map so that it maps from index to all strings that
// match at that index.
at_pos := make([][]int, len(input)+1)
for index, hits := range matches {
for _, hit := range hits {
at_pos[hit] = append(at_pos[hit], index)
}
}
// Now we do a single pass through the string, at every position we will
// keep track of how many characters in the input string we can match up to
// that point.
max := make([]int, len(input)+1)
for i := range max {
// If this position isn't as good as the previous position, then we'll use
// the previous position. It just means that there is a character that we
// were unable to match anything to.
if i > 0 && max[i-1] > max[i] {
max[i] = max[i-1]
}
// Look through any string that matches at this position, if its length is
// L, then L positions later in the string we can have matched L more
// character if we use this string now. This might mean that we don't use
// another string that earlier we thought we'd be matching right now, we'll
// find out later which one was better.
for _, hit := range at_pos[i] {
L := len(patterns[hit])
if i+L < len(max) && max[i+L] < max[i]+L {
max[i+L] = max[i] + L
}
}
}
fmt.Printf("%v\n", max)
}
爲每一次嘗試保留一個計數器,然後找到最佳匹配?就像運行''''一樣,如果你能找到'nuke',返回2,然後運行'then'並立即返回1,以便立即檢查1是否較大,顯然不是並且保持前一個是最大值。 ..並繼續做下去,直到你完成! –