DFA建設克努特莫里斯普拉特算法

我這裏指的是克努特莫里斯 - 普拉特（KMP）算法在塞奇威克的書「算法」字符串搜索的輪廓（第四版）。DFA建設克努特莫里斯普拉特算法

的KMP算法使用基於確定性有限自動機（DFA）中字符串搜索備份。據我所知，DFA是如何進入的算法，但我不明白如何構建的 DFA，這是由下面的代碼片段完成：

dfa[pat.charAt(0)][0] = 1; 
for (int X = 0; j = 1; j< M; j++) { 
    for (int c = 0; c < R; c++) { 
     dfa[c][j] = dfa[c][X]; 
    } 
    dfa[pat.charAt(j)][j] = j+1; 
    X = dfa[pat.charAt(j)][X]; 
}

其中M是模式的長度pat和R字母表的大小。 charAt()函數返回相應位置上字符的整數值。

有人可以用什麼方式這段代碼構造一個DFA解釋一下嗎？我迷失在內循環的實際直觀意義中。

來源

2015-05-30 madison54

dfa是一個包含故障鏈接的狀態表。看到這個問題：http：//stackoverflow.com/questions/23260938/knuth-morris-pratt-fail-table？rq = 1 – Bytemain

讓我們來看看下面的足總模式ACACAGA。

enter image description here

上面圖代表圖案ACACAGA的圖形和表格表示。

在此，在DFA狀態的數量爲M + 1，其中M是該圖案的長度。構造DFA的主要內容是從當前狀態爲每個可能的字符獲取下一個狀態。給定一個字符x和一個狀態k，我們可以通過考慮字符串「pat [0..k-1] x」來得到下一個狀態，它基本上是模式字符pat [0]，pat 1 ... pat [k- 1]和字符x。這個想法是獲得給定模式的最長前綴的長度，使得該前綴也是「pat [0..k-1] x」的後綴（LPS）。長度值給我們下一個狀態。

例如，讓我們來看看如何從上面的圖中的當前狀態，5和字符「C」得到下一個狀態。我們需要考慮字符串「pat [0..5] C」，它是「ACACAC」。模式的最長前綴的長度使得前綴是「ACACAC」的後綴是4（「ACAC」）。因此，字符'C'的下一個狀態（來自狀態5）爲4。

// dfa[i][j] = k denotes the transition function will go k'th state 
// with character i from state j 

// DFA will go always (i + 1)'th state from i'th state 
//if the character c is equal to current character of pattern 
dfa[pat.charAt(0)][0] = 1; 

// here X is initialized with LPS (longest prefix suffix) 
// of pattern[0....j - 1]. LPS[0] is always 0 
for (int X = 0; j = 1; j< M; j++) { 
    for (int c = 0; c < R; c++) { // for all possible characters 
     // transition function from j'th state taking character c 
     // will go to the same state where it went from X(LPS)'th state 
     // taking character c (justify it with an example) 
     dfa[c][j] = dfa[c][X]; 
    } 
    // DFA will go always (i + 1)th state from i'th state if 
    // the character c is equal to current character of pattern 
    dfa[pat.charAt(j)][j] = j + 1; 
    X = dfa[pat.charAt(j)][X]; // update LPS 
}

來源

2015-05-30 16:15:53

DFA建設克努特莫里斯普拉特算法

回答

相關問題