我面臨一個應該用Aho-Corasick自動機解決的問題。我給了一組單詞(由'0'或'1'組成) - 模式,我必須決定是否可以創建無限文本,它不包含任何給定的模式。我認爲,解決方案是創建Aho-Corasick自動機並搜索一個沒有匹配狀態的週期,但我無法提出一個好的方法來做到這一點。我想過使用DFS搜索狀態圖,但我不確定它是否會起作用,並且我有一個實現問題 - 讓我們假設,我們處於一個'1'邊的狀態 - 但狀態指出邊緣被標記爲匹配 - 因此我們不能使用該邊緣,我們可以嘗試失敗鏈接(當前狀態不具有'0'邊緣) - 但是我們還必須記住,我們不能從指向狀態的'1'邊緣通過當前的失敗鏈接。Aho-Corasick自動機的尋找週期
任何人都可以糾正我,告訴我該怎麼做?我用C++編寫了Aho-Corasick,我確信它的工作原理 - 我也理解整個算法。
這是基礎代碼:
class AhoCorasick
{
static const int ALPHABET_SIZE = 2;
struct State
{
State* edge[ALPHABET_SIZE];
State* fail;
State* longestMatchingSuffix;
//Vector used to remember which pattern matches in this state.
vector<int> matching;
short color;
State()
{
for(int i = 0; i < ALPHABET_SIZE; ++i)
edge[i] = 0;
color = 0;
}
~State()
{
for(int i = 0; i < ALPHABET_SIZE; ++i)
{
delete edge[i];
}
}
};
private:
State root;
vector<int> lenOfPattern;
bool isFailComputed;
//Helper function used to traverse state graph.
State* move(State* curr, char letter)
{
while(curr != &root && curr->edge[letter] == 0)
{
curr = curr->fail;
}
if(curr->edge[letter] != 0)
curr = curr->edge[letter];
return curr;
}
//Function which computes fail links and longestMatchingSuffix.
void computeFailLink()
{
queue< State* > Q;
root.fail = root.longestMatchingSuffix = 0;
for(int i = 0; i < ALPHABET_SIZE; ++i)
{
if(root.edge[i] != 0)
{
Q.push(root.edge[i]);
root.edge[i]->fail = &root;
}
}
while(!Q.empty())
{
State* curr = Q.front();
Q.pop();
if(!curr->fail->matching.empty())
{
curr->longestMatchingSuffix = curr->fail;
}
else
{
curr->longestMatchingSuffix = curr->fail->longestMatchingSuffix;
}
for(int i = 0; i < ALPHABET_SIZE; ++i)
{
if(curr->edge[i] != 0)
{
Q.push(curr->edge[i]);
State* state = curr->fail;
state = move(state, i);
curr->edge[i]->fail = state;
}
}
}
isFailComputed = true;
}
public:
AhoCorasick()
{
isFailComputed = false;
}
//Add pattern to automaton.
//pattern - pointer to pattern, which will be added
//fun - function which will be used to transform character to 0-based index.
void addPattern(const char* const pattern, int (*fun) (const char *))
{
isFailComputed = false;
int len = strlen(pattern);
State* curr = &root;
const char* pat = pattern;
for(; *pat; ++pat)
{
char tmpPat = fun(pat);
if(curr->edge[tmpPat] == 0)
{
curr = curr->edge[tmpPat] = new State;
}
else
{
curr = curr->edge[tmpPat];
}
}
lenOfPattern.push_back(len);
curr->matching.push_back(lenOfPattern.size() - 1);
}
};
int alphabet01(const char * c)
{
return *c - '0';
}
請張貼你的代碼草圖,建立自動機(請不要完整的代碼 - 只是關鍵部分):它可能很容易修改代碼來尋找循環,因爲它會建立自動機。 – anatolyg 2013-03-03 18:50:00