構建字符串圖（Levenshtein距離）

當前，在我的計算機科學課程中，我們正在討論圖以及如何使用圖找出最短距離。大約一週前，我收到了一份作業，老師給我們提供了使用整數的圖形代碼，我們必須調整它才能使用單詞列表計算Levenshtein距離。我遇到的問題是，我不明白圖表如何處理足夠的操作。我已經嘗試過使用谷歌搜索圖表，但沒有發現類似於我給出的程序類型。構建字符串圖（Levenshtein距離）

我們剛剛在鏈表上完成了一個單元，並且我認爲圖的操作方式相似嗎？我知道每個節點都會指向許多其他節點，但是在我有2000個字都指向對方的情況下，如何在沒有在結構中聲明多個節點的情況下跟蹤每個節點2000個指針？我相信（不是100％），在我被授予的課程中，我的老師使用了一個整數向量向量來跟蹤，但我不知道如何實現。

我不是要求任何人充分評論每一行，因爲這是一項巨大的工作，但如果有人可以粗略地解釋我將如何完成我上面提到的問題，也許閱讀代碼並給我一個粗略的理解部分章節意味着什麼（我將對某些章節發表評論，我特別不理解）我將非常感激。

這是我們得到的代碼：

#include <iostream> 
#include <vector> 
#include <algorithm> //for max<> 
#include <limits> 

using namespace std; 

typedef vector <int> ivec; 
typedef vector <ivec> imatrix; //A vector of vectors, not how this works or how to implement 
typedef vector <bool> bvec; 

struct graph 
{ 
    imatrix edges; //list of attached vertices for each node 
    int numVertices; 
}; 

//I understand the ostream overloading 
ostream & operator << (ostream & stream, ivec &vec) 
{ 
    for (int i = 0; i < vec.size(); i++) 
    { 
     stream << vec[i] << " "; 
    } 
    return stream; 
} 

ostream & operator << (ostream & stream, graph &g) 
{ 
    stream << endl << "numVert = " << g.numVertices << endl; 
    for (int i = 0; i < g.numVertices; i++) 
    { 
     stream << "vertex = " << i+1 << " | edges = " << g.edges[i] << endl; 
    } 
    return stream; 
} 

const int sentinel = -1; 

bvec inTree; 
ivec distanceNodes; 
ivec parents; 

void initGraph(graph * g); 
void insertEdge(graph * g, int nodeNum, int edgeNum); 
void initSearch(graph * g); 
void shortestPath(graph * g, int start, int end); 

int main() 
{ 
    //I understand the main, the two numbers in insertEdge are being hooked together and the two numbers in shortestPath are what we are looking to connect in the shortest way possible 
    graph g; 
    initGraph(&g); 
    insertEdge(&g, 1, 2); 
    insertEdge(&g, 1, 3); 
    insertEdge(&g, 2, 1); 
    insertEdge(&g, 2, 3); 
    insertEdge(&g, 2, 4); 
    insertEdge(&g, 3, 1); 
    insertEdge(&g, 3, 2); 
    insertEdge(&g, 3, 4); 
    insertEdge(&g, 4, 2); 
    insertEdge(&g, 4, 3); 
    insertEdge(&g, 4, 5); 
    insertEdge(&g, 5, 4); 
    insertEdge(&g, 6, 7); 
    insertEdge(&g, 7, 6); 
    cout << "The graph is " << g << endl; 
    shortestPath(&g, 1, 5); 
    shortestPath(&g, 2, 4); 
    shortestPath(&g, 5, 2); 
    shortestPath(&g, 1, 7); 
    return 0; 
} 

void initGraph(graph * g) 
{ 
    g -> numVertices = 0; //Why set the number of vertices to 0? 
} 

void insertEdge(graph * g, int nodeNum, int edgeNum) 
{ 
    int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other 
    numVertices = max(1, numVertices); 
    if (numVertices > g->numVertices) 
    { 
     for (int i = g->numVertices; i <= numVertices; i++) 
     { 
      ivec nodes; 
      if (g->edges.size() < i) 
      { 
       g -> edges.push_back(nodes); 
      } 
     } 
     g->numVertices = numVertices; 
    } 
    g->edges[nodeNum - 1].push_back(edgeNum); 
} 

void initSearch(graph * g) //I believe this function simply resets the values from a previous search 
{ 
    if (g == NULL) 
    { 
     return; 
    } 
    inTree.clear(); 
    distanceNodes.clear(); 
    parents.clear(); 
    for (int i = 0; i <= g->numVertices; i++) 
    { 
     inTree.push_back(false); 
     distanceNodes.push_back(numeric_limits <int> :: max()); 
     parents.push_back(sentinel); 
    } 
} 

void shortestPath(graph * g, int start, int end) 
{ 
    //Very confused about how this function works 
    initSearch(g); 
    int edge; 
    int curr; //current node 
    int dist; 
    distanceNodes[start] = 0; 
    curr = start; 
    while (! inTree[curr]) 
    { 
     inTree[curr] = true; 
     ivec edges = g->edges[curr - 1]; 
     for (int i = 0; i < edges.size(); i++) 
     { 
      edge = edges[i]; 

      if (distanceNodes[edge] > distanceNodes[curr] + 1) 
      { 
       distanceNodes[edge] = distanceNodes[curr] + 1; 
       parents[edge] = curr; 
      } 
     } 
     curr = 1; 
     dist = numeric_limits <int> :: max(); 
     for (int i = 1; i <= g->numVertices; i++) 
     { 
      if ((!inTree[i]) && (dist > distanceNodes[i])) 
      { 
       dist = distanceNodes[i]; 
       curr = i; 
      } 
     } 
    } 
    ivec path; 
    if (distanceNodes[end] == numeric_limits <int> :: max()) //is there a numeric_limits <string> :: max? 
    { 
     cout << "No way from " << start << " to " << end << endl; 
    } 
    else 
    { 
     int temp = end; 
     while (temp != start) 
     { 
      path.push_back(temp); 
      temp = parents[temp]; 
     } 
     path.push_back(start); 
     reverse(path.begin(), path.end()); 
     cout << "From " << start << " to " << end << " is " << path << endl; 
    } 
}

如果你能幫忙，那將是最歡迎的，因爲我很可能會與圖表的詳細項目，我掙扎，由於不理解他們。

謝謝特里斯坦

來源

2014-05-14 Tristan

一些答案：

問：你問爲什麼numVertices在下面設置爲0：在g的聲明

void initGraph(graph * g) 
{ 
    g -> numVertices = 0; //Why set the number of vertices to 0? 
}

A.看 - 它默認初始化爲：

int main() 
{ 
    graph g; 
    .... 
}

現在看defi圖表的名稱 - 它沒有構造函數：

struct graph 
{ 
    imatrix edges; //list of attached vertices for each node 
    int numVertices; 
};

因此，默認情況下邊緣會被正確初始化，因爲矢量具有構造函數。但是numVertices是一個原始類型，所以它將包含任何隨機值發生在該內存位置 - 這意味着它需要手動初始化。這就是爲什麼initGraph不需要初始化邊緣，但它確實需要初始化numVertices。

問：你問如何可以找到大兩個標準::串知道MAX（）返回兩個整數中較大的：

int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other

A.根據http://www.cplusplus.com/reference/algorithm/max/ Max使用「功能用途運算符<（或comp，如果提供）來比較這些值。「但std :: strings可以使用<運算符進行比較，所以確實沒有問題。

問：你問到向量的向量：

typedef vector <int> ivec; 
typedef vector <ivec> imatrix; //A vector of vectors, not how this works or how to implement

A.你可以用[]訪問一個矢量，所以如果你有一個名爲x的imatrix類型的變量，你可以說x [0]這將返回一個ivec（因爲這是存儲在一個imatrix矢量中的對象的類型，所以如果你說x [0] [0]，將返回存儲在由x返回IVEC的第一個整數[0]要改變它使用一個字符串只是說：

typedef vector <std::string> ivec; 
typedef vector <ivec> imatrix;

你也可以重命名變量，如果你通緝。

您還需要#include <string>

來源

2014-05-14 02:18:35

謝謝，這是非常有幫助的！我有一個後續問題：你說max可以用來比較沒有問題的字符串，你知道一個字符串比另一個字符串「大」的標準嗎？它是基於長度還是字母值或別的東西？謝謝 – Tristan

typedef vector <ivec> imatrix; //A vector of vectors, not how this works or how to implement

這裏圖表被表示爲Adjacency Matrix。您還可以使用Adjacency List來表示一個圖形，其中每個節點將包含相鄰節點的數組/鏈接列表。

g -> numVertices = 0; //Why set the number of vertices to 0?

它初始化圖形，在啓動時頂點/節點數量爲零。當使用insertEdge方法插入邊和節點時，該號碼將被更新。

int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other

你雖然沒有發佈完整的代碼，我認爲最大的值用於插入邊緣之前添加所需數量的頂點。

ivec nodes; 
if (g->edges.size() < i) 
{ 
    g -> edges.push_back(nodes); 
}

上面的代碼插入新的頂點。您可能會在此處爲您的版本執行integer comparison，而不是string，字符串是節點的數據，而不是節點的編號。如果你需要字符串比較，C++已經爲此重載了運算符。

關於initSearch和最短路徑方法，在這裏後者發現使用的算法（我不知道是哪個，你可以搜索）節點之間的最短路徑，並尋找最短路徑，前者方法初始化之前將用於搜索的值。例如，它可以將每對節點之間的距離設置爲infinity最初，當它們之間找到路徑時，它將被更新。

來源

2014-05-14 02:32:08 Rakib

所以，就我的理解而言，鄰接矩陣就像一個網格，其中每個節點在圖中都有一排，並且每個節點可能具有的邊的列都是連接，連接用1表示，非連接用0表示。它是否正確？ – Tristan

構建字符串圖（Levenshtein距離）

回答

相關問題