MiniMax算法的混淆

所以我目前正在做一個關於Mancala和NIM組合遊戲的MiniMax算法的作業。該程序的工作方式是向用戶詢問板的當前狀態，並且該程序將假設吐出用戶應該採取的第一個舉動以贏得比賽。我只是困惑，我是否想用所有可能的解決方案生成整個遊戲樹，並且在葉節點首先具有效用函數，然後讓MiniMax算法遞歸運行，或者在MiniMax算法中創建樹？我很抱歉，如果這個問題很不清楚，但我只是堅持這個想法，我似乎無法理解它。MiniMax算法的混淆

來源

2017-10-12 Reed Sager

在實踐中：這棵樹是即時生成的。有一個重要的原因：你不會使用純粹的最小最大值，但一些alpha-beta類似修剪，因此可能不會搜索整個樹（重要的是：一個好的移動排序）。第二個原因：在大多數遊戲中，你將無法搜索所有狀態（無限深度）;因此使用迭代加深將搜索限制在某些固定的深度/層（在剩餘時間時增加） – sascha

遊戲樹不是顯式生成的，但僅執行遍歷它。永遠不要在minimax執行過程中將整個樹放在內存中。正如sascha所提到的，事情是即時完成的，因爲在任何節點（任何電路板配置）中，您都可以輕鬆生成其後繼狀態。這裏關鍵的一點是，當你在一個電路板配置上應用一個移動（從而獲得另一個電路板配置）時，你實際上會在這個概念性的遊戲樹中移動。 – qwertyman

編寫minimax函數的正確方法是通過製作和取消製作動作遍歷搜索樹。你一次只能存儲一個遊戲狀態，並且通過在該遊戲狀態下製作和取消製作動作來遍歷整個樹。如果這是令人困惑的，那麼看看一些minimax psudocode會有所幫助。請注意，有兩種常用的極大極小，常規極小極小和極小極大極變體。該psudeocode是極小，因爲它更intuituve但在實踐中我會推薦negamax變種，因爲它是非常簡單的：

int max(int depth){ 
if(this state is terminal){//won, lost, drawn, or desired search depth is reached 
    return value 
} 
//if the state is non terminal 
//we want to examine all child nodes. We do this by making all possible moves from this state, calling the min function 
//(all childs of max nodes are min nodes) and then unmaking the moves. 
int bestVal = -infinity; 
generate move list; 
for(all moves in move list){ 
    makeMove(this move in move list); 
    int val = min(depth -1); 
    unMakeMove(this move in move list); 
    bestVal = max(val,bestVal); 
} 
return bestVal;

}

int min(int depth){ 
    if(this state is terminal){//won, lost, drawn, or desired search depth is reached 
     return value 
    } 
    //if the state is non terminal 
    //we want to examine all child nodes. We do this by making all possible moves from this state, calling the max function 
    //(all childs of min nodes are max nodes) and then unmaking the moves. 
    int bestVal = +infinity; 
    generate move list; 
    for(all moves in move list){ 
     makeMove(this move in move list); 
     int val = min(depth -1); 
     unMakeMove(this move in move list); 
     bestVal = min(val,bestVal); 
    } 
    return bestVal; 
}

這樣，你通過保持一個軌道遍歷整個樹遊戲狀態以及在該遊戲狀態下遞歸製作和取消製作動作。一旦你理解了這個alpha beta修剪的外觀。另請注意，此功能僅返回最佳移動的值而非移動本身。你需要一個特殊的函數來跟蹤最好的移動，並且在根上調用。

來源

2017-10-15 01:43:46 chessprogrammer

MiniMax算法的混淆

回答

相關問題