2012-08-16 72 views

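PyBrain reinforcement learning: maze and graph

I'm trying to implement something similar to a maze problem in PyBrain. However, it is closer to a set of rooms with an emergency exit, where you leave an agent in one of the rooms and it has to find the exit. To turn this into something a computer can work with, I could use a bidirectional graph whose weights describe the paths between the rooms.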
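A minimal sketch of such a graph in plain Python, independent of PyBrain. The room names (A to F, with F standing in for the exit), the connections, and the weights are all made-up assumptions for illustration:

```python
# Hypothetical layout: each tuple is (room, room, weight of the path between them).
# None of these values come from a real floor plan.
EDGES = [
    ("A", "B", 1.0),
    ("B", "F", 2.0),
    ("C", "D", 1.5),
    ("D", "E", 1.0),
    ("E", "F", 0.5),
]


def build_graph(edges):
    """Build a bidirectional weighted graph as a dict of dicts."""
    graph = {}
    for room_a, room_b, weight in edges:
        # Store the edge in both directions so the agent can walk either way.
        graph.setdefault(room_a, {})[room_b] = weight
        graph.setdefault(room_b, {})[room_a] = weight
    return graph


def neighbours(graph, room):
    """Return the rooms directly reachable from `room`, sorted by name."""
    return sorted(graph.get(room, {}))


graph = build_graph(EDGES)
```

With this structure, `neighbours(graph, "B")` lists the rooms the agent could move to from B, and `graph["B"]["F"]` gives the weight of that path.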

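I've tried to implement a new environment, but I'm a bit lost about what should go where. For example, based on the abstract Environment class, I came up with this: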

#!/usr/bin/python2.7

from pybrain.rl.environments.environment import Environment


class RoomEnv(Environment):
    # Number of action values accepted by the environment.
    # Two actions: go forward or go back through a door
    # (but how do we know which room is connected to which?)
    indim = 2
    # Maybe a matrix where 0 is no connection and 1 is a connection(?)
    #                 A,B,C,D,E,F
    # indim = array([[0,0,0,0,0,0],  # A
    #                [0,0,0,0,0,1],  # B
    #                [0,0,0,0,0,0],  # C
    #                [0,0,0,0,0,0],  # D
    #                [0,0,0,0,0,1],  # E
    #                [0,0,0,0,0,1],  # F
    #                ])

    # The number of sensors is the number of rooms.
    outdim = 6

    def getSensors(self):
        # Initial state:
        # could be any room, maybe something random(?)
        pass

    def performAction(self, action):
        # We should look at all possible states to learn the best option for reaching the outside state.
        # Maybe a for loop that goes through all the paths and uses some weight to find the best option?
        print("Action performed: %s" % action)

    def reset(self):
        # Most environments implement this optional method, which allows for reinitialization.
        pass

Regards

Answer

In PyBrain, you can define the rooms as an array and then pass that structure to Maze to create a new environment. For example:

from numpy import array
from pybrain.rl.environments.mazes import Maze

# layout of the rooms: 1 is a wall, 0 is a free cell
structure = array([[1, 1, 1, 1, 1, 1, 1, 1, 1],
                   [1, 0, 0, 1, 0, 0, 0, 0, 1],
                   [1, 0, 0, 1, 0, 0, 1, 0, 1],
                   [1, 0, 0, 1, 0, 0, 1, 0, 1],
                   [1, 0, 0, 1, 0, 1, 1, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1, 0, 1],
                   [1, 1, 1, 1, 1, 1, 1, 0, 1],
                   [1, 0, 0, 0, 0, 0, 0, 0, 1],
                   [1, 1, 1, 1, 1, 1, 1, 1, 1]])

# defining the environment; the second argument is the goal cell
environment = Maze(structure, (7, 7))

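In the example above, 1 represents a wall and 0 represents a grid cell the agent can walk on. So you can modify the structure to build your own layout.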
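To make the 1/0 convention concrete, here is a small sketch in plain Python (no PyBrain needed) that checks whether a cell is walkable and lists its walkable neighbours. The helper names `is_free` and `free_neighbours` are hypothetical, and the sketch assumes cells are addressed as (row, column):

```python
# Same layout as the structure above, written as a plain list of lists.
STRUCTURE = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]


def is_free(grid, cell):
    """Return True if `cell` (row, column) lies inside the grid and is not a wall."""
    row, col = cell
    inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
    return inside and grid[row][col] == 0


def free_neighbours(grid, cell):
    """Return the walkable cells directly above, below, left and right of `cell`."""
    row, col = cell
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [c for c in candidates if is_free(grid, c)]
```

A check such as `is_free(STRUCTURE, (7, 7))` is a quick way to confirm that the goal cell you pass to the environment is not inside a wall after you edit the layout.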