2017-08-14 69 views
-1

我完全停留在這個錯誤之後,在線教程解釋了在Python中編寫OpenAI腳本的概念。完整的錯誤是:「UnboundLocalError:本地變量'獎勵''在assingment之前引用」在OpenAI Python腳本

TypeError: local variable 'reward_n' referenced before assignment 

這是我當前的代碼:

import gym 
import random 

#reinforcement learning step 
def determine_turn(turn, observation_n, j, total_sum, prev_total_sum, reward_n): 
    #for every 15 iterations, sum the total observations, and take the average 
    #if lower than 0, change the direction 
    #if we go 15+ iterations and get a reward each step, we're doing something right 
    #thats when we turn 
    if(j >= 15): 
     if(total_sum/ j) == 0: 
      turn = True 
     else: 
      turn = False 

     #reset vars 
     total_sum = 0 
     j = 0 
     prev_total_sum = total_sum 
     total_sum = 0 

    else: 
     turn = False 
    if(observation_n != None): 
     #increment counter and reward sum 
     j+=1 
     total_sum += reward_n 
    return(turn, j, total_sum, prev_total_sum) 

def main(): 

    #init environment 
    env = gym.make('CartPole-v0') 
    observation_n = env.reset() 

    #init variables 
    # num of game iterations 
    n = 0 
    j = 0 
    #sum of observations 
    total_sum = 0 
    prev_total_sum = 0 
    turn = False 

    #define our turns or keyboard actions 
    left = 0 
    right = 1 
    forward = env.action_space.sample() 

    #main logic 
    while True: 
     #increment a counter for number of iterations 
     n+=1 

     #if at least one iteration is made, check if turn is needed 
     if(n > 1): 
      #if at least one iteration, check if a turn 
      if(observation_n[0] != None): 
       #store the reward in the previous score 
       prev_score = reward_n[0] 

       #should we turn? 
       if(turn): 
        #pick a random event 
        #where to turn? 
        event = random.choice([left,right]) 
        #perform an action 
        action_n = [event for ob in observation_n] 
        #set turn to false 
        turn = False 

     elif(~turn): 
      #if no turn is needed, go straight 
      action_n = [forward for ob in observation_n] 


     #if there is an obseravtion, game has started, check if turn needed 
     if(observation_n[0] != None): 
      turn, j, total_sum, prev_total_sum = determine_turn(turn, observation_n[0], j, total_sum, prev_total_sum, reward_n[0]) 

     #save new variables for each iteration 
     observation_n, reward_n, done_n, info = env.step(action_n) 

     env.render() 

if __name__ == '__main__': 
    main() 

,我沒有得到這個錯誤的事情是,所有的變量都在main(),或在確定該走哪從main()獲取變量。到目前爲止,這是我對這個錯誤的知識,並會非常感謝你的幫助。

+0

不要在採取行動前期待獎勵。 – YvesgereY

回答

1
#if at least one iteration is made, check if turn is needed 
    if(n > 1): 
     #if at least one iteration, check if a turn 
     if(observation_n[0] != None): 
      #store the reward in the previous score 
      prev_score = reward_n[0] 

在你的代碼試圖訪問reward_n[0],但這個變量還沒有被初始化,所以這部分,你需要之前初始化它首先調用相同的方式,你這個做:

observation_n = env.reset() 
相關問題