如何使用Keras TensorBoard回調進行網格搜索

我正在使用Keras TensorBoard回調。我想運行網格搜索和可視化張量板中每個單一模型的結果。的問題是，不同運行的所有結果都合併在一起，損失的情節是這樣的一個爛攤子：如何使用Keras TensorBoard回調進行網格搜索

如何重命名每次運行有類似這樣的東西：

這裏網格搜索代碼：

df = pd.read_csv('data/prepared_example.csv') 

df = time_series.create_index(df, datetime_index='DATE', other_index_list=['ITEM', 'AREA']) 

target = ['D'] 
attributes = ['S', 'C', 'D-10','D-9', 'D-8', 'D-7', 'D-6', 'D-5', 'D-4', 
     'D-3', 'D-2', 'D-1'] 

input_dim = len(attributes) 
output_dim = len(target) 

x = df[attributes] 
y = df[target] 

param_grid = {'epochs': [10, 20, 50], 
       'batch_size': [10], 
       'neurons': [[10, 10, 10]], 
       'dropout': [[0.0, 0.0], [0.2, 0.2]], 
       'lr': [0.1]} 

estimator = KerasRegressor(build_fn=create_3_layers_model, 
          input_dim=input_dim, output_dim=output_dim) 


tbCallBack = TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=False) 

grid = GridSearchCV(estimator=estimator, param_grid=param_grid, n_jobs=-1, scoring=bug_fix_score, 
          cv=3, verbose=0, fit_params={'callbacks': [tbCallBack]}) 

grid_result = grid.fit(x.as_matrix(), y.as_matrix())

來源

2017-08-02 paolof89

我不認爲有任何的方式來傳遞「每跑」參數GridSearchCV。也許最簡單的方法是將子類KerasRegressor做你想做的事。

class KerasRegressorTB(KerasRegressor): 

    def __init__(self, *args, **kwargs): 
     super(KerasRegressorTB, self).__init__(*args, **kwargs) 

    def fit(self, x, y, log_dir=None, **kwargs): 
     cbs = None 
     if log_dir is not None: 
      params = self.get_params() 
      conf = ",".join("{}={}".format(k, params[k]) 
          for k in sorted(params)) 
      conf_dir = os.path.join(log_dir, conf) 
      cbs = [TensorBoard(log_dir=conf_dir, histogram_freq=0, 
           write_graph=True, write_images=False)] 
     super(KerasRegressorTB, self).fit(x, y, callbacks=cbs, **kwargs)

你會用它喜歡：

# ... 

estimator = KerasRegressorTB(build_fn=create_3_layers_model, 
          input_dim=input_dim, output_dim=output_dim) 

#... 

grid = GridSearchCV(estimator=estimator, param_grid=param_grid, 
n_jobs=1, scoring=bug_fix_score, 
        cv=2, verbose=0, fit_params={'log_dir': './Graph'}) 

grid_result = grid.fit(x.as_matrix(), y.as_matrix())

更新：

由於GridSearchCV運行相同的模型（即參數相同的配置）不止一次由於交叉驗證，之前的代碼最終會在每次運行中放置多個跟蹤。查看源代碼（here和here），似乎沒有辦法檢索「當前拆分ID」。同時，您不應該只檢查現有文件夾並根據需要添加子項，因爲這些作業並行運行（至少可能，但我不確定Keras/TF是否如此）。你可以嘗試這樣的事情：

import itertools 
import os 

class KerasRegressorTB(KerasRegressor): 

    def __init__(self, *args, **kwargs): 
     super(KerasRegressorTB, self).__init__(*args, **kwargs) 

    def fit(self, x, y, log_dir=None, **kwargs): 
     cbs = None 
     if log_dir is not None: 
      # Make sure the base log directory exists 
      try: 
       os.makedirs(log_dir) 
      except OSError: 
       pass 
      params = self.get_params() 
      conf = ",".join("{}={}".format(k, params[k]) 
          for k in sorted(params)) 
      conf_dir_base = os.path.join(log_dir, conf) 
      # Find a new directory to place the logs 
      for i in itertools.count(): 
       try: 
        conf_dir = "{}_split-{}".format(conf_dir_base, i) 
        os.makedirs(conf_dir) 
        break 
       except OSError: 
        pass 
      cbs = [TensorBoard(log_dir=conf_dir, histogram_freq=0, 
           write_graph=True, write_images=False)] 
     super(KerasRegressorTB, self).fit(x, y, callbacks=cbs, **kwargs)

我使用的Python 2兼容性os呼叫，但如果你正在使用Python 3，你可以考慮爲路徑和目錄處理的更好pathlib module。

注：我忘了前面提到它，但爲了以防萬一，注意，通過write_graph=True將記錄曲線圖每次運行，這取決於你的模型，可能意味着很多這樣的空間（相對而言）。這同樣適用於write_images，但我不知道該功能需要的空間。

來源

2017-08-02 09:01:17 jdehesa

感謝您的詳細建議。今天晚些時候我會試一試，我會告訴你。只需要考慮一下：此解決方案是否創建了多個文件夾？在那種情況下，我能夠在單個張量板上顯示所有運行，還是需要運行它的多個實例？ – paolof89

@ paolof89是的，它確實爲每個實驗創建一個目錄，但事實上，您在TensorBoard中看到的「運行」實際上只是具有日誌信息的子文件夾。如果您在日誌的根目錄中打開TensorBoard（在「。/ Graph」示例中），則每個實驗都會看到一個「運行」，所有這些都會在一起，或者您可以在特定運行的目錄中打開TensorBoard，仔細看看。 – jdehesa

我測試過了，它可以工作，但最後還有一個問題。 GridSearchCV實現了k-fold tecnique，因此在每個文件夾中都可以找到k圖。最小k倍值是2，所以我的問題還沒有解決。有關它的任何想法？ – paolof89

如何使用Keras TensorBoard回調進行網格搜索

回答

相關問題