2017-03-09

I am training a CNTK model taken directly from their Language Understanding tutorial, but CNTK training slows down after every epoch.

Sequential([
    Embedding(emb_dim),
    OneWordWindow(),
    BatchNormalization(),
    BiRecurrence(LSTM(hidden_dim), LSTM(hidden_dim)),
    BatchNormalization(),
    Dense(num_labels)
])
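For context, BiRecurrence is not a built-in CNTK layer; the tutorial composes it from two Recurrence layers running in opposite directions over the sequence. A minimal sketch along those lines (the exact function names shifted a bit across the early 2.0 releases, so treat this as an approximation rather than the tutorial's exact code):

from cntk.layers import Recurrence
from cntk.ops import splice, placeholder

def BiRecurrence(fwd, bwd):
    # One recurrence over the sequence in each direction, with the two
    # hidden states spliced together along the feature axis.
    F = Recurrence(fwd)
    G = Recurrence(bwd, go_backwards=True)
    x = placeholder()
    return splice(F(x), G(x))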

Training appears to get slower after each epoch (see the log below). Is this caused by the learning-rate schedule, or am I missing something here?

The LR schedule, for Adam, is:

lr_per_sample = [0.003]*4+[0.0015]*24+[0.0003] 
lr_per_minibatch = [x * minibatch_size for x in lr_per_sample] 
lr_schedule = learning_rate_schedule(lr_per_minibatch, UnitType.minibatch, epoch_size) 
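Because epoch_size is passed to learning_rate_schedule, each entry in the list applies for one epoch, and the last entry then repeats indefinitely. A quick plain-Python sketch of which per-sample rate is in effect when, just to illustrate the indexing:

lr_per_sample = [0.003]*4 + [0.0015]*24 + [0.0003]
for epoch in range(30):
    lr = lr_per_sample[min(epoch, len(lr_per_sample) - 1)]
    print("Epoch %d: lr_per_sample = %g" % (epoch + 1, lr))
# Epochs 1-4 run at 0.003, epochs 5-28 at 0.0015, and everything after at
# 0.0003. The rate only ever decreases, and a smaller learning rate does not
# by itself reduce wall-clock samples per second.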



Finished Epoch[1 of 1000]: [Training] loss = 0.149485 * 18059, metric = 3.46% * 18059 10.189s (1772.3 samples per second); 

Finished Epoch[2 of 1000]: [Training] loss = 0.071990 * 17974, metric = 1.47% * 17974 51.836s (346.7 samples per second); 

Finished Epoch[3 of 1000]: [Training] loss = 0.106882 * 17992, metric = 2.08% * 17992 60.175s (299.0 samples per second); 

Finished Epoch[4 of 1000]: [Training] loss = 0.074046 * 17987, metric = 1.51% * 17987 68.655s (262.0 samples per second); 

Finished Epoch[5 of 1000]: [Training] loss = 0.052539 * 17995, metric = 1.28% * 17995 77.627s (231.8 samples per second); 

Finished Epoch[6 of 1000]: [Training] loss = 0.057482 * 18011, metric = 1.55% * 18011 86.191s (209.0 samples per second); 

Answer


A bug was found in ProgressPrinter that affects the printed samples-per-second figure. The actual speed is not affected, only the reported speed. The bug is fixed in master, so you can pick it up now, or you can wait for the next official release, which goes out on March 14, 2017.
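If you want to confirm that the real throughput is unchanged rather than relying on the printed figure, you can time the epochs yourself. A minimal sketch, assuming a trainer, reader, input_map, minibatch_size, epoch_size and num_epochs already set up as in the tutorial:

import time

for epoch in range(num_epochs):
    start, seen = time.time(), 0
    while seen < epoch_size:
        # Pull one minibatch from the reader and train on it.
        data = reader.next_minibatch(minibatch_size, input_map=input_map)
        trainer.train_minibatch(data)
        seen += trainer.previous_minibatch_sample_count
    elapsed = time.time() - start
    print("Epoch %d: %.1f samples/sec (measured)" % (epoch + 1, seen / elapsed))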


Thanks. The epochs did not actually seem to be slowing down, but I was not sure what the numbers meant. – budha