I am training a seq2seq model with the default settings of the seq2seq tutorial, on a custom parallel corpus of roughly one million sentence pairs. Below is the output log from the tutorial; training has already passed 350k steps. I notice that the per-bucket eval perplexities have suddenly increased enormously, while the overall training perplexity has stayed at 1.02 for a long time. The learning rate was initialized to 0.5 but now shows around 0.007, so it has also dropped significantly. The system's output is not satisfactory. How can I tell whether the model has reached an epoch point in the seq2seq training, and whether I should stop and reconfigure settings such as parameter tuning and optimizer improvements?
global step 372800 learning rate 0.0071 step-time 1.71 perplexity 1.02
  eval: bucket 0 perplexity 91819.49
  eval: bucket 1 perplexity 21392511.38
  eval: bucket 2 perplexity 16595488.15
  eval: bucket 3 perplexity 7632624.78
global step 373000 learning rate 0.0071 step-time 1.73 perplexity 1.02
  eval: bucket 0 perplexity 140295.51
  eval: bucket 1 perplexity 13456390.43
  eval: bucket 2 perplexity 7234450.24
  eval: bucket 3 perplexity 3700941.57
global step 373200 learning rate 0.0071 step-time 1.69 perplexity 1.02
  eval: bucket 0 perplexity 42996.45
  eval: bucket 1 perplexity 37690535.99
  eval: bucket 2 perplexity 12128765.09
  eval: bucket 3 perplexity 5631090.67
global step 373400 learning rate 0.0071 step-time 1.82 perplexity 1.02
  eval: bucket 0 perplexity 119885.35
  eval: bucket 1 perplexity 11166383.51
  eval: bucket 2 perplexity 27781188.86
  eval: bucket 3 perplexity 3885654.40
global step 373600 learning rate 0.0071 step-time 1.69 perplexity 1.02
  eval: bucket 0 perplexity 215824.91
  eval: bucket 1 perplexity 12709769.99
  eval: bucket 2 perplexity 6865776.55
  eval: bucket 3 perplexity 5932146.75
global step 373800 learning rate 0.0071 step-time 1.78 perplexity 1.02
  eval: bucket 0 perplexity 400927.92
  eval: bucket 1 perplexity 13383517.28
  eval: bucket 2 perplexity 19885776.58
  eval: bucket 3 perplexity 7053727.87
global step 374000 learning rate 0.0071 step-time 1.85 perplexity 1.02
  eval: bucket 0 perplexity 46706.22
  eval: bucket 1 perplexity 35772455.34
  eval: bucket 2 perplexity 8198331.56
  eval: bucket 3 perplexity 7518406.42
global step 374200 learning rate 0.0070 step-time 1.98 perplexity 1.03
  eval: bucket 0 perplexity 73865.49
  eval: bucket 1 perplexity 22784461.66
  eval: bucket 2 perplexity 6340268.76
  eval: bucket 3 perplexity 4086899.28
global step 374400 learning rate 0.0069 step-time 1.89 perplexity 1.02
  eval: bucket 0 perplexity 270132.56
  eval: bucket 1 perplexity 17088126.51
  eval: bucket 2 perplexity 15129051.30
  eval: bucket 3 perplexity 4505976.67
global step 374600 learning rate 0.0069 step-time 1.92 perplexity 1.02
  eval: bucket 0 perplexity 137268.32
  eval: bucket 1 perplexity 21451921.25
  eval: bucket 2 perplexity 13817998.56
  eval: bucket 3 perplexity 4826017.20
When will it stop?
Multiply the global step by the batch size and divide by the number of training examples; that gives your current epoch (see the sketch below). –
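A minimal sketch of that calculation in Python, assuming the tutorial's default batch size of 64 and the ~1M sentence pairs mentioned in the question (both values are assumptions and should be replaced with your actual settings):

# Rough epoch estimate: how many full passes over the training data
# the model has made so far. All values below are assumptions taken
# from the question, not read from the model.
global_step = 374600            # from the latest log line above
batch_size = 64                 # tutorial's default batch size (assumed)
num_train_examples = 1_000_000  # ~1M sentence pairs (assumed)

epochs_completed = global_step * batch_size / num_train_examples
print("Approximately %.1f epochs completed" % epochs_completed)
# -> roughly 24 epochs with these assumed numbers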
When will it stop? – stackit
Not familiar with seq2seq, but in general training ends when you reach the end of your training loop, or when your input pipeline runs out of examples (by default, it never runs out). –
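If you want a concrete stopping criterion rather than waiting for the loop to end, one possibility (not part of the seq2seq tutorial itself, just a sketch with hypothetical names) is to track the eval-bucket perplexities printed in the log and stop once they have not improved for several consecutive evaluations:

# Hypothetical early-stopping helper; all names here are assumptions.
# Stop training once the average eval perplexity has not improved
# for `patience` consecutive evaluations.
def should_stop(eval_history, patience=5):
    """eval_history: list of average eval perplexities, oldest first."""
    if len(eval_history) <= patience:
        return False
    best_so_far = min(eval_history[:-patience])
    recent = eval_history[-patience:]
    # Stop if none of the last `patience` evaluations beat the earlier best.
    return all(p >= best_so_far for p in recent)

# Usage: after each evaluation, append the mean of the four bucket
# perplexities to eval_history and break out of the training loop
# when should_stop(eval_history) returns True.

Given the log above, where training perplexity is pinned at 1.02 while eval perplexities are in the millions, such a check would have triggered long ago; that gap usually points to overfitting or a data/preprocessing mismatch rather than simply needing more steps.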