
I am trying to understand the LSTM in Deeplearning4j. I am going through the example's source code, but I cannot understand this part (Deeplearning4j LSTM example):

    //Allocate space:
    //Note the order here:
    // dimension 0 = number of examples in minibatch
    // dimension 1 = size of each vector (i.e., number of characters)
    // dimension 2 = length of each time series/example
    INDArray input = Nd4j.zeros(currMinibatchSize,validCharacters.length,exampleLength);
    INDArray labels = Nd4j.zeros(currMinibatchSize,validCharacters.length,exampleLength);

Why do we store a 3D array here, and what does it mean?


What is the name of the example file you took this code from? –


https://github.com/deeplearning4j/dl4j-0.4-examples/blob/master/src/main/java/org/deeplearning4j/examples/recurrent/character/CharacterIterator.java Have a look at the next() method. – Nueral


Nueral - please join the Deeplearning4j community on Gitter and they will answer your question: https://gitter.im/deeplearning4j/deeplearning4j – tremstat

Answer


Good question. But this really has nothing to do with how the LSTM operates; it has to do with the task itself. The task is to predict what the next character will be. Predicting the next character has two sides to it: classification and approximation. If we were only dealing with approximation, a one-dimensional array would do. But because we are dealing with both approximation and classification, we cannot simply normalize each character to its ASCII value and feed that to the network. We need to turn each character into an array (a one-hot vector).

For example, 'a' (lowercase, not capital) would be represented this way:

1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

'b' (lowercase) would be represented as:

0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

'c' would be represented as:

0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

and 'Z' (capital Z!) as:

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
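To make that concrete, here is a minimal sketch (not the code from the example) of how such a one-hot vector could be built with ND4J. The names VALID_CHARACTERS and oneHot are made up for illustration, and the real example's validCharacters array covers a larger character set than just the letters used here:

    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    public class OneHotSketch {
        // Hypothetical alphabet for illustration only; the real example's
        // validCharacters array also contains digits and punctuation.
        static final char[] VALID_CHARACTERS =
                "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();

        // One-hot encode a single character: a vector of zeros with a single
        // 1.0 at the position of that character in the alphabet.
        static INDArray oneHot(char c) {
            INDArray vec = Nd4j.zeros(VALID_CHARACTERS.length);
            for (int i = 0; i < VALID_CHARACTERS.length; i++) {
                if (VALID_CHARACTERS[i] == c) {
                    vec.putScalar(i, 1.0);
                    break;
                }
            }
            return vec;
        }

        public static void main(String[] args) {
            System.out.println(oneHot('a')); // 1.0 at index 0
            System.out.println(oneHot('b')); // 1.0 at index 1
            System.out.println(oneHot('Z')); // 1.0 at the last index
        }
    }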

So each character becomes a vector, a whole example (a sequence of such vectors over time) becomes a two-dimensional array, and a minibatch of examples becomes the three-dimensional array you asked about. How are all these dimensions laid out? The comment in the code explains it:

    // dimension 0 = number of examples in minibatch
    // dimension 1 = size of each vector (i.e., number of characters)
    // dimension 2 = length of each time series/example
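
To show how the three dimensions come together, here is a hedged sketch loosely modeled on what CharacterIterator.next() does: for each example in the minibatch and each time step, a single 1.0 is written along dimension 1 for the current character (input) and for the following character (labels). The buildMinibatch method and the charIndices lookup are hypothetical; the real iterator reads characters straight from the text file and handles the last time step of each example slightly differently:

    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    public class MinibatchSketch {
        // charIndices[example][t] holds the alphabet index of the character at
        // time step t of that example (a hypothetical, pre-computed lookup).
        static INDArray[] buildMinibatch(int[][] charIndices, int numValidChars) {
            int miniBatchSize = charIndices.length;      // dimension 0
            int exampleLength = charIndices[0].length;   // dimension 2
            INDArray input  = Nd4j.zeros(miniBatchSize, numValidChars, exampleLength);
            INDArray labels = Nd4j.zeros(miniBatchSize, numValidChars, exampleLength);

            for (int ex = 0; ex < miniBatchSize; ex++) {
                for (int t = 0; t < exampleLength - 1; t++) {
                    int current = charIndices[ex][t];     // character seen at time t
                    int next    = charIndices[ex][t + 1]; // character to predict at time t
                    // a single 1.0 along dimension 1 (the one-hot dimension)
                    input.putScalar(new int[]{ex, current, t}, 1.0);
                    labels.putScalar(new int[]{ex, next, t}, 1.0);
                }
            }
            return new INDArray[]{input, labels};
        }
    }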

I sincerely commend your effort to understand how an LSTM works, but the code you pointed to applies to all kinds of neural networks: it shows how to prepare text data for a neural network, not how an LSTM works. For that you need to look at a different part of the source code.


@Nueral does that make sense? –