Hadoop: Implementing a nested for loop in MapReduce [Java]

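I want to implement a statistical formula that requires comparing each data point with every other possible data point. For example, my data set looks something like this: [image of the sample data set, not preserved]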

I need to go through this file like this:

double sum = 0;
for (int i = 0; i < data.length; i++) {
    for (int j = 0; j < data.length; j++) {
        sum += data[i] + data[j];
    }
}

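Basically, as my map function receives each line, I need to run some instructions in the reducer against the rest of the file, just like a nested for loop. I have already tried the distributed cache and some form of ChainMapper, but to no avail. Any ideas on how I could do this would be greatly appreciated. Even an out-of-the-box approach would help.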

Source

2014-04-30 user3587335

Could you elaborate on your example? Please add a few lines and then walk through the example with one data point. – Sudarshan

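As a simple example, say 10.22 is the first point and 15.77 is the second. So i = 0 (10.22) and j = 0 (10.22), then j = 1 (15.77), then j = 2 (16.55), then j = 3 (9.88). So for each point in the data set, the loop runs through all the remaining points in the data set. – user3587335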

So for each line in the file you need to iterate over the whole file. Have I understood the problem correctly? – Sudarshan
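The walk-through in the comments above can be sketched in plain Java, using the four sample values mentioned there. The class and method names are made up for illustration. The second method is an observation about this particular sum only: because each value appears n times as data[i] and n times as data[j], the double loop collapses to 2 * n * (sum of all values), which needs just a count and a total. Other pairwise formulas may not decompose this way.

```java
public class PairwiseSumSketch {

    // Direct translation of the nested loop from the question: O(n^2) additions.
    static double nestedLoopSum(double[] data) {
        double sum = 0.0;
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data.length; j++) {
                sum += data[i] + data[j];
            }
        }
        return sum;
    }

    // Same quantity from a single pass: each value is added n times in the
    // data[i] position and n times in the data[j] position.
    static double singlePassSum(double[] data) {
        double total = 0.0;
        for (double x : data) {
            total += x;
        }
        return 2.0 * data.length * total;
    }

    public static void main(String[] args) {
        // The four sample values mentioned in the comments.
        double[] data = {10.22, 15.77, 16.55, 9.88};
        System.out.println("nested loop: " + nestedLoopSum(data));
        System.out.println("single pass: " + singlePassSum(data));
    }
}
```

If the real formula does reduce to totals and counts like this, a single pass over the file would be enough and no nested iteration over the data set is needed.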

You need to override the run method of the Reducer class.

@Override
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKey()) {
        // Key and values for index i of the outer loop (assumes the input key type is Text).
        // Hadoop reuses these objects, so copy anything you need before advancing.
        Text currentKey = context.getCurrentKey();
        Iterable<VALUEIN> currentValues = context.getValues();
        if (context.nextKey()) {
            // The context has advanced: its key and values now correspond to index j of the inner loop.
        }
    }
    cleanup(context);
}

Or, if you don't have a reducer, you can do the same thing in the mapper by overriding its run method:

@Override
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
        // Calling context.nextKeyValue() again here advances to the next key/value pair,
        // which is the element you are looking for in the second (inner) loop.
    }
    cleanup(context);
}
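As a rough illustration of how the outer loop can be spread across map calls while the inner loop runs over a full copy of the data set, here is a plain-Java sketch. It does not use the Hadoop API: mapRecord and reducePartials are hypothetical stand-ins for map() and reduce(), and the sketch assumes the whole data set fits in memory on every mapper (for example, loaded once in setup() from the distributed cache file the question mentions).

```java
import java.util.ArrayList;
import java.util.List;

public class NestedLoopMapReduceSketch {

    // Hypothetical stand-in for one map() call: the current record plays the
    // role of data[i], and the in-memory copy of the data set supplies every data[j].
    static double mapRecord(double current, double[] allPoints) {
        double partial = 0.0;
        for (double other : allPoints) {
            partial += current + other;
        }
        return partial;
    }

    // Hypothetical stand-in for reduce(): adds up the per-record partial sums.
    static double reducePartials(List<Double> partials) {
        double sum = 0.0;
        for (double p : partials) {
            sum += p;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumption: the full data set is small enough to hold in memory.
        double[] allPoints = {10.22, 15.77, 16.55, 9.88};

        // In a real job the framework would drive this loop, one record per map() call.
        List<Double> partials = new ArrayList<>();
        for (double record : allPoints) {
            partials.add(mapRecord(record, allPoints));
        }
        System.out.println(reducePartials(partials));
    }
}
```

The design choice here is that each map call is independent: it only needs its own record plus the shared read-only copy, so the outer loop parallelizes while the inner loop stays local.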

Let me know if this helps.

Source

2014-04-30 08:17:23

Thanks. I'll test it when I wake up in the morning. – user3587335
