Splitting and passing array chunks in MPI

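I am very new to MPI and am trying to get a feel for it by writing a simple C program. All I want to do is split an array and send the chunks to N processors, so that each processor finds the local minimum of its chunk. The program (on the root or somewhere else) then finds the global minimum.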

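I have looked at the MPI_Send, MPI_Isend and MPI_Bcast functions, but I am a bit confused about when to use one rather than another. I need some tips on the general structure of my program.

Here is my code: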

#include <stdio.h> 
#include <stdlib.h> 
#include <mpi.h> 

#define N 9 // array size 

int A[N] = {0,2,1,5,4,3,7,6,8}; // this is a dummy array 

int main(int argc, char *argv[]) { 

    int i, k = 0, size, rank, source = 0, dest = 1, count; 
    int tag = 1234; 

    MPI_Init(&argc, &argv); 

    MPI_Comm_size(MPI_COMM_WORLD, &size); 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    count = N/(size-1); // think size = 4 for this example 

    int *tempArray = malloc(count * sizeof(int)); 
    int *localMins = malloc((size-1) * sizeof(int)); 

    if (rank == 0) { 

     for(i=0; i<size; i+=count) 
     { 
      // Is it better to use MPI_Isend or MPI_Bcast here? 
      MPI_Send(&A[i], count, MPI_INT, dest, tag, MPI_COMM_WORLD); 
      printf("P0 sent a %d elements to P%d.\n", count, dest); 
      dest++; 
     } 
    } 
    else { 

     for(i=0; i<size; i+=count) 
     {  
      MPI_Recv(tempArray, count, MPI_INT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
      localMins[k] = findMin(tempArray, count); 
      printf("Min for P%d is %d.\n", rank, localMins[k]); 
      k++;    
     } 
    } 

    MPI_Finalize(); 

    int gMin = findMin(localMins, (size-1)); // where should I assign this 
    printf("Global min: %d\n", gMin); // and where should I print the results? 

    return 0; 
}

There are probably several errors; sorry that I cannot point to one specific problem here. Thanks for any suggestions.

Asked 2015-09-02 by samet

http://stackoverflow.com/questions/10017301/mpi-blocking-vs-non-blocking ?

You can use MPI_Scatter() instead of splitting the array and calling MPI_Send() for each process. Similarly, you can use MPI_Gather() to collect all the results.

Look at the MAX/MIN/MAXLOC/MINLOC reductions. – Jeff
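The two suggestions in the comments above (MPI_Scatter to hand out the chunks, an MPI_MIN reduction to combine the results) could be put together roughly as follows. This is an untested sketch rather than code from the thread. It assumes that N is divisible by the number of ranks and that the root keeps one chunk for itself; the names chunk, localMin and globalMin are chosen here for illustration.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 9 /* array size, as in the question */

int main(int argc, char *argv[]) {
    /* Every rank has this array, but MPI_Scatter only reads the root's copy. */
    int A[N] = {0, 2, 1, 5, 4, 3, 7, 6, 8};
    const int ROOT = 0;
    int size, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Assumption of this sketch: every rank, root included, gets N/size elements. */
    if (N % size != 0) {
        if (rank == ROOT)
            fprintf(stderr, "N (%d) must be divisible by the number of ranks (%d)\n", N, size);
        MPI_Finalize();
        return EXIT_FAILURE;
    }

    int count = N / size;
    int *chunk = malloc(count * sizeof *chunk);
    if (chunk == NULL)
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);

    /* Collective call: every rank in the communicator must reach this line. */
    MPI_Scatter(A, count, MPI_INT, chunk, count, MPI_INT, ROOT, MPI_COMM_WORLD);

    /* Each rank scans only its own chunk. */
    int localMin = INT_MAX;
    for (int i = 0; i < count; ++i)
        if (chunk[i] < localMin)
            localMin = chunk[i];

    /* Combine the local minima; the result is only valid on ROOT. */
    int globalMin = INT_MAX;
    MPI_Reduce(&localMin, &globalMin, 1, MPI_INT, MPI_MIN, ROOT, MPI_COMM_WORLD);

    if (rank == ROOT)
        printf("Global min: %d\n", globalMin);

    free(chunk);
    MPI_Finalize();
    return 0;
}
```

Compiled with an MPI compiler wrapper (often mpicc) and started on 3 ranks, the root should print the global minimum once. The launcher name and its flags depend on the MPI installation.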

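Your code has several problems (as you already pointed out), and, as some commenters have mentioned, there are other ways to do what you are attempting with MPI calls.

However, I am going to rework your code without changing too much, so that I can show you what is going on.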

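#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 9 // array size
int A[N] = {0,2,1,5,4,3,7,6,8}; // this is a dummy array that should only be initialized on rank == ROOT

int findMin(const int *values, int n); // assumed helper, defined elsewhere: returns the smallest of n ints

int main(int argc, char *argv[]) {

    int size;
    int rank;
    const int VERY_LARGE_INT = INT_MAX; // can never be smaller than a real array element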
    const int ROOT = 0; // the master rank that holds A to begin with 
    int tag = 1234; 

    MPI_Init(&argc, &argv); 

    MPI_Comm_size(MPI_COMM_WORLD, &size); // think size = 4 for this example 
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); 

    /* 
     How many numbers you send from ROOT to each other rank. 
     Note that for this implementation to work, (size-1) must divide N. 
    */ 
    int count = N/(size-1); 

    int *localArray = (int *)malloc(count * sizeof(int)); 
    int localMin; // minimum computed on rank i 
    int globalMin; // will only be valid on rank == ROOT 

    /* rank == ROOT sends portion of A to every other rank */ 
    if (rank == ROOT) { 

     for(int dest = 1; dest < size; ++dest) 
     { 
      // If you are sending information from one rank to another, you use MPI_Send or MPI_Isend. 
      // If you are sending information from one rank to ALL others, then every rank must call MPI_Bcast (similar to MPI_Reduce below) 
      MPI_Send(&A[(dest-1)*count], count, MPI_INT, dest, tag, MPI_COMM_WORLD); 
      printf("P0 sent %d elements to P%d.\n", count, dest);
     } 
     localMin = VERY_LARGE_INT; // needed for MPI_Reduce below 
    } 

    /* Every other rank is receiving one message: from ROOT into local array */ 
    else { 
     MPI_Recv(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
     localMin = findMin(localArray, count); 
     printf("Min for P%d is %d.\n", rank, localMin); 
    } 

    /* 
     At this point, every rank in communicator has valid information stored in localMin. 
     Use MPI_Reduce in order to find the global min among all ranks. 
     Store this single globalMin on rank == ROOT. 
    */ 
    MPI_Reduce(&localMin, &globalMin, 1, MPI_INT, MPI_MIN, ROOT, MPI_COMM_WORLD); 

    if (rank == ROOT) 
     printf("Global min: %d\n", globalMin); 

    /* The last thing you do is Finalize MPI. Nothing should come after. */ 
    MPI_Finalize(); 
    return 0; 
}

Full disclosure: I have not tested this code, but apart from minor typos it should work.
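Both listings call findMin, which is not defined anywhere in the thread. A minimal sketch of such a helper, assuming it takes a pointer to the data and an element count and returns the smallest value, might look like this:

```c
#include <limits.h>

/* Hypothetical helper (not shown in the thread): returns the smallest of the
   n ints in values. For n == 0 it returns INT_MAX, which can never win an
   MPI_MIN reduction against a real element. */
int findMin(const int *values, int n) {
    int min = INT_MAX;
    for (int i = 0; i < n; ++i) {
        if (values[i] < min)
            min = values[i];
    }
    return min;
}
```

Because a C compiler needs to see a declaration before the first call, the definition (or a prototype for it) would have to appear above main.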

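Look over this code and see whether you can understand why I moved your MPI_Send and MPI_Recv calls around. To understand it, keep in mind that every rank reads every line of the code you give it. That is why there should be no for loop around the receive inside your else statement: each non-root rank receives exactly one message.

Furthermore, MPI collectives (such as MPI_Reduce and MPI_Bcast) must be called by every rank in the communicator. The "source" and "destination" ranks of these calls are either part of the function's input parameters or implied by the collective itself.

Finally, a bit of homework for you: can you see why this is NOT a good implementation for finding the global minimum of array A? Hint: what does rank == ROOT do after it has finished its MPI_Send calls? How would you split up the problem so that every rank does a more even share of the work?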

Answered 2015-09-02 21:03:00 by NoseKnowsAll

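Thanks for the clear answer. It really helped me understand the logic. The only error in your code is that tempArray should be localArray in the else statement (you forgot to update the variable name). Everything works fine. Thanks for the homework too, I will work on it :) – samet

I think I got the point of the homework :) rank == ROOT does nothing! So it would be better to use ROOT more efficiently. Maybe MPI_Scatter is a better fit in this case. – samet

Correct! Another implementation would look very similar to the existing one but with better load balancing: split the problem into size chunks instead of size-1 chunks. Then rank == ROOT has a small chunk of its own to work on after sending, just as the other ranks work on their chunks after receiving. – NoseKnowsAll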
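To make the idea in the last comment concrete, here is a small sketch of how N elements could be divided among all size ranks, root included, even when size does not divide N. The function name splitEvenly and its parameters are hypothetical, not part of MPI. The two arrays it fills use the layout that MPI_Scatterv takes as its sendcounts and displs arguments.

```c
/* Hypothetical helper: split n elements over `size` ranks as evenly as possible.
   counts[r] receives the number of elements for rank r and displs[r] the index
   of that rank's first element. The first n % size ranks get one extra element. */
void splitEvenly(int n, int size, int *counts, int *displs) {
    int base = n / size;   /* elements every rank gets */
    int extra = n % size;  /* leftover elements to spread over the first ranks */
    int offset = 0;
    for (int r = 0; r < size; ++r) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

For the question's N = 9 on 4 ranks this gives chunk sizes 3, 2, 2, 2 starting at indices 0, 3, 5, 7, so the root works on a chunk like everyone else and no rank sits idle.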
