Increase in user time

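I am running the code below.

When I run this code with 1 child process, I get the following timing information:

(I run it using /usr/bin/time ./job 1)

5.489u 0.090s 0:05.58 99.8% (1 job running)

When I run it with 6 child processes, I get the following:

74.731u 0.692s 0:12.59 599.0% (6 jobs running in parallel)

The machine I am running the experiment on has 6 cores and 198 GB of RAM, and nothing else is running on it.

I expected the user time reported for the 6 parallel jobs to be 6 times that of a single job. But it is much more than that (13.6 times). My question is: where does this increase in user time come from? Is it because, with 6 jobs running in parallel, the cores jump from one memory location to another more often? Or is there something else I am missing?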
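
To make the comparison concrete, here is a minimal sketch of the arithmetic behind the 13.6 figure. The function names are hypothetical and introduced only for illustration; the inputs are the user times reported by /usr/bin/time above.

```c
#include <assert.h>

/* Ratio of the total user time of the parallel run to the user time of
 * the single-job run. Hypothetical helper, not part of the benchmark. */
double user_time_ratio(double parallel_user_s, double single_user_s)
{
    assert(single_user_s > 0.0);
    return parallel_user_s / single_user_s;
}

/* Average factor by which each job's user time grew compared with
 * running alone: the ratio above divided by the number of jobs. */
double per_job_inflation(double parallel_user_s, double single_user_s, int jobs)
{
    assert(jobs > 0);
    return user_time_ratio(parallel_user_s, single_user_s) / jobs;
}
```

With the measurements above, user_time_ratio(74.731, 5.489) is about 13.6 and per_job_inflation(74.731, 5.489, 6) is about 2.27, so on average each job consumed a little over twice the user time it needed when running alone.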

Thanks

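#include <stdio.h>      /* printf */
#include <stdlib.h>     /* calloc, atoi, exit */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* wait */
#include <unistd.h>     /* fork */

#define MAX_SIZE 7000000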
#define LOOP_COUNTER 100 

#define simple_struct struct _simple_struct 
simple_struct { 
    int n; 
    simple_struct *next; 
}; 

#define ALLOCATION_SPLIT 5 
#define CHAIN_LENGTH 1 
void do_function3(void) 
{ 
    int i = 0, j = 0, k = 0, l = 0; 
    simple_struct **big_array = NULL; 
    simple_struct *temp = NULL; 

    big_array = calloc(MAX_SIZE + 1, sizeof(simple_struct*)); 


    for(k = 0; k < ALLOCATION_SPLIT; k ++) { 
     for(i =k ; i < MAX_SIZE; i +=ALLOCATION_SPLIT) { 
      big_array[i] = calloc(1, sizeof(simple_struct)); 
      if((CHAIN_LENGTH-1)) { 
       for(l = 1; l < CHAIN_LENGTH; l++) { 
        temp = calloc(1, sizeof(simple_struct)); 
        temp->next = big_array[i]; 
        big_array[i] = temp; 
       } 
      } 
     } 
    } 

    for (j = 0; j < LOOP_COUNTER; j++) { 
     for(i=0 ; i < MAX_SIZE; i++) { 
      if(big_array[i] == NULL) { 
       big_array[i] = calloc(1, sizeof(simple_struct)); 
      } 
      big_array[i]->n = i * 13; 
      temp = big_array[i]->next; 
      while(temp) { 
       temp->n = i*13; 
       temp = temp->next; 
      } 
     } 
    } 
} 

int main(int argc, char **argv) 
{ 
    int i, no_of_processes = 0; 
    pid_t pid, wpid; 
    int child_done = 0; 
    int status; 
    if(argc != 2) { 
     printf("usage: this_binary number_of_processes\n");
     return 0; 
    } 

    no_of_processes = atoi(argv[1]); 

    for(i = 0; i < no_of_processes; i ++) { 
     pid = fork(); 

     switch(pid) { 
      case -1: 
       printf("error forking"); 
       exit(-1); 
      case 0: 
       do_function3(); 
       return 0; 
      default: 
       printf("\nchild %d launched with pid %d\n", i, pid); 
       break; 
     } 
    } 

    while(child_done != no_of_processes) { 
     wpid = wait(&status); 
     child_done++; 
     printf("\nchild done with pid %d\n", wpid); 
    } 

    return 0; 
}

Source

2016-03-07 sourav mahmood

Do the CPUs share memory bandwidth? – user3528438

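First of all, your benchmark is a bit unusual. Normally, when benchmarking concurrent applications, one compares two implementations:

a single-threaded version solving a problem of size S;
a multithreaded version with N threads, cooperatively solving a problem of size S; in your case, each thread would solve a problem of size S/N.

Then you divide the execution times to obtain the speedup.

If your speedup is:

around 1: the parallel implementation performs about the same as the single-threaded implementation;
higher than 1 (usually between 1 and N): parallelizing the application improves performance;
lower than 1: parallelizing the application hurts performance.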
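
As a rough illustration of the speedup calculation and the three cases above, here is a minimal sketch. The function names and the tolerance band around 1 are assumptions made for this example, not part of any library.

```c
#include <assert.h>

/* Speedup = single-threaded execution time / parallel execution time. */
double speedup(double t_serial_s, double t_parallel_s)
{
    assert(t_parallel_s > 0.0);
    return t_serial_s / t_parallel_s;
}

/* Classify a speedup value:
 *   +1  parallelization improves performance (clearly above 1)
 *    0  roughly the same as the single-threaded version
 *   -1  parallelization hurts performance (clearly below 1)
 * "tolerance" is an assumed width for the "around 1" band. */
int classify_speedup(double s, double tolerance)
{
    if (s > 1.0 + tolerance)
        return 1;
    if (s < 1.0 - tolerance)
        return -1;
    return 0;
}
```

For example, with made-up timings of 10 s single-threaded and 4 s on four threads, speedup(10.0, 4.0) is 2.5, and classify_speedup(2.5, 0.05) returns 1.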

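The impact on performance depends on several factors:

How well your algorithm can be parallelized. See Amdahl's law. Does not apply here.
Overhead of inter-thread communication. Does not apply here.
Overhead of inter-thread synchronization. Does not apply here.
Contention for CPU resources. Should not apply here (since the number of threads equals the number of cores). However, HyperThreading might hurt.
Contention for memory caches. Since the threads do not share memory, this will decrease performance.
Contention for access to main memory. This will decrease performance.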
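
To see why Amdahl's law is not the limiting factor here, it may help to write it out. The sketch below assumes the usual form of the law, S(N) = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel; the function name is hypothetical.

```c
#include <assert.h>

/* Upper bound on speedup predicted by Amdahl's law for a workload whose
 * parallelizable fraction is parallel_fraction, run on n_workers workers. */
double amdahl_speedup(double parallel_fraction, int n_workers)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(n_workers > 0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers);
}
```

Independent processes with no serial part correspond to p = 1, so amdahl_speedup(1.0, 6) gives a bound of 6, that is, no loss from the algorithm itself. Any slowdown observed in such a benchmark therefore has to come from the other factors in the list, such as contention for caches and main memory.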

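You can use a profiler to measure the last two factors (cache contention and main-memory contention). Look for cache misses and stalled instructions.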
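
This is not a profiler, but a cheap first check is to read each child's own user time rather than only the total. The sketch below assumes a Linux or BSD system where wait4() is available; child_user_seconds and spin are hypothetical names, and spin is only a stand-in workload.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc, to expose wait4() */
#include <assert.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one child, run work() in it, and return the user CPU time in
 * seconds that this child consumed, or -1.0 if fork/wait4 failed. */
double child_user_seconds(void (*work)(void))
{
    struct rusage usage;
    int status;
    pid_t pid;

    assert(work != NULL);
    pid = fork();
    if (pid == -1)
        return -1.0;
    if (pid == 0) {
        work();
        _exit(0);          /* leave the child without flushing parent buffers */
    }
    /* wait4() fills in the resource usage of exactly this child. */
    if (wait4(pid, &status, 0, &usage) == -1)
        return -1.0;
    return (double)usage.ru_utime.tv_sec
         + (double)usage.ru_utime.tv_usec / 1e6;
}

/* Hypothetical stand-in workload: a short CPU-bound loop. */
void spin(void)
{
    volatile unsigned long x = 0;
    unsigned long i;
    for (i = 0; i < 50000000UL; i++)
        x += i;
}
```

Calling child_user_seconds(spin) runs a single child at a time. To compare against the parallel case, the same wait4() call could replace wait() in a loop that has already forked all the children; if each child then reports noticeably more user time than it does alone, the extra time is being spent inside the children, which is consistent with contention rather than with scheduling.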

Source

2016-03-07 22:22:03 o9000
