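Multithreaded execution time and sum of random numbers

I am trying to write a multithreaded program that sums an array of N random numbers in [-100, 100] using K worker threads, serialized by a programmer-implemented spinlock (busy wait). Before trying random numbers, I initialised the whole array with 1s for testing purposes, as you can see in the code. Since I have no idea where the problem is, I'll post the full code: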

#include <iostream> 
#include <string.h> 
#include <pthread.h> 
#include <cstdlib> 
#include <time.h> 
#include <atomic> 
#include <chrono> 

using namespace std; 
using namespace chrono; 

struct lock { 

    long double sum = 0; 
    atomic_flag m_flag = ATOMIC_FLAG_INIT; // Initialised with m_flag = 0

    void acquire() { 
     while(m_flag.test_and_set()); 
    } 
    void release() { 
     m_flag.clear(); 
    } 
}; 

struct t_data{ 
    int t_id; 
    char* sumArray; 
    struct lock* spinlock; 
}; 

void* sum(void* thread_data) { 

    struct t_data *my_data; 
    long double m_sum=0; 
    my_data = (struct t_data *) thread_data; 

    for (int i=0;i<strlen(my_data->sumArray);i++) { 
     m_sum += my_data->sumArray[i]; 
    } 

    my_data->spinlock->acquire(); 
    cout << "THREAD ID: " << my_data->t_id << endl; 
    cout << "Acquired lock." << endl; 
    my_data->spinlock->sum += m_sum; 
    cout << "Releasing lock..." << endl << endl; 
    my_data->spinlock->release(); 

} 

int main(int argc, char** argv) { 

    // Initialise timer, arrays, spinlock, etc.
    system_clock::time_point starting_time = system_clock::now(); 
    int K = atoi(argv[1]); 
    int N = atoi(argv[2]); 
    int temp; 
    double expected_sum = 0; 
    pthread_t threads[K]; 
    struct t_data threads_data[K]; 
    struct lock spinlock; 
    const long int numElements = (long int) N/K; // Integer division N/K to split the array into chunks

    // Create an array[K] of arrays to hand each sub-list to one thread
    char** numArrays = new char*[K];
    for(int i=0;i<K;i++)
     numArrays[i] = new char[numElements]; // char is used so that only 1 byte is allocated per number

    // Initialise the random seed for filling the arrays
    srand(time(NULL));

    // Fill the arrays that will be passed to the created threads
    for (int i=0;i<K;i++) { 
     for(int j=0;j<numElements;j++) { 
      temp = 1;//rand() % 201 - 100; (CHANGING THIS GIVES UNEXPECTED RESULTS) 
      numArrays[i][j] = temp; 
      expected_sum+=temp; 
     } 
     // Create the threads, passing the arguments (id, spinlock, array)
     threads_data[i].t_id = i; 
     threads_data[i].spinlock = &spinlock; 
     threads_data[i].sumArray = numArrays[i]; 
     pthread_create(&threads[i],NULL,sum,(void*)&threads_data[i]); 
    } 

    // Block until all threads finish so that the correct sum is printed
    for (int i=0;i<K;i++){ 
     if(pthread_join(threads[i],NULL)) cout << "Error waiting for threads." << endl; 
    } 

    // Add the remaining values when N%K != 0 (this part becomes irrelevant as N >> K)
    for(int i=0;i<(int)N%K;i++) { 
     temp = 1;//rand() % 201 - 100; (CHANGING THIS GIVES UNEXPECTED RESULTS) 
     spinlock.sum+=temp; 
     expected_sum+=temp; 
    } 

    // Print the expected result, the computed result and the execution time
    cout << "EXPECTED SUM = " << expected_sum << endl; 
    cout << "CALCULATED SUM = " << spinlock.sum << endl; 

    // Free the allocated memory
    for(int i=0;i<K;i++) 
     delete[] numArrays[i]; 

    delete[] numArrays; 

    auto start_ms = time_point_cast<milliseconds>(starting_time); 
    auto now = system_clock::now(); 
    auto now_ms = time_point_cast<milliseconds>(now); 
    auto value = now_ms - start_ms; 
    long execution_time = value.count(); 
    cout << "-----------------------" << endl; 
    cout << "Execution time: " << execution_time << "ms" << endl; 
    return 0; 
}

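This works fine for computing the sum, but there is a problem with the execution time: it should scale linearly (N/K), yet when testing with K = 10, N = 10⁶: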

EXPECTED SUM = 1e+06 
CALCULATED SUM = 1e+06 
----------------------- 
Execution time: 1310ms

and with K = 10, N = 2·10⁶:

EXPECTED SUM = 2e+06 
CALCULATED SUM = 2e+06 
----------------------- 
Execution time: 7144ms

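I have no idea why this happens. The time should only double. Changing K works as expected. Also, if I use rand() % 201 - 100 instead of 1, things get really messed up. For K = 10, N = 10⁶: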

EXPECTED SUM = -16307 
CALCULATED SUM = 1695 
----------------------- 
Execution time: 95ms

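As for the execution time in this case, it scales linearly with N for fixed K, but changing K no longer makes any difference. None of this makes sense to me.

Thanks in advance!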

Source

2016-05-17 Gabriel Rebello

strlen(my_data->sumArray) will stop at the first 0 in the char array / C-string, while you keep adding the temp values to expected_sum. Use a vector for non-ASCII data (this is C++ after all):

// use a vector in t_data 
struct t_data{ 
    int t_id; 
    std::vector<char> sumArray; 
    lock* spinlock; 
}; 

// adjust summing up in sum(void* thread_data) 
for (char value : my_data->sumArray) { 
    m_sum += value; 
} 

// initialise like this 
threads_data[i].sumArray.resize(numElements); 
for(size_t j = 0; j < threads_data[i].sumArray.size(); ++j) { 
    char temp = 1; //or (char)(rand() % 201 - 100); 
    threads_data[i].sumArray[j] = temp; 
    expected_sum += temp; 
}
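
To make the strlen point concrete, here is a minimal sketch comparing the two ways of finding the length. The function names sum_with_strlen and sum_with_size are hypothetical and not part of the question's code. It is also worth noting, as an assumption about the slow timings rather than something measured here, that calling strlen in the loop condition rescans the buffer on every iteration unless the compiler hoists it, and that an array filled only with 1s has no terminating 0 at all, so strlen reads past the end of the allocation (undefined behaviour).

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Mirrors the question's approach: the length comes from strlen,
// so summation stops at the first element whose value is 0.
long sum_with_strlen(const char* data) {
    long total = 0;
    const std::size_t len = std::strlen(data); // computed once, not per iteration
    for (std::size_t i = 0; i < len; ++i) {
        total += data[i];
    }
    return total;
}

// Mirrors the vector-based approach: the container knows its own size,
// so elements equal to 0 do not cut the summation short.
long sum_with_size(const std::vector<char>& data) {
    long total = 0;
    for (char value : data) {
        total += value;
    }
    return total;
}
```

For the data {5, 3, 0, 7, 2}, the first function only sees the two elements before the 0, while the second sums all five.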

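Now consider what you are timing: move the initialisation of threads_data[i] and expected_sum outside the timed region, otherwise the millions of rand calls will certainly dominate everything. In any case, you are measuring the sequential part along with the parallel part, so you cannot expect K to make a difference in the timing: you are always measuring at least the sequential part plus the last parallel part (at the join).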
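
One way to keep the setup out of the measurement is to time a single callable that contains only the parallel work. The helper below is a sketch under that assumption; elapsed_ms is a hypothetical name, and std::chrono::steady_clock is swapped in for the question's system_clock because it is monotonic and therefore better suited to measuring intervals.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs work() exactly once and returns the
// elapsed wall-clock time in milliseconds.
template <typename Work>
double elapsed_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

In the question's main, this would mean filling all the arrays and computing expected_sum first, then passing a lambda that contains only the pthread_create loop and the pthread_join loop to elapsed_ms.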

Source

2016-05-17 23:33:56 BeyelerStudios

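I didn't know 'strlen' had this problem when 'char' is used to represent an 'int'. As for the execution time, you are right. I started tracking only the parallel execution, and now the time scales linearly with both 'K' and 'N'. Thanks!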
