POSIX threads and -O3 optimization

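I'm working on a C++ program under Linux that uses MPI (openmpi 1.4.3) and pthreads.

Some of the MPI nodes have a queuing system implemented with pthreads. The idea is simple: one thread adds elements to the queue, and several other "worker" threads pick up objects and work on them (not rocket science).

Consider two examples of how my worker threads pick up elements. The first example works fine unless -O3 optimization is specified. In that case it starts looping endlessly without ever picking anything up.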

    while (true){
        if (t_exitSignal[tID]){
            dorun = false;
            break;
        }

        //cout<<"w8\n";

        //check if queue has some work for us (note: no lock is held here)
        if (!frame_queue->empty()){

            //try to get lock and recheck that the queue is not empty
            pthread_mutex_lock(&mutex_frame_queue);

            if (!frame_queue->empty()){
                cout<<"Pickup "<<tID<<endl;
                con = frame_queue->front();
                frame_queue->pop();
                t_idling[tID] = false;
                pthread_mutex_unlock(&mutex_frame_queue);
                break;
            }

            pthread_mutex_unlock(&mutex_frame_queue);
        }
    }

Now consider this one: exactly the same code, except that the mutex is locked before checking frame_queue->empty(). This works at all optimization levels.

    while (true){
        if (t_exitSignal[tID]){
            dorun = false;
            break;
        }
        //cout<<"w8\n";

        //get the lock before looking at the queue
        pthread_mutex_lock(&mutex_frame_queue);

        //check if queue has some work for us
        if (!frame_queue->empty()){
            cout<<"Pickup "<<tID<<endl;
            con = frame_queue->front();
            frame_queue->pop();
            t_idling[tID] = false;
            pthread_mutex_unlock(&mutex_frame_queue);
            break;
        }
        pthread_mutex_unlock(&mutex_frame_queue);
    }

In case it makes a difference, this is how I fill the queue from the other thread:

    pthread_mutex_lock(&mutex_frame_queue);
    //adding the same container into queue to make it available for threads
    frame_queue->push(*cursor);
    pthread_mutex_unlock(&mutex_frame_queue);

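My question is: why does the code in the first example stop working when I compile with the -O3 option? Any other suggestions for the queuing system?

Thanks a lot!

Solution: this is what I came up with in the end. It seems to work much better than either of the approaches above. (Just in case anyone is interested ;)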

    while (true){
        if (t_exitSignal[tID]){
            dorun = false;
            break;
        }
        //try to get lock and check that the queue is not empty
        pthread_mutex_lock(&mutex_frame_queue);

        if (!frame_queue->empty()){
            con = frame_queue->front();
            frame_queue->pop();
            t_idling[tID] = false;
            pthread_mutex_unlock(&mutex_frame_queue);
            break;
        }else{
            //sleep until the producer signals; the mutex is held again on return
            pthread_cond_wait(&conf_frame_queue, &mutex_frame_queue);
            pthread_mutex_unlock(&mutex_frame_queue);
        }
    }

And on the adding side:

    pthread_mutex_lock(&mutex_frame_queue);
    //adding the same container into queue to make it available for threads
    frame_queue->push(*cursor);
    //wake up any waiting threads
    pthread_cond_signal(&conf_frame_queue);
    pthread_mutex_unlock(&mutex_frame_queue);

Source

2011-07-28 kirbo

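My guess is that you are seeing a bug that comes from assumptions about instruction ordering at the point where you check whether the queue is empty. When you turn on optimization the ordering changes and the code breaks, because you do not hold the mutex at that point, so there is no memory barrier to keep this from happening.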

Source

2011-07-28 23:16:12 Josh

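I'd be tempted to suggest __sync_synchronize() before the first empty check, but that may not be safe: if another thread is in the middle of adding to the container, the container may still be in an inconsistent state when you call empty(). Depending on the container, anything could happen, from a wrong answer to a crash.

Josh is probably right. Locking the mutex also provides a memory barrier, which means your code will re-read the memory it uses to decide whether the container is empty each time. Without some kind of memory barrier that is never actually guaranteed to happen, so at higher optimization levels your code may never see the change.

Also, have you looked at pthread's condition variables? They would let you avoid polling in a loop until your container is no longer empty.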
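If a cheap unlocked pre-check is still wanted, one possible way around the inconsistent-container problem is to check a separate atomic counter instead of calling empty(), and to touch the container itself only under the mutex. The following is only a sketch under assumptions: it needs a C++11 compiler for std::atomic, and the names g_queue, g_pending, produce and try_consume are hypothetical, not part of the original code.

```cpp
#include <atomic>
#include <pthread.h>
#include <queue>

// All names below are illustrative assumptions, not the poster's code.
std::queue<int> g_queue;                               // only touched with g_mutex held
pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
std::atomic<int> g_pending(0);                         // item count, safe to read unlocked

// Producer: modify the container and the counter under the lock.
void produce(int item) {
    pthread_mutex_lock(&g_mutex);
    g_queue.push(item);
    g_pending.fetch_add(1);
    pthread_mutex_unlock(&g_mutex);
}

// Consumer: returns true and fills `out` if an item was taken.
bool try_consume(int &out) {
    // Cheap pre-check on the atomic counter. The atomic load cannot be
    // hoisted out of a polling loop the way a plain read could be.
    if (g_pending.load() == 0) {
        return false;
    }
    pthread_mutex_lock(&g_mutex);
    // Recheck under the lock: another worker may have taken the item.
    bool got = !g_queue.empty();
    if (got) {
        out = g_queue.front();
        g_queue.pop();
        g_pending.fetch_sub(1);
    }
    pthread_mutex_unlock(&g_mutex);
    return got;
}
```

A worker would call try_consume() in its loop in place of the unlocked frame_queue->empty() test. This still polls, so it only addresses the correctness of the pre-check, not the busy waiting.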
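As a rough illustration of the condition-variable approach, here is a minimal sketch of a blocking work queue. The class WorkQueue, its push/pop/shutdown methods, and the run_demo helper are hypothetical names chosen for this example, and int items stand in for whatever the real queue holds. Two details are worth noting: the wait sits in a while loop that re-tests the predicate, since pthread_cond_wait may wake spuriously, and the exit flag is set under the same mutex so that a sleeping worker can be woken to see it.

```cpp
#include <cstddef>
#include <pthread.h>
#include <queue>

// Hypothetical wrapper around a queue, a mutex and a condition variable.
class WorkQueue {
public:
    WorkQueue() : stop_(false) {
        pthread_mutex_init(&mutex_, NULL);
        pthread_cond_init(&cond_, NULL);
    }
    ~WorkQueue() {
        pthread_cond_destroy(&cond_);
        pthread_mutex_destroy(&mutex_);
    }

    // Producer side: add an item and wake one waiting worker.
    void push(int item) {
        pthread_mutex_lock(&mutex_);
        items_.push(item);
        pthread_cond_signal(&cond_);
        pthread_mutex_unlock(&mutex_);
    }

    // Ask all workers to exit once the queue has been drained.
    void shutdown() {
        pthread_mutex_lock(&mutex_);
        stop_ = true;
        pthread_cond_broadcast(&cond_);
        pthread_mutex_unlock(&mutex_);
    }

    // Worker side: block until an item is available or shutdown() was called.
    // Returns false only when the queue is empty and has been shut down.
    bool pop(int &out) {
        pthread_mutex_lock(&mutex_);
        // Re-test the predicate in a loop: the wait may wake spuriously.
        while (items_.empty() && !stop_) {
            pthread_cond_wait(&cond_, &mutex_);
        }
        bool got = !items_.empty();
        if (got) {
            out = items_.front();
            items_.pop();
        }
        pthread_mutex_unlock(&mutex_);
        return got;
    }

private:
    WorkQueue(const WorkQueue &);             // not copyable
    WorkQueue &operator=(const WorkQueue &);

    std::queue<int> items_;
    bool stop_;
    pthread_mutex_t mutex_;
    pthread_cond_t cond_;
};

struct WorkerState {
    WorkQueue *queue;
    long sum;  // written only by the owning worker, read after join
};

void *worker_main(void *arg) {
    WorkerState *state = static_cast<WorkerState *>(arg);
    int item = 0;
    while (state->queue->pop(item)) {
        state->sum += item;  // stand-in for real work
    }
    return NULL;
}

// Push 1..100 to three workers and return the total they processed.
long run_demo() {
    WorkQueue queue;
    WorkerState states[3];
    pthread_t threads[3];
    for (int i = 0; i < 3; ++i) {
        states[i].queue = &queue;
        states[i].sum = 0;
        pthread_create(&threads[i], NULL, worker_main, &states[i]);
    }
    for (int n = 1; n <= 100; ++n) {
        queue.push(n);
    }
    queue.shutdown();
    long total = 0;
    for (int i = 0; i < 3; ++i) {
        pthread_join(threads[i], NULL);
        total += states[i].sum;
    }
    return total;
}
```

Because pop() keeps handing out items after shutdown() until the queue is empty, every pushed item is processed exactly once, and run_demo() returns 5050 regardless of how the work is split between the three workers.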

Source

2011-07-28 23:30:18 LnxPrgr3
