Multithreading: one thread per core
I have this code:
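#include <windows.h>
#include <cstdio>
#include <ctime>
#include <iostream>
using namespace std;

#define threadsNum 4

// Worker: a CPU-bound busy loop; the computed value is discarded.
DWORD WINAPI func(LPVOID vpParam)
{
    long long sum = 0;
    for (int i = 0; i < 400000 / threadsNum; i++)
    {
        for (int j = 0; j < 160000 / threadsNum; j++)
        {
            sum = sum > 1000 ? 0 : sum + 1;
        }
    }
    return 1;
}

int main()
{
    clock_t timer = clock();
    int CPUs = 4;
    DWORD_PTR threadCore = 1;
    DWORD threadID = 0; // CreateThread expects a pointer to DWORD here
    int addNum = 0;
    HANDLE* threads = new HANDLE[threadsNum];
    for (int i = 0; i < threadsNum; i++)
    {
        // Round-robin over the cores: thread i is pinned to core i % CPUs.
        threadCore = (DWORD_PTR)1 << addNum;
        addNum++;
        if (addNum == CPUs)
            addNum = 0;
        threads[i] = CreateThread(0, 0, func, NULL, 0, &threadID);
        SetThreadAffinityMask(threads[i], threadCore);
    }
    if (WaitForMultipleObjects(threadsNum, threads, TRUE, INFINITE) == WAIT_FAILED)
        FatalAppExitA(0, "FAIL");
    cout << clock() - timer << endl;
    // Release the thread handles and the handle array.
    for (int i = 0; i < threadsNum; i++)
        CloseHandle(threads[i]);
    delete[] threads;
    getchar();
    return 0;
}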
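My computer has 4 cores. As threadsNum increases, the running time gets shorter. When threadsNum is 4 the output is 22325, and when it is 8 the output is 11549. Why? Each core should be doing the same amount of work: with threadsNum = 8 each core runs 2 threads, and together they should do the same work as the single thread per core when threadsNum = 4. So why is it faster?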
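One way to check the "same work" assumption is to count the inner-loop iterations implied by the loop bounds in func. The sketch below is only an illustration: the helper names iterationsPerThread, totalIterations and iterationsPerCore are hypothetical and not part of the code above, and it assumes the threads are spread evenly over the cores, as the round-robin affinity mask is meant to do.

```cpp
#include <cstdint>

// Inner-loop iterations executed by ONE thread, using the same loop
// bounds as func(): (400000 / threadsNum) * (160000 / threadsNum).
constexpr std::int64_t iterationsPerThread(std::int64_t threadsNum) {
    return (400000 / threadsNum) * (160000 / threadsNum);
}

// Inner-loop iterations executed by ALL threads together.
constexpr std::int64_t totalIterations(std::int64_t threadsNum) {
    return threadsNum * iterationsPerThread(threadsNum);
}

// Iterations each core has to run, assuming (hypothetically) that the
// threads are distributed evenly across `cores` cores.
constexpr std::int64_t iterationsPerCore(std::int64_t threadsNum,
                                         std::int64_t cores) {
    return totalIterations(threadsNum) / cores;
}
```

With 4 cores, iterationsPerCore(4, 4) is 4,000,000,000 while iterationsPerCore(8, 4) is 2,000,000,000. Because both loop bounds are divided by threadsNum, the total work is not constant: it halves when threadsNum doubles, which would be consistent with the measured times of 22325 and 11549.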
Threads may be preempted, and may be doing some IO ... – Theolodis