Java - 多線程一個大循環

這可能是一個非常簡單的問題，但是因爲我從未與線程一起工作過，所以我認爲最好問問題而不是嘗試完全依靠自己找到最佳解決方案。Java - 多線程一個大循環

我有一個巨大的for循環運行幾十億次。在循環運行中，根據當前的index，程序以數字形式計算最終結果。我只對存儲頂部result（或頂部x個結果）及其相應的索引感興趣。

我的問題很簡單，在線程中運行此循環的正確方式是什麼，因此它使用所有可用的CPU /內核。

int topResultIndex; 
double topResult = 0; 

for (i=1; i < 1000000000; ++i) { 
    double result = // some complicated calculation based on the current index 
    if (result > topResult) { 
     topResult = result; 
     topResultIndex = i; 
    } 
}

該計算是完全獨立的每個索引，沒有資源共享。每個線程都會明顯訪問topResultIndex和topResult。

*更新：Giulio's和rolfl的解決方案都很好，也非常相似。只能接受其中之一作爲我的回答。

來源

2013-11-03 SportySpice

計算是否獨立於每個索引或將要共享資源進行計算？ – Jeffrey

對於每個索引，計算是完全獨立的 – SportySpice

如果循環受CPU限制，多線程*將*增加其速度（按Amdahl定律給出的一個因子）。如果瓶頸是內存，它將無法工作（因爲多線程不會使內存運行得更快） –

假設這個結果是由calculateResult(long)方法，它是私有的，靜態的計算，並且不訪問任何靜態字段，（也可以是非靜態的，但它仍然必須是線程安全的，併發可執行的，希望線程受限）。

然後，我認爲這會做骯髒的工作：

public static class Response { 
    int index; 
    double result; 
} 

private static class MyTask implements Callable<Response> { 
    private long from; 
    private long to; 

    public MyTask(long fromIndexInclusive, long toIndexExclusive) { 
     this.from = fromIndexInclusive; 
     this.to = toIndexExclusive; 
    } 

    public Response call() { 
     int topResultIndex; 
     double topResult = 0; 

     for (long i = from; i < to; ++i) { 
      double result = calculateResult(i); 
      if (result > topResult) { 
       topResult = result; 
       topResultIndex = i; 
      } 
     } 

     Response res = new Response(); 
     res.index = topResultIndex; 
     res.result = topResult; 
     return res; 
    } 
}; 

private static calculateResult(long index) { ... } 

public Response interfaceMethod() { 
    //You might want to make this static/shared/global 
    ExecutorService svc = Executors.newCachedThreadPool(); 

    int chunks = Runtime.getRuntime().availableProcessors(); 
    long iterations = 1000000000; 
    MyTask[] tasks = new MyTask[chunks]; 
    for (int i = 0; i < chunks; ++i) { 
     //You'd better cast to double and round here 
     tasks[i] = new MyTask(iterations/chunks * i, iterations/chunks * (i + 1)); 
    } 

    List<Future<Response>> resp = svc.invokeAll(Arrays.asList(tasks)); 
    Iterator<Future<Response>> respIt = resp.iterator(); 

    //You'll have to handle exceptions here 
    Response bestResponse = respIt.next().get(); 

    while (respIt.hasNext()) { 
     Response r = respIt.next().get(); 
     if (r.result > bestResponse.result) { 
      bestResponse = r; 
     } 
    } 

    return bestResponse; 
}

從我的經驗，這種劃分成塊的速度要快得多，有一個任務，每個索引（尤其是如果每個單項指標的計算負載是小的，就像它可能是小的，我的意思是不到半秒）。然而，編碼有點困難，因爲你需要進行兩步最大化（首先是塊級，然後是全局級）。有了這個，如果計算純粹是基於cpu的（不推動ram太多），你應該得到的加速幾乎等於物理內核數量的80％。

來源

2013-11-03 02:47:01

'calculateResult（）'包含一個包含一點點等待的網絡請求（如第二個或兩個）的情況怎麼樣？你會爲每個索引分配一個任務嗎？ – splinter123

仍然取決於您需要製作多少個請求。一般來說，我會在任務和索引之間建立1：1的關係，但是你不想用等待的線程吞噬你的操作系統。在普通臺式機/筆記本電腦上，對於網絡或硬盤驅動器請求，我會避免從一次調用中產生超過20-30個線程。還要考慮你需要等待的設備有其自身的限制，將太多的請求放在它上面也無濟於事。無論如何，當任務是IO限制時，任務的數量可能遠遠大於CPU的數量 –

最簡單的方法是使用ExecutorService並將您的任務作爲Runnable或Callable提交。您可以使用Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors())創建一個ExeuctorService，它將使用與處理器相同數量的線程。

來源

2013-11-03 02:26:30 Jeffrey

好吧，但我的「任務」是什麼。我應該分開循環嗎？例如，如果我在2個處理器上運行一個從0到500000000，另一個從500000001到1000000000？還是有更合適的解決方案？或者你的意思是每個循環運行本身將是一個新的可運行？（聽起來不那麼聰明） – SportySpice

你的任務是漫長而複雜的計算。製作一個像getResult（int index）這樣的方法，並將其用作你的任務。 – Jeffrey

除了觀察到使用OpenMP或某些其他並行計算擴展的C程序會是一個更好的想法之外，Java的方式是創建一個'Future'Task來計算問題的一個子集：

private static final class Result { 
    final int index; 
    final double result; 
    public Result (int index, double result) { 
     this.result = result; 
     this.index = index; 
    } 
} 

// Calculate 10,000 values in each thead 
int steps = 10000; 
int cpucount = Runtime.getRuntime().availableProcessors(); 
ExecutorService service = Executors.newFixedThreadPool(cpucount); 
ArrayList<Future<Result>> results = new ArrayList<>(); 
for (int i = 0; i < 1000000000; i+= steps) { 
    final int from = i; 
    final int to = from + steps; 
    results.add(service.submit(new Callable<Result>() { 
     public Result call() { 
       int topResultIndex = -1; 
       double topResult = 0; 
       for (int j = from; j < to; j++) { 
        // do complicated things with 'j' 
         double result = // some complicated calculation based on the current index 
         if (result > topResult) { 
          topResult = result; 
          topResultIndex = j; 
         } 
       } 
       return new Result(topResultIndex, topResult); 
     } 
    }); 
} 

service.shutdown(); 
while (!service.isTerminated()) { 
    System.out.println("Waiting for threads to complete"); 
    service.awaitTermination(10, TimeUnit.SECONDS); 
} 
Result best = null; 
for (Future<Result> fut : results) { 
    if (best == null || fut.result > best.result) { 
     best = fut; 
    } 
} 

System.out.printf("Best result is %f at index %d\n", best.result, best.index); 

Future<Result>

來源

2013-11-03 02:34:19 rolfl

非常感謝您的深入解決方案。你認爲這會在C/C++中運行得更好嗎？我的整個程序都是用java編寫的，但我真的想過用C++編寫程序的這一部分，如果它確實會有顯着的改進。 – SportySpice

實際上，調用'Runtime.getRuntime（）。availableProcessors（）'將返回沒有任何外部庫的擁有者數 – vandale

編輯，感謝vandale。 @SportySpice - 這取決於情況。 Java JIT可能能夠在您的特定情況下進行緊密競爭。 – rolfl

Java - 多線程一個大循環

回答

相關問題