Can someone explain why parLapply is the slowest here? Why does doParallel in R perform worse in this case?
> library(parallel)         # provides makeCluster() and parLapply()
> library(microbenchmark)
> no_cores <- detectCores() # assumed; defined earlier in the original session
> cl <- makeCluster(no_cores)
> myVar <- 2:4000
> microbenchmark(
+   Reduce("+", parLapply(cl, myVar, function(X) X^2)),
+   Reduce("+", lapply(myVar, function(X) X^2)),
+   Reduce("+", myVar^2)
+ )
Unit: milliseconds
                                               expr      min       lq     mean   median       uq       max neval
 Reduce("+", parLapply(cl, myVar, function(X) X^2)) 6.988662 8.041860 9.061966 8.901447 9.916621 14.541828   100
        Reduce("+", lapply(myVar, function(X) X^2)) 5.256892 5.626853 6.892995 6.259239 8.165724 11.112812   100
                               Reduce("+", myVar^2) 1.930513 2.137887 2.613923 2.279481 3.000740  6.194623   100
Based on the comments, I added a sum-based implementation and a vectorized implementation:
> vec_exp <- Vectorize(function(x) x^2)
> cl <- makeCluster(no_cores)
> myVar <- 2:4000
> microbenchmark(
+   Reduce("+", parLapply(cl, myVar, function(X) X^2)),
+   Reduce("+", lapply(myVar, function(X) X^2)),
+   Reduce("+", myVar^2),
+   sum(myVar^2),
+   Reduce("+", vec_exp(myVar))
+ )
Unit: microseconds
                                               expr      min       lq       mean   median       uq       max neval
 Reduce("+", parLapply(cl, myVar, function(X) X^2)) 6880.426 7086.400 7589.02901 7253.886 7625.246 12055.674   100
        Reduce("+", lapply(myVar, function(X) X^2)) 5073.078 5356.030 5826.33276 5478.029 5728.324  8472.236   100
                               Reduce("+", myVar^2) 1922.582 1998.861 2174.07136 2041.548 2129.023  4427.864   100
                                       sum(myVar^2)   13.530   17.495   19.65554    18.662   20.528    34.990   100
                        Reduce("+", vec_exp(myVar)) 5686.102 5967.655 6632.46879 6210.952 6671.186 16191.488   100
Parallelization only makes sense when the parallelized operation is much more expensive than copying the data. Computing x^2 costs more or less the same as copying x, so this is not a suitable use case for parallelization; you are only generating overhead. –
Since this is highly vectorizable, compare the timing with sum(myVar^2). See also @user2589273's answer. – Rentrop
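To make the commenters' point concrete: once the per-element work dwarfs the cost of shipping the data, parLapply does win. A sketch with a hypothetical slow_sq workload (slow_sq, the worker count, and the timing settings are all illustrative, not from the original post):

library(parallel)
library(microbenchmark)

cl <- makeCluster(2)                              # assumed 2 workers for illustration
slow_sq <- function(x) { Sys.sleep(0.001); x^2 }  # simulate expensive per-element work

microbenchmark(
  parallel   = Reduce("+", parLapply(cl, 1:100, slow_sq)),
  sequential = Reduce("+", lapply(1:100, slow_sq)),
  times = 5
)

stopCluster(cl)

With roughly 1 ms of work per element, the sequential version needs about 100 ms per run, while two workers split the 100 elements and finish in about half that, comfortably more than the communication overhead measured above.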