我有這個看似微不足道的並行快速排序的實現,代碼如下:是否有可能在Haskell中加快par的速度?
import System.Random
import Control.Parallel
import Data.List
quicksort :: [a] -> [a]
quicksort xs = pQuicksort 16 xs -- 16 is the number of sparks used to sort
-- pQuicksort, parallelQuicksort
-- As long as n > 0 evaluates the lower and upper part of the list in parallel,
-- when we have recursed deep enough, n==0, this turns into a serial quicksort.
pQuicksort :: Int -> [a] -> [a]
pQuicksort _ [] = []
pQuicksort 0 (x:xs) =
let (lower, upper) = partition (< x) xs
in pQuicksort 0 lower ++ [x] ++ pQuicksort 0 upper
pQuicksort n (x:xs) =
let (lower, upper) = partition (< x) xs
l = pQuicksort (n `div` 2) lower
u = [x] ++ pQuicksort (n `div` 2) upper
in (par u l) ++ u
main :: IO()
main = do
gen <- getStdGen
let randints = (take 5000000) $ randoms gen :: [Int]
putStrLn . show . sum $ (quicksort randints)
我編譯
ghc --make -threaded -O2 quicksort.hs
與
./quicksort +RTS -N16 -RTS
運行無論我做什麼我無法讓它比運行在一個cpu上的簡單順序實現運行得更快。
- 是否有可能解釋爲什麼它在多個CPU上的運行速度比在一個上運行速度慢得多?
- 是否有可能通過做一些技巧來使這個縮放比例(至少是線性的)與CPU的數量成正比?
編輯:@tempestadept暗示,快速排序它是自己的問題。爲了檢查這一點,我以與上例相同的精神實施了一個簡單的合併排序。它具有相同的行爲,執行速度越慢,添加的功能越多。
import System.Random
import Control.Parallel
splitList :: [a] -> ([a], [a])
splitList = helper True [] []
where helper _ left right [] = (left, right)
helper True left right (x:xs) = helper False (x:left) right xs
helper False left right (x:xs) = helper True left (x:right) xs
merge :: (Ord a) => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys) = case x<y of
True -> x : merge xs (y:ys)
False -> y : merge (x:xs) ys
mergeSort :: (Ord a) => [a] -> [a]
mergeSort xs = pMergeSort 16 xs -- we use 16 sparks
-- pMergeSort, parallel merge sort. Takes an extra argument
-- telling how many sparks to create. In our simple test it is
-- set to 16
pMergeSort :: (Ord a) => Int -> [a] -> [a]
pMergeSort _ [] = []
pMergeSort _ [a] = [a]
pMergeSort 0 xs =
let (left, right) = splitList xs
in merge (pMergeSort 0 left) (pMergeSort 0 right)
pMergeSort n xs =
let (left, right) = splitList xs
l = pMergeSort (n `div` 2) left
r = pMergeSort (n `div` 2) right
in (r `par` l) `pseq` (merge l r)
ris :: Int -> IO [Int]
ris n = do
gen <- getStdGen
return . (take n) $ randoms gen
main = do
r <- ris 100000
putStrLn . show . sum $ mergeSort r
請注意,這實際上是一個quicksort的實現:http://stackoverflow.com/questions/7717691/why-is-the-minimalist-example-haskell-quicksort-not-a-true-quicksort – ErikR
至少我即使使用'sum'清除任何可能的thunk時,也無法使用'pseq'將其執行得更好。也許有一個完全不同的問題。 - 現在我已經通過回答刪除,這裏再次作爲評論:1。命名該函數只是'quicksort'可能會造成混淆,因爲您不希望這樣的函數接受額外的並行性參數; 2.使用類型簽名,只有_always_用於頂層函數,甚至當它們可能與名稱建議略有不同時使用; 3.如果可能的話,使用庫函數,例如'partition'。 - 好問題,順便說一句。 – leftaroundabout
我沒有足夠的時間發佈完整的答案,但我想有兩個可能的問題:(1)你應該使用'''''''''''''''''' 。 (2)雖然你並行運行子計算,但是直到需要時才進行真正的評估。所以你應該強制每個子列表到NF(或者至少是它的完整結構),像'''forceList''''' forceList''''''''forceList''擁有)強制評估列表的功能。另外爲了適當的基準測試,我建議使用[標準](http://hackage.haskell.org/package/criterion)。 –