2012-03-12 31 views
0

我正在學習思考和編碼haskell。 「最小數字贏」比賽:n人對1到n之間的數字進行投注,而只有一次下注的最小數字贏得比賽。計算所有可能遊戲的「最小數量獲勝」遊戲結果 - 內存不足

我正在計算n = 10的所有可能的投注系列並計算贏家數字。是的,這段代碼並不完全是這樣,但那不是我的觀點,而是我的代碼,它的內存耗盡速度相對較快。

(添加的註釋 - !對不起)

import Data.Array 
import Data.List 

f xs = flip map [1..10] $ flip (:) xs 
p 1 = f [] 
p n = concat $ map f $ p (n-1) 
--the above, (p n) generates the list of all possible [a1, a2, ..., an] lists, where ai=1..10 
--p 2 = [[1,1],[2,1],[3,1],[4,1],[5,1],...,[10,10] 

--my first shot at the countidens function, the functionality stays the same with the other 
--countidens2 xs = map (\x->(head x, length x)) $ group $ sort xs 

countidens' xs = accumArray (+) 0 (1,10) $ zip xs $ repeat 1 
countidens xs = filter ((/=) 0 . snd) $ zip [1..10] $ map ((countidens' xs)!) [1..10] 
--counts the number of occurrences of each number (1..10) in a list 
--countidens [1,1,1,2,2,3] = (1,3),(2,2),(3,1)] 
--(the above, countidens2 is much easier to understand) 

numlist n = map (flip (++) ([(0,0)])) $ map countidens $ p n 
--maps countidens on the (p n) list, and attaches a dummy (0,0) to the end (this is needed later) 

g (x, (y, z)) | (x==y) && (z==1) = True 
       | (x < y)    = True 
       | (y==0)    = True 
       | otherwise   = False 
-- filter function for [(a, (a,a)] lists - (a1, (a1, a)) -> Bool 

winners n = map fst $ map (head . filter g) $ map (zip [1..]) $ numlist n 
-- extracts the number of the first element of (numlist n) that qualifies as g 
-- for each element of g (note: these are results of the countidens function, since that was mapped) 
-- the dummy (0,0) was needed so there's always one that does 

winnernumsarr n = accumArray (+) 0 (1,10) $ flip zip (repeat 1) $ winners n 
-- winners n produces a simple list of integers (1..10) that is 10^n long, this (winnernumsarr) accumulates the number of each integer, much like countidens did 
-- (but does not produce a fancy output) 

main = putStrLn $ show $ winnernumsarr 7 -- aiming for 10! even 8 runs out of memory on my machine 

雖然我知道這個代碼沒有做,我想它做什麼,什麼是更重要的是,這不是我第一次遇到haskell的「內存不足」問題,並且我知道的問題可能會用C++編寫,並且只使用少量內存。

必須有一種方式 - 但如何?

+1

如果你用類型註釋每個函數,並給它一個註釋來描述它應該做什麼,這可能會有所幫助 – 2012-03-12 20:02:37

回答

1

這裏有兩件事很重要。鍵入簽名和拆箱數組。

module Main (main) where 

import Data.Array.Unboxed 
import Data.List 

f xs = flip map [1..10] $ flip (:) xs 
p 1 = f [] 
p n = concat $ map f $ p (n-1) 

--my first shot at the countidens function, the functionality stays the same with the other 
--countidens2 xs = map (\x->(head x, length x)) $ group $ sort xs 

countidens' :: [Int] -> UArray Int Int 
countidens' xs = accumArray (+) 0 (1,10) $ zip xs $ repeat 1 

countidens xs = filter ((/=) 0 . snd) $ assocs (countidens' xs) 

numlist n = map (flip (++) ([(0,0)])) $ map countidens $ p n 

g (x, (y, z)) | (x==y) && (z==1) = True 
       | (x < y)    = True 
       | (y==0)    = True 
       | otherwise   = False 

winners n = map fst $ map (head . filter g) $ map (zip [1..]) $ numlist n 

winnernumsarr :: Int -> UArray Int Int 
winnernumsarr n = accumArray (+) 0 (1,10) $ flip zip (repeat 1) $ winners n 
main = putStrLn $ show $ winnernumsarr 7 

運行在小的空間,雖然很慢(花費8約50秒,4.9秒7)。

當您使用盒裝數組時,accumArray不會向數組寫入普通數字,而是向下。在winnernumsarr,thunk變得巨大。這需要很多內存,並且需要大量的堆棧空間才能在最後進行評估。使用unboxed數組時,添加會在它們來時執行,而不是構建巨大的thunk。

類型簽名對於修復要打印的數組類型並使所有出現的數字類型Int用於較少的分配和較高的速度是必需的。

更慣用的版本,而不改變算法,是

module Main (main) where 

import Data.Array.Unboxed 
import Data.List 

p :: Int -> [[Int]] 
p 0 = [[]] 
p n = [k:xs | xs <- p (n-1), k <- [1 .. 10]] 

countidens' :: [Int] -> UArray Int Int 
countidens' xs = accumArray (+) 0 (1,10) $ map (\k -> (k,1)) xs 

countidens :: [Int] -> [(Int,Int)] 
countidens = filter ((/=) 0 . snd) . assocs . countidens' 

numlist n = map ((++[(0,0)]) . countidens) $ p n 

g :: (Int,(Int,Int)) -> Bool 
g (x, (y, z)) | (x==y) && (z==1) = True 
       | (x < y)    = True 
       | (y==0)    = True 
       | otherwise   = False 

winners :: Int -> [Int] 
winners n = map fst $ map (head . filter g) $ map (zip [1..]) $ numlist n 

winnernumsarr :: Int -> UArray Int Int 
winnernumsarr n = accumArray (+) 0 (1,10) $ map (\k -> (k,1)) $ winners n 

main :: IO() 
main = print $ winnernumsarr 7 

這也是更快。加速的一小部分來自於GHC可以更好地優化這種形式的列表生成函數p的事實,大部分來自於用map (\k -> (k,1)) xs代替zip xs (repeat 1)。我必須承認,我不明白爲什麼會產生如此大的差異,但zip必須與_ : _匹配這兩個列表,而map只需匹配xs,這可以節省一些工作量。

+0

謝謝!工作得很好。運行緩慢不是一個問題,我仍然可以繼續努力。 另外我不知道關於這些accocs的問題 - 在我的雜亂代碼中發現很好:) – 2012-03-12 20:46:28

0

我很難完全理解你的代碼是做什麼的,所以我只寫了一個函數bets,它需要玩家的數量並且吐出所有可能下注的懶惰列表。

-- `bets n` calculates all possible sequences of bets with `n` players. 
-- It returns a list of lists, each sub-list being `n` in length 
bets :: Int -> [[Int]] 
bets n = bets' n 
    where bets' :: Int -> [[Int]] -- use separate function so we always have the total `n` available 
     bets' n' 
      | n' == 0 = [[]] 
      | n' > 0 = concatMap step $ bets' (pred n') 
      | otherwise = error "bets: negative number of players" 
     step :: [Int] -> [[Int]] 
     step bs = zipWith (:) [1..n] (repeat bs) 

我用n == 5進行了測試,效果很好。不過,我不知道n == 10會帶來什麼樣的表現,所以有可能這對你來說太慢了。

0

我建議你從數組,甚至是拆箱數組切換到Vector庫。界面更加豐富,矢量的基於融合的實現通常可以帶來性能優勢。

下面是使用Vector等同版本,其中一些的丹尼爾·費舍爾的變化納入:

{-# LANGUAGE TupleSections #-} 

import qualified Data.Vector.Unboxed as V 
import Data.List 

p :: Int -> [[Int]] 
p 0 = [[]] 
p n = [k:xs | xs <- p (n-1), k <- [1 .. 10]] 

countidens' :: [Int] -> V.Vector Int 
countidens' xs = V.accum (+) (V.replicate 11 0) $ map (,1) xs 

countidens = V.filter ((/= 0) . snd) . V.indexed . countidens' 

numlist = map ((`V.snoc` (0,0)) . countidens) . p 

g (x, (y, z)) | (x==y) && (z==1) = True 
       | (x < y)    = True 
       | (y==0)    = True 
       | otherwise   = False 

winners n = map (fst . V.head . V.filter g . V.imap (\ix a -> (ix+1,a))) $ numlist n 
winnernumsarr :: Int -> V.Vector (Int,Int) 
winnernumsarr n = V.tail . V.indexed $ V.accum (+) (V.replicate 11 0) 
    $ flip zip (repeat 1) $ winners n 
main = putStrLn $ show $ winnernumsarr 8 

在我的系統,這減少從49S到31S運行時,有兩個節目有「-02 -msse2」編譯。

兩個注意事項:首先,Vector實現向量,所以如果您需要多維索引,您可能想要保留數組。其次,向量是0索引的,因此您可能需要對代碼的其餘部分進行適當的調整。

+0

其實,'accumArray'使用可變數組。這是一個'runST東西'的東西,不要複製。 – 2012-03-12 22:20:16

+0

@DanielFischer - mea culpa,我正在查看過時的數組代碼(array-0.2.0.0作爲記錄)。這就解釋了爲什麼'RTS -s'的結果不符合我的預期,但並不能解釋爲什麼矢量更快。 – 2012-03-12 23:29:10

+0

元組部分。 'map(,1)xs'比'zip xs(repeat 1)'快。對那些不想擴展的用戶(或者'map(\ k->(k,1))'),'UArray'比這裏的'Vector'代碼快。我不知道爲什麼,但這是一個小實驗顯示。 – 2012-03-12 23:47:43