Haskell中的記憶？

222

我們可以非常有效地做到這一點，使我們可以在亞線性時間索引的結構。

但首先，

{-# LANGUAGE BangPatterns #-} 

import Data.Function (fix)

讓我們來定義f，但要使用「開放遞歸」，而不是直接調用本身。

f :: (Int -> Int) -> Int -> Int 
f mf 0 = 0 
f mf n = max n $ mf (n `div` 2) + 
       mf (n `div` 3) + 
       mf (n `div` 4)

您可以通過使用fix f

得到unmemoized f這將讓你測試f做你的意思爲f小的值通過調用，例如：fix f 123 = 144

我們可以memoize的這通過定義：

f_list :: [Int] 
f_list = map (f faster_f) [0..] 

faster_f :: Int -> Int 
faster_f n = f_list !! n

這表現得很好，而且repl王牌什麼是要採取O（n^3）時間記憶中間結果的東西。

但是，它仍然需要線性時間來索引以找到mf的記憶答案。這意味着結果是這樣的：

*Main Data.List> faster_f 123801 
248604

是可以容忍的，但結果並沒有比這個好得多。我們可以做得更好！

首先，讓我們定義一個無限樹：

data Tree a = Tree (Tree a) a (Tree a) 
instance Functor Tree where 
    fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r)

然後，我們將定義一個方法來索引，所以我們可以在O（log n）的時間找到索引n節點而不是：

index :: Tree a -> Int -> a 
index (Tree _ m _) 0 = m 
index (Tree l _ r) n = case (n - 1) `divMod` 2 of 
    (q,0) -> index l q 
    (q,1) -> index r q

...我們會發現一個完整的自然數的樹要方便，所以我們不必擺弄那些索引：

nats :: Tree Int 
nats = go 0 1 
    where 
     go !n !s = Tree (go l s') n (go r s') 
      where 
       l = n + s 
       r = l + s 
       s' = s * 2

既然我們能指數，你可以轉換一棵樹到一個列表：

toList :: Tree a -> [a] 
toList as = map (index as) [0..]

您可以通過驗證toList nats給你[0..]

現在檢查工作至今，

f_tree :: Tree Int 
f_tree = fmap (f fastest_f) nats 

fastest_f :: Int -> Int 
fastest_f = index f_tree

的工作方式與上面的列表類似，但不是花費線性時間來查找每個節點，而是可以在對數時間內追蹤它。

結果是相當快：

*Main> fastest_f 12380192300 
67652175206 

*Main> fastest_f 12793129379123 
120695231674999

事實上，它是如此之快，你可以去通過，並與Integer上述替代Int並獲得大的離譜的答案几乎是瞬間

*Main> fastest_f' 1230891823091823018203123 
93721573993600178112200489 

*Main> fastest_f' 12308918230918230182031231231293810923 
11097012733777002208302545289166620866358

來源

2010-07-09 01:12:48

+3

我試過這段代碼，有趣的是，f_faster似乎比f慢。我想這些列表引用真的放慢了速度。 nats和index的定義對我來說似乎很神祕，所以我添加了我自己的答案，這可能會讓事情變得更加清晰。 – Pitarou 2012-06-16 04:41:33

+0

@EdwardKmett我花了數小時的時間學習/研究這是如何工作的，它非常聰明。但是我不能找到的是，爲什麼無限列表需要更多的記憶，然後無限樹？例如，如果您在觀看ghci的內存使用情況時調用「speeds_f 111111111」，則可以看到它幾乎無用。但是當你調用faster_f 111111111時，它使用大約1.5GB，然後ghci結束，因爲我內存不足。我已經使用ghci's：set + s測試了他們的後續調用，並且faster_f確實將速度提高到幾乎沒有，而且faster_f也提高了。發生什麼了？ – QuantumKarl 2013-12-08 22:01:14

+5

無限列表案例必須處理長111111111項的鏈表。樹狀結構處理的是log n *達到的節點數量。 – 2013-12-17 07:15:31

11

不是最有效的方式，但確實memoize的：

f = 0 : [ g n | n <- [1..] ] 
    where g n = max n $ f!!(n `div` 2) + f!!(n `div` 3) + f!!(n `div` 4)

請求 f !! 144時

，檢查出f !! 143存在，但不計算其精確值。它仍然是一些未知的計算結果。計算出的唯一精確值是需要的值。

所以最初，只要計算了多少，程序一無所知。

f = ....

當我們請求f !! 12，它開始做一些模式匹配：

f = 0 : g 1 : g 2 : g 3 : g 4 : g 5 : g 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

現在開始計算

f !! 12 = g 12 = max 12 $ f!!6 + f!!4 + f!!3

這遞歸使得F上的另一個需求，所以我們計算

f !! 6 = g 6 = max 6 $ f !! 3 + f !! 2 + f !! 1 
f !! 3 = g 3 = max 3 $ f !! 1 + f !! 1 + f !! 0 
f !! 1 = g 1 = max 1 $ f !! 0 + f !! 0 + f !! 0 
f !! 0 = 0

現在我們可以滴入備份一些

f !! 1 = g 1 = max 1 $ 0 + 0 + 0 = 1

這意味着該計劃現在知道：

f = 0 : 1 : g 2 : g 3 : g 4 : g 5 : g 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

繼續淌了起來：

f !! 3 = g 3 = max 3 $ 1 + 1 + 0 = 3

這意味着現在的程序知道：

f = 0 : 1 : g 2 : 3 : g 4 : g 5 : g 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

現在我們將繼續我們的f!!6計算：

f !! 6 = g 6 = max 6 $ 3 + f !! 2 + 1 
f !! 2 = g 2 = max 2 $ f !! 1 + f !! 0 + f !! 0 = max 2 $ 1 + 0 + 0 = 2 
f !! 6 = g 6 = max 6 $ 3 + 2 + 1 = 6

這意味着該計劃現在知道：

f = 0 : 1 : 2 : 3 : g 4 : g 5 : 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

現在我們將繼續我們的f!!12計算：

f !! 12 = g 12 = max 12 $ 6 + f!!4 + 3 
f !! 4 = g 4 = max 4 $ f !! 2 + f !! 1 + f !! 1 = max 4 $ 2 + 1 + 1 = 4 
f !! 12 = g 12 = max 12 $ 6 + 4 + 3 = 13

這意味着該程序現在知道：

f = 0 : 1 : 2 : 3 : 4 : g 5 : 6 : g 7 : g 8 : g 9 : g 10 : g 11 : 13 : ...

所以計算是相當懶散地完成的。該程序知道存在f !! 8的某個值，它等於g 8，但它不知道g 8是什麼。

來源

2010-07-08 22:00:36 rampion

+0

謝謝你。我對Haskell仍然很陌生，所以我的答案中有很多東西需要理解，但我會嘗試。 – 2010-07-08 22:03:51

+0

謝謝你這個。你將如何創建和使用2維解決方案空間？這是列表清單嗎？和'g n m =（某物）f !! a !! b' – vikingsteve 2014-01-06 08:21:12

+1

當然，你可以。不過，對於真正的解決方案，我可能會使用memoization庫，如[memocombinators]（http://ocharles.org.uk/blog/posts/2013-12-08-24-days-of-hackage-data -memocombinators.html） – rampion 2014-01-07 03:14:08

7

這是Edward Kmett的出色答案的附錄。

當我嘗試他的代碼時，nats和index的定義看起來很神祕，所以我編寫了一個我覺得更容易理解的替代版本。

根據index'和nats'定義index和nats。

index' t n定義在範圍[1..]。（回想一下，index t定義在範圍[0..]上。）它的工作原理是將n視爲一串比特，然後反向讀取比特。如果該位是1，則需要右手分支。如果該位是0，它將採用左側分支。它在到達最後一位時停止（它必須是1）。

index' (Tree l m r) 1 = m 
index' (Tree l m r) n = case n `divMod` 2 of 
          (n', 0) -> index' l n' 
          (n', 1) -> index' r n'

正如nats爲index定義，以便index nats n == n始終是真實的，nats'爲index'定義。

nats' = Tree l 1 r 
    where 
    l = fmap (\n -> n*2)  nats' 
    r = fmap (\n -> n*2 + 1) nats' 
    nats' = Tree l 1 r

現在，nats和index只是nats'和index'但與1移值：

index t n = index' t (n+1) 
nats = fmap (\n -> n-1) nats'

來源

2012-06-16 04:59:35 Pitarou

+0

謝謝。我正在記憶一個多元函數，這真的幫助我確定了索引和nats實際上在做什麼。 – Kittsil 2017-03-03 05:55:31

16

Edward's answer是這樣一個美妙的寶石，我已經複製並提供memoList實現和memoTree以開放遞歸形式記憶函數的組合器。

{-# LANGUAGE BangPatterns #-} 

import Data.Function (fix) 

f :: (Integer -> Integer) -> Integer -> Integer 
f mf 0 = 0 
f mf n = max n $ mf (div n 2) + 
       mf (div n 3) + 
       mf (div n 4) 


-- Memoizing using a list 

-- The memoizing functionality depends on this being in eta reduced form! 
memoList :: ((Integer -> Integer) -> Integer -> Integer) -> Integer -> Integer 
memoList f = memoList_f 
    where memoList_f = (memo !!) . fromInteger 
     memo = map (f memoList_f) [0..] 

faster_f :: Integer -> Integer 
faster_f = memoList f 


-- Memoizing using a tree 

data Tree a = Tree (Tree a) a (Tree a) 
instance Functor Tree where 
    fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r) 

index :: Tree a -> Integer -> a 
index (Tree _ m _) 0 = m 
index (Tree l _ r) n = case (n - 1) `divMod` 2 of 
    (q,0) -> index l q 
    (q,1) -> index r q 

nats :: Tree Integer 
nats = go 0 1 
    where 
     go !n !s = Tree (go l s') n (go r s') 
      where 
       l = n + s 
       r = l + s 
       s' = s * 2 

toList :: Tree a -> [a] 
toList as = map (index as) [0..] 

-- The memoizing functionality depends on this being in eta reduced form! 
memoTree :: ((Integer -> Integer) -> Integer -> Integer) -> Integer -> Integer 
memoTree f = memoTree_f 
    where memoTree_f = index memo 
     memo = fmap (f memoTree_f) nats 

fastest_f :: Integer -> Integer 
fastest_f = memoTree f

來源

2013-02-24 23:18:57

1

另一個增編愛德華Kmett的回答是：一個自包含的例子：

fib = memoNat f 
    where f 0 = 0 
     f 1 = 1 
     f n = fib (n-1) + fib (n-2)

：

data NatTrie v = NatTrie (NatTrie v) v (NatTrie v) 

memo1 arg_to_index index_to_arg f = (\n -> index nats (arg_to_index n)) 
    where nats = go 0 1 
     go i s = NatTrie (go (i+s) s') (f (index_to_arg i)) (go (i+s') s') 
      where s' = 2*s 
     index (NatTrie l v r) i 
      | i < 0 = f (index_to_arg i) 
      | i == 0 = v 
      | otherwise = case (i-1) `divMod` 2 of 
      (i',0) -> index l i' 
      (i',1) -> index r i' 

memoNat = memo1 id id

如下，以memoize的用一個整數ARG的功能（如斐波那契）使用它

只有非負參數的值纔會被緩存。

要也爲負參數緩存值，使用memoInt，定義如下：

memoInt = memo1 arg_to_index index_to_arg 
    where arg_to_index n 
     | n < 0  = -2*n 
     | otherwise = 2*n + 1 
     index_to_arg i = case i `divMod` 2 of 
      (n,0) -> -n 
      (n,1) -> n

要高速緩存的值對於具有兩個整數參數使用memoIntInt，函數定義如下：

memoIntInt f = memoInt (\n -> memoInt (f n))

來源

2014-01-26 01:20:08

7

作爲在Edward Kmett的回答中指出，爲了加快速度，您需要緩存昂貴的計算並能夠快速訪問它們。

爲了保持函數非單調性，使用合適的方式來索引它（如前面的帖子所示），構建一棵無限延遲樹的解決方案實現了該目標。如果放棄函數的非單調性質，可以將Haskell中的標準關聯容器與「狀態」單元（如State或ST）結合使用。

雖然主要缺點是您獲得了非一元函數，您不必再自己索引結構，並且可以使用關聯容器的標準實現。

要做到這一點，首先需要重新寫你函數接受任何類型的單子：

fm :: (Integral a, Monad m) => (a -> m a) -> a -> m a 
fm _ 0 = return 0 
fm recf n = do 
    recs <- mapM recf $ div n <$> [2, 3, 4] 
    return $ max n (sum recs)

對於你的測試，你還可以定義不使用Data.Function沒有記憶化的功能。修復，雖然這是一個有點冗長：

noMemoF :: (Integral n) => n -> n 
noMemoF = runIdentity . fix fm

然後，您可以使用狀態單子結合Data.Map加快速度：

import qualified Data.Map.Strict as MS 

withMemoStMap :: (Integral n) => n -> n 
withMemoStMap n = evalState (fm recF n) MS.empty 
    where 
     recF i = do 
     v <- MS.lookup i <$> get 
     case v of 
      Just v' -> return v' 
      Nothing -> do 
       v' <- fm recF i 
       modify $ MS.insert i v' 
       return v'

有了細微的變化，你可以用Data.HashMap代碼適應工作，而不是：

import qualified Data.HashMap.Strict as HMS 

withMemoStHMap :: (Integral n, Hashable n) => n -> n 
withMemoStHMap n = evalState (fm recF n) HMS.empty 
    where 
     recF i = do 
     v <- HMS.lookup i <$> get 
     case v of 
      Just v' -> return v' 
      Nothing -> do 
       v' <- fm recF i 
       modify $ HMS.insert i v' 
       return v'

而是持久數據結構，你也可以嘗試可變數據結構（如Data.HashTable）結合在ST單子：

import qualified Data.HashTable.ST.Linear as MHM 

withMemoMutMap :: (Integral n, Hashable n) => n -> n 
withMemoMutMap n = runST $ 
    do ht <- MHM.new 
     recF ht n 
    where 
     recF ht i = do 
     k <- MHM.lookup ht i 
     case k of 
      Just k' -> return k' 
      Nothing -> do 
       k' <- fm (recF ht) i 
       MHM.insert ht i k' 
       return k'

相比沒有任何記憶化的實現，這些實現可以讓你，鉅額的投入，來獲得，而不必等待幾秒鐘在微秒的成績。

以Criterion爲基準，我可以觀察到Data.HashMap的實現實際上比Data.Map和Data.HashTable的執行稍微好一點（大約20％），其定時非常相似。

我發現基準的結果有點令人驚訝。我最初的感覺是HashTable會超越HashMap的實現，因爲它是可變的。這個最後的實現中可能會隱藏一些性能缺陷。

來源

2015-05-16 08:48:22 Quentin

+1

GHC在圍繞不可變結構進行優化方面做得非常好。來自C的直覺並不總是平息。 – 2015-05-24 16:47:18

2

幾年後，我看着這個和實現有和一個簡單的方法以線性時間來memoize的這種使用zipWith一個輔助功能：

dilate :: Int -> [x] -> [x] 
dilate n xs = replicate n =<< xs

dilate有方便的屬性，dilate n xs !! i == xs !! div i n。

所以，假如我們給出F（0），這簡化了計算，以

fs = f0 : zipWith max [1..] (tail $ fs#/2 .+. fs#/3 .+. fs#/4) 
    where (.+.) = zipWith (+) 
     infixl 6 .+. 
     (#/) = flip dilate 
     infixl 7 #/

找了很多像我們原來的問題描述，並給出一個線性解決方案（sum $ take n fs將採取爲O（n））。

來源

2015-10-11 02:46:54 rampion

2

一個沒有索引的解決方案，不是基於Edward KMETT的。

我分解出到公共父共同子樹（f(n/4)是f(n/2)和f(n/4)之間共享，並且f(n/6)被f(2)和f(3)之間共享）。通過將它們保存爲父變量中的單個變量，子樹的計算只需執行一次。

data Tree a = 
    Node {datum :: a, child2 :: Tree a, child3 :: Tree a} 

f :: Int -> Int 
f n = datum root 
    where root = f' n Nothing Nothing 


-- Pass in the arg 
    -- and this node's lifted children (if any). 
f' :: Integral a => a -> Maybe (Tree a) -> Maybe (Tree a)-> a 
f' 0 _ _ = leaf 
    where leaf = Node 0 leaf leaf 
f' n m2 m3 = Node d c2 c3 
    where 
    d = if n < 12 then n 
      else max n (d2 + d3 + d4) 
    [n2,n3,n4,n6] = map (n `div`) [2,3,4,6] 
    [d2,d3,d4,d6] = map datum [c2,c3,c4,c6] 
    c2 = case m2 of -- Check for a passed-in subtree before recursing. 
     Just c2' -> c2' 
     Nothing -> f' n2 Nothing (Just c6) 
    c3 = case m3 of 
     Just c3' -> c3' 
     Nothing -> f' n3 (Just c6) Nothing 
    c4 = child2 c2 
    c6 = f' n6 Nothing Nothing 

    main = 
     print (f 123801) 
     -- Should print 248604.

代碼不容易擴展到一般的記憶化功能（至少，我不知道該怎麼做），你真的必須想出如何子問題重疊，但戰略應該適用於一般的多個非整數參數。（我認爲它適用於兩個字符串參數）。

備註在每次計算後都會被丟棄。（同樣，我在考慮兩個字符串參數。）

我不知道這是否比其他答案更有效。每個查詢在技術上只有一到兩個步驟（「看看你的孩子或你的孩子的孩子」），但可能會有很多額外的內存使用。

~~編輯：此解決方案尚未正確。分享不完整。~~

編輯：它應該現在正確地共享子女，但我意識到這個問題有很多不平凡的共享：n/2/2/2和n/3/3可能是相同的。這個問題不適合我的策略。

來源

2016-04-13 16:14:55 leewz

Haskell中的記憶？

回答

相關問題