Partitioning partitions in Clojure?

Here are some values. Each is a sequence of ascending (or otherwise grouped) values.

(def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]] 
      [[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]] 
      [[1 :l] [3 :m]]])

I can partition each of them by value.

=> (map (partial partition-by first) input-vals) 
    ((([1 :a] [1 :b]) ([2 :c]) ([3 :d] [3 :e])) (([1 :f]) ([2 :g] [2 :h] [2 :i]) ([3 :j] [3 :k])) (([1 :l]) ([3 :m])))

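However, that gives me 3 sequences of partitions. I want one single sequence of partition groups.

What I want is to return a single lazy sequence of (possibly) lazy sequences, each of which is the concatenation of the matching partitions from every input. For example, I want to produce this: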

((([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m])))

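Note that not every value appears in every sequence (the third vector has no 2).

This is of course a simplification of my problem. The real data is a set of lazy streams coming from very large files, so nothing can be fully realized. But I think a solution to the problem above is a solution to my real problem.

Feel free to edit the title, I'm not quite sure how to phrase it.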

2014-01-21 Joe

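Do you know how much you changed the question? :p – Chiron

I changed the content, but not the application of the function I'm looking for. – Joe

Thanks for your patience, guys. I was trying to simplify the problem as much as possible by using simple values (and also made a typo at the REPL). What I'm trying to achieve hasn't changed, but Chiron's answer used 'identity', which meant I had to show that the values of the projection used for partitioning (in this case 'first') are shared between items, while the items themselves ('[1 :a]') are all distinct. – Joe

Try this horror: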

(defn partition-many-by [f comp-f s]
  (let [;; skip exhausted seqs, then order the rest by the key of their head item
        sorted-s  (sort-by (comp f first) comp-f (filter seq s))
        match-val (when (seq sorted-s) (f (ffirst sorted-s)))
        matches?  (fn [ss] (= match-val (f ss)))]
    (when (seq sorted-s)
      (cons
       ;; leading items from every seq that share the smallest key
       (apply concat (map #(take-while matches? %) sorted-s))
       ;; recurse lazily on what is left of each seq
       (lazy-seq
        (partition-many-by f comp-f
                           (map #(drop-while matches? %) sorted-s)))))))

It could probably also be improved to remove the double value check (take-while and drop-while).

Usage example:

(partition-many-by identity compare [[1 1 1 1 2 2 3 3 3 3] [1 1 2 2 2 2 3] [3]])

=> ((1 1 1 1 1 1) (2 2 2 2 2 2) (3 3 3 3 3 3))

2014-01-21 17:41:44

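Thanks very much Karl Jonathan Ward. – Joe

This doesn't _quite_ work, e.g. '(partition-many-by identity [[0 2 4 6 8 10] [0 3 6 9 12] [0 5 10 15]])' => '((0 0 0) (2) (4) (6) (8) (10) (3) (6) (9) (12) (5) (10) (15))', whereas it should put the two 6s and the two 10s together. –

Right. But I guess it depends on the exact need: is grouping or isolation more important? I like your idea below of a lazy-merge-by, but that does require the partitioned elements to be orderable as well as partitionable. For example, what about this case: –

I'm not sure I follow, but you can flatten the resulting sequence like this: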

(flatten (partition-by identity (first input-vals)))

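clojure.core/flatten
([x])
Takes any nested combination of sequential things (lists, vectors,
etc.) and returns their contents as a single, flat sequence.
(flatten nil) returns an empty sequence.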

You can use the 'realized?' function to check whether a lazy sequence has been realized yet.
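A minimal sketch of how 'realized?' behaves, assuming the first vector of the question's input. The names 'parts', 'realized-before', 'first-group' and 'realized-after' are made up for illustration only.

```clojure
;; `parts` is a hypothetical name; the data is the first vector from the question.
(def parts
  (partition-by first [[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]]))

;; partition-by returns a lazy seq, so nothing has been computed at this point.
(def realized-before (realized? parts))

;; Asking for the first group forces the head of the lazy seq.
(def first-group (first parts))

(def realized-after (realized? parts))

(println realized-before first-group realized-after)
```

'realized?' only accepts pending values (lazy seqs, delays, futures, promises); calling it on a plain vector or list throws an exception, so it reports whether a lazy seq has been forced rather than whether an arbitrary collection is lazy.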

2014-01-21 16:30:36 Chiron

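This is almost what I want. I'll clarify my question (I think your answer still applies). But is it lazy? – Joe

@Joe I edited my post – Chiron

flatten will destroy all of the internal structure of the input, leaving just one flat sequence of numbers and keywords. – noisesmith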

user> (def desired-result '((([1 :a] [1 :b] [1 :f] [1 :l]) 
          ([2 :c] [2 :g] [2 :h] [2 :i]) 
          ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m])))) 
#'user/desired-result 

user> (def input-vals [[[1 :a] [1 :b] [2 :c] [3 :d] [3 :e]] 
         [[1 :f] [2 :g] [2 :h] [2 :i] [3 :j] [3 :k]] 
         [[1 :l] [3 :m]]]) 
#'user/input-vals 

user> (= desired-result (vector (vals (group-by first (apply concat input-vals))))) 
true

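I changed input-vals slightly to correct what I think was a typo. If it was not a mistake, I can update my code to accommodate the looser structure.

Using the ->> (thread-last) macro, we can write the equivalent code in a more readable form: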

user> (= desired-result 
     (->> input-vals 
      (apply concat) 
      (group-by first) 
      vals 
      vector)) 
true

2014-01-21 17:11:33 noisesmith

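Thanks, that looks promising. I'll give it a try. – Joe

Note: you can't expect all items to be grouped correctly and also get laziness. If you think it through, those two wishes contradict each other (unless you have some a priori knowledge of the structure of the future input and build your code around that structure). – noisesmith

Unless I've misunderstood you, isn't that exactly what 'partition-by' does? As I said at the top, the inputs are sorted within their streams. – Joe

Let's make this fun and use infinite-length sequences as our input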
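The trade-off discussed in these comments can be sketched as follows. This is only an illustration under the assumption that the stream is already sorted by its grouping key; 'sorted-keys' and 'first-groups' are hypothetical names.

```clojure
;; Hypothetical infinite stream that is already sorted: 0 0 1 1 2 2 ...
(def sorted-keys (map #(quot % 2) (range)))

;; partition-by only needs to look ahead to the end of the current run,
;; so it can stay lazy on sorted input.
(def first-groups (take 3 (partition-by identity sorted-keys)))

(println first-groups)

;; group-by, by contrast, has to consume its whole input before returning,
;; so (group-by identity sorted-keys) would never finish on this stream.
```

The prior knowledge noisesmith mentions is exactly the sort order: once a run of equal keys ends, no later item can belong to it.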

(def twos (iterate #(+ 2 %) 0)) 
(def threes (iterate #(+ 3 %) 0)) 
(def fives (iterate #(+ 5 %) 0))

We need to merge them lazily. We take a comparator as an argument so that the merge can also be applied to other data types.

(defn lazy-merge-by
  ([compfn xs ys]
   (lazy-seq
    (cond
      (empty? xs) ys
      (empty? ys) xs
      :else (if (compfn (first xs) (first ys))
              (cons (first xs) (lazy-merge-by compfn (rest xs) ys))
              (cons (first ys) (lazy-merge-by compfn xs (rest ys)))))))
  ([compfn xs ys & more]
   (apply lazy-merge-by compfn (lazy-merge-by compfn xs ys) more)))

Test

(take 15 (lazy-merge-by < twos threes fives)) 
;=> (0 0 0 2 3 4 5 6 6 8 9 10 10 12 12)

We can (lazily) partition by value if needed

(take 10 (partition-by identity (lazy-merge-by < twos threes fives))) 
;=> ((0 0 0) (2) (3) (4) (5) (6 6) (8) (9) (10 10) (12 12))

Now, back to the sample input

(partition-by first (apply lazy-merge-by #(<= (first %) (first %2)) input-vals)) 
;=> (([1 :a] [1 :b] [1 :f] [1 :l]) ([2 :c] [2 :g] [2 :h] [2 :i]) ([3 :d] [3 :e] [3 :j] [3 :k] [3 :m]))

This is the desired result, minus one superfluous set of outer parentheses.
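To check that the same approach stays lazy on keyed items, here is a small sketch that combines the two ideas above. It restates lazy-merge-by so the block runs on its own; the streams 'evens' and 'triples', their tags, and the name 'grouped' are hypothetical and only stand in for the real sorted file streams.

```clojure
;; Restated from the answer above so this block is self-contained.
(defn lazy-merge-by
  ([compfn xs ys]
   (lazy-seq
    (cond
      (empty? xs) ys
      (empty? ys) xs
      :else (if (compfn (first xs) (first ys))
              (cons (first xs) (lazy-merge-by compfn (rest xs) ys))
              (cons (first ys) (lazy-merge-by compfn xs (rest ys)))))))
  ([compfn xs ys & more]
   (apply lazy-merge-by compfn (lazy-merge-by compfn xs ys) more)))

;; Hypothetical infinite streams of [key tag] pairs, each sorted by key.
(def evens   (map #(vector % :even)   (iterate #(+ 2 %) 0)))
(def triples (map #(vector % :triple) (iterate #(+ 3 %) 0)))

;; Merge on the key, then group runs of equal keys; both steps are lazy.
(def grouped
  (partition-by first
                (lazy-merge-by #(<= (first %1) (first %2)) evens triples)))

(println (take 4 grouped))
;=> (([0 :even] [0 :triple]) ([2 :even]) ([3 :triple]) ([4 :even]))
```

Using '<=' rather than '<' in the comparator means that on a tie the item from the earlier stream comes first, which keeps the order within each group stable.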

2014-01-21 18:38:10

Thanks very much, I'll give it a try. – Joe

(partition-by first (sort-by first (mapcat identity input-vals)))

2014-01-22 00:11:21 Hendekagon

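Thanks, but I don't think 'sort' is lazy. – Joe

Yeah, I guess I don't understand the question: if you want to group lazily, how can you be sure you have seen all the items of a given group when some of them might be in the rest of the sequence that you haven't realized yet? – Hendekagon

Because I know that the items in the input files are grouped in order. In this case they are in ascending date order, and I am partitioning by date. – Joe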
