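Frequency dictionary in Clojure
I want to write my own naive Bayes classifier. I have a file like this:
(This is a database of spam and ham messages: the first word on each line is either spam or ham, and the text up to the end of the line is the message. Size: 0.5 MB, from http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/)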
ham Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
ham Ok lar... Joking wif u oni...
spam Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's
ham U dun say so early hor... U c already then say...
ham Nah I don't think he goes to usf, he lives around here though
spam FreeMsg Hey there darling it's been 3 week's now and no word back! I'd like some fun you up for it still? Tb ok! XxX std chgs to send, £1.50 to rcv
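and I want to build a hash map like this: {"spam" {"go" 1, "until" 100, ...}, "ham" {...}}, where each value is a map of word frequencies (kept separately for ham and spam).
I know how to do this in Python or C++. I also did it in Clojure, but my solution fails with a StackOverflowError on large data.
My solution: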
(defn read_data_from_file [fname]
  (map #(split % #"\s")
       (map lower-case
            (with-open [rdr (reader fname)]
              (doall (line-seq rdr))))))

(defn do-to-map [amap keyseq f]
  (reduce #(assoc %1 %2 (f (%1 %2))) amap keyseq))

(defn dicts_from_data [raw_data]
  (let [data (group-by #(first %) raw_data)]
    (do-to-map
      data (keys data)
      (fn [x] (frequencies (reduce concat (map #(rest %) x)))))))
I tried to find where it fails, and wrote this:
(def raw_data (read_data_from_file (first args)))
(def d (group-by #(first %) raw_data))
(def f (map frequencies raw_data))
(def d1 (reduce concat (d "spam")))
(println (reduce concat (d "ham")))
Error:
Exception in thread "main" java.lang.RuntimeException: java.lang.StackOverflowError
at clojure.lang.Util.runtimeException(Util.java:165)
at clojure.lang.Compiler.eval(Compiler.java:6476)
at clojure.lang.Compiler.eval(Compiler.java:6455)
at clojure.lang.Compiler.eval(Compiler.java:6431)
at clojure.core$eval.invoke(core.clj:2795)
at clojure.main$eval_opt.invoke(main.clj:296)
at clojure.main$initialize.invoke(main.clj:315)
.....
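Can anyone help me do this in a better or more efficient way? P.S. Sorry for any mistakes in my writing; English is not my native language.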
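A likely cause of the overflow is `(reduce concat ...)`: every `concat` call wraps the previous lazy sequence in another one, so with thousands of messages the result is a very deep chain of nested lazy seqs, and realizing it (for example with `println` or `frequencies`) can exhaust the stack. The following is a minimal sketch that avoids this by updating the counts one word at a time. The names `add-line` and `dicts-from-file` are hypothetical, and the sketch assumes every line starts with the label followed by whitespace-separated words:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn add-line
  "Adds the words of one line (\"label word word ...\") to the nested map acc.
   Hypothetical helper: the first token is treated as the label."
  [acc line]
  (let [[label & words] (str/split (str/lower-case line) #"\s+")]
    ;; Increment the counter at [label word]; (fnil inc 0) starts missing keys at 0.
    (reduce (fn [m w] (update-in m [label w] (fnil inc 0)))
            acc
            words)))

(defn dicts-from-file
  "Returns a map of label -> {word count} for the file fname.
   The reduce is eager, so all lines are consumed before the reader closes."
  [fname]
  (with-open [rdr (io/reader fname)]
    (reduce add-line {} (line-seq rdr))))
```

Calling `(dicts-from-file fname)` on the data file should then yield a map of the shape `{"spam" {...} "ham" {...}}`. As in the original code, tokens are split on whitespace only, so punctuation stays attached to the words.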
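Is `(-> data f1 f2)` equivalent to `(f1 (f2 data))`?
Actually it is equivalent to `(f2 (f1 data))`; the forms are applied from left to right. For more information, see [this](http://blog.fogus.me/2009/09/04/understanding-the-clojure-macro/) post by Fogus. You can also find some examples of the threading macros `->` and `->>` [here](http://clojuredocs.org/clojure_core/clojure.core/-%3E) and [here](http://clojuredocs.org/clojure_core/clojure.core/-%3E%3E).
I got it wrong in my first comment. Thanks!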
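The left-to-right behaviour discussed in the comments above can be checked with a small sketch that uses only `clojure.core` functions. The var names and numbers are arbitrary illustration values:

```clojure
;; -> threads the value in as the FIRST argument of each form, left to right.
(def thread-first (-> 5 inc (* 2)))                   ; same as (* (inc 5) 2), i.e. 12

;; ->> threads the value in as the LAST argument of each form.
(def thread-last (->> [1 2 3] (map inc) (reduce +)))  ; same as (reduce + (map inc [1 2 3])), i.e. 9

;; With a non-commutative function the argument position becomes visible.
(def first-result (-> 10 (- 3)))                      ; (- 10 3), i.e. 7
(def last-result (->> 10 (- 3)))                      ; (- 3 10), i.e. -7
```

`->>` tends to suit sequence pipelines, because functions such as `map`, `filter`, and `reduce` take the collection as their last argument.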