2016-01-24

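How to parse the result from HttpClient in enlive

The tutorial at https://github.com/swannodette/enlive-tutorial/blob/master/src/tutorial/scrape1.clj shows how to parse a page from a URL, but I need to go through a SOCKS5 proxy. I can't figure out how to use a proxy inside enlive, though I do know how to use one with the HTTP client (clj-http). How do I parse the result that the HTTP client returns? I have the following code, but the last line returns an empty result: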

(:require [clojure.set :as set]
          [clj-http.client :as client]
          [clj-http.conn-mgr :as conn-mgr]
          [clj-time.core :as time]
          [jsoup.soup :as soup]
          [clj-time.coerce :as tc]
          [net.cgrand.enlive-html :as html])

(def a (client/get "https://news.ycombinator.com/"
                   {:connection-manager (conn-mgr/make-socks-proxied-conn-manager "127.0.0.1" 9150)
                    :socket-timeout 10000
                    :conn-timeout 10000
                    :client-params {"http.useragent" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.20 (KHTML, like Gecko) Chrome/11.0.672.2 Safari/534.20"}}))
(def b (html/html-resource a))
(html/select b [:td.title :a])

Answer

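enlive's html-resource fn performs the fetch from a URL itself and then converts the result into a parseable data structure. It seems that when you pass it an already completed request, it simply returns that request (the clj-http response map) instead of throwing an error, so the selector has nothing to match.

Either way, the function you want is html-snippet, and you will want to pass it the body of the response. Like so: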

;; It does not matter whether you are using a connection manager,
;; as long as the call returns a result with a :body
(def req (client/get "https://news.ycombinator.com/"))

;; The body is the raw HTML string; html-snippet turns it into enlive nodes
(def body (:body req))
(def nodes (html/html-snippet body))
(html/select nodes [:td.title :a])

;; Or you can put it all together like this

(-> req
    :body
    html/html-snippet
    (html/select [:td.title :a]))
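
To tie this back to the SOCKS proxy in the question, here is a minimal sketch that keeps the fetch and the parse separate. The namespace name and the helper names fetch-body and title-links are made up for illustration; the proxy host and port are the ones from the question, and the sketch assumes clj-http and enlive are on the classpath.

```clojure
(ns scrape.proxy-example ; hypothetical namespace name
  (:require [clj-http.client :as client]
            [clj-http.conn-mgr :as conn-mgr]
            [net.cgrand.enlive-html :as html]))

(defn fetch-body
  "Fetch url through the SOCKS proxy from the question (127.0.0.1:9150)
  and return the response body as a string. Hypothetical helper."
  [url]
  (:body (client/get url
                     {:connection-manager
                      (conn-mgr/make-socks-proxied-conn-manager "127.0.0.1" 9150)
                      :socket-timeout 10000
                      :conn-timeout 10000})))

(defn title-links
  "Parse an HTML string with html-snippet and return a map of link text
  and href for every <a> inside a td.title cell. Hypothetical helper."
  [html-string]
  (for [a (html/select (html/html-snippet html-string) [:td.title :a])]
    {:text (html/text a)
     :href (get-in a [:attrs :href])}))

;; Usage, assuming a SOCKS proxy is actually listening on port 9150:
;; (title-links (fetch-body "https://news.ycombinator.com/"))
```

Because title-links only takes a string, it can be tried at the REPL on a literal HTML fragment without any network access or proxy running.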