2014-01-17

Getting string indices from re-seq results

I am currently using re-seq to find comments in a piece of Java source code:

(re-seq #"(?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|(?://.*)" code) 

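How can I get the index (or indices) of each match within the original string code? That is, I want the start (and end) position of every match in code.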

Answer


You can write a modified version of re-seq that uses the necessary Java interop:

(defn re-seq-pos [pattern string]
  (let [m (re-matcher pattern string)]
    ((fn step []
       (when (.find m)
         (cons {:start (.start m) :end (.end m) :group (.group m)}
               (lazy-seq (step))))))))

(re-seq-pos #"\w+" "foo bar baz")
;=> ({:start 0, :end 3, :group "foo"}
;    {:start 4, :end 7, :group "bar"}
;    {:start 8, :end 11, :group "baz"})
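To tie this back to the question, here is a minimal sketch that runs re-seq-pos with the comment pattern from the question and checks that each :start/:end pair really indexes into the original string. The sample code string and the name comment-pattern are made up for illustration; re-seq-pos is repeated only so the snippet runs on its own.

```clojure
;; Same definition as in the answer, repeated so this snippet is self-contained.
(defn re-seq-pos [pattern string]
  (let [m (re-matcher pattern string)]
    ((fn step []
       (when (.find m)
         (cons {:start (.start m) :end (.end m) :group (.group m)}
               (lazy-seq (step))))))))

;; Hypothetical sample input, not taken from the question.
(def code "int x = 1; // set x\n/* block */ int y;")

;; The pattern from the question: block comments or line comments.
(def comment-pattern #"(?:/\*(?:[^*]|(?:\*+[^*/]))*\*+/)|(?://.*)")

;; :end is exclusive, so (subs code start end) recovers the matched text.
(doseq [{:keys [start end group]} (re-seq-pos comment-pattern code)]
  (println start end (pr-str group) (= group (subs code start end))))
```

Because :end follows java.util.regex.Matcher and is exclusive, the pair can be passed straight to subs.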

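This is a partial solution for illustrative purposes. Unlike 're-seq', it does not descend into capture groups. For that you would need a version of 're-groups' modified in the same way as above, rather than the stock one that a copy of 're-seq' would call.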
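One possible way to cover capture groups as well, sketched under assumptions: the name re-seq-pos-groups and the result shape (one vector per match, index 0 for the whole match and index n for group n) are my own choices, not part of clojure.core or of the answer. It relies only on the standard java.util.regex.Matcher methods groupCount, start(int), end(int) and group(int).

```clojure
;; Hypothetical helper: like re-seq-pos, but reports every capture group.
(defn re-seq-pos-groups [pattern string]
  (let [m (re-matcher pattern string)]
    ((fn step []
       (when (.find m)
         ;; mapv is eager, so the positions are read before the next .find
         ;; mutates the matcher.
         (cons (mapv (fn [i]
                       {:start (.start m (int i))
                        :end   (.end m (int i))
                        :group (.group m (int i))})
                     (range (inc (.groupCount m))))
               (lazy-seq (step))))))))

(re-seq-pos-groups #"(\w)(\w+)" "foo bar")
;=> ([{:start 0, :end 3, :group "foo"}
;     {:start 0, :end 1, :group "f"}
;     {:start 1, :end 3, :group "oo"}]
;    [{:start 4, :end 7, :group "bar"}
;     {:start 4, :end 5, :group "b"}
;     {:start 5, :end 7, :group "ar"}])
```

A group that did not take part in a match comes back with :start -1, :end -1 and :group nil, which is how Matcher reports it.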