Parsing a tab-delimited string

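I'm having some trouble figuring out how to split a tab-delimited string into chunks of data. As an example, suppose the text file I'm reading from looks like this: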

a1  b1  c1  d1  e1 
a2  b2  c2  d2  e2

and I read the first line of my file and get the string

"a1  b1  c1  d1  e1"

I'd like to split this into 5 variables a, b, c, d and e, or create a list of strings (a b c d e). Any ideas?

Thanks.

Source

2012-05-10 Lpaulson

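Please show us what you have written so far. –

I haven't written anything yet. My original code was written in Perl and I need to convert it to Lisp, and I really don't know what the best approach is. I'm starting to think it might be easier, instead of reading a txt file, to just put the data into the program and then change it as needed. – Lpaulson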

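Try concatenating parentheses onto the front and back of the input string, then use read-from-string (I assume you are using Common Lisp, since you tagged the question clisp).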

(defparameter *str* "a1  b1  c1  d1  e1")
;; The reader returns a list of symbols, not strings: (A1 B1 C1 D1 E1)
(print (read-from-string (concatenate 'string "(" *str* ")")))
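
Two caveats with the reader-based approach are worth spelling out: the tokens come back as symbols (upcased under the default readtable), and the reader will evaluate `#.` forms unless `*read-eval*` is bound to NIL. The following is only a sketch under those assumptions; `read-fields-as-strings` is a made-up name, not something from the answer above.

```lisp
(defun read-fields-as-strings (line)
  "Hypothetical helper: reads LINE with the Lisp reader and returns
the tokens it finds as a list of strings."
  (let ((*read-eval* nil))   ; refuse #. evaluation in the input
    ;; Wrap the line in parentheses so the reader returns one list,
    ;; then print each object (symbol, number, ...) back to a string.
    (mapcar #'princ-to-string
            (read-from-string (concatenate 'string "(" line ")")))))

(print (read-fields-as-strings "a1 b1  c1  d1  e1"))
```

Because the reader interprets its input, fields containing characters such as parentheses, quotes, or `#` will not survive this round trip, so it is probably best kept for simple, trusted data.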

Source

2012-05-10 00:26:48

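Here is another way to go about it (slightly more robust, perhaps). You could also easily modify it so that you can `setf' a character in the string once the callback has been called, but I didn't do that because it didn't seem like you need that ability. Also, in that latter case, I would rather use a macro.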

(defun mapc-words (function vector
                   &aux (whites '(#\Space #\Tab #\Newline #\Rubout)))
  "Iterates over string `vector' and calls the `function'
with the non-white characters collected so far.
The white characters are, by default: #\\Space, #\\Tab
#\\Newline and #\\Rubout.
`mapc-words' will short-circuit when `function' returns false."
  (do ((i 0 (1+ i))
       (start 0)   ; index where the current word begins
       (len 0))    ; number of non-white characters seen in the current word
      ((= i (1+ (length vector))))
    (if (or (= i (length vector)) (find (aref vector i) whites))
        (if (> len 0)
            (if (not (funcall function (subseq vector start i)))
                (return-from mapc-words)
                (setf len 0 start (1+ i)))
            (incf start))
        (incf len)))
  vector)

(mapc-words 
#'(lambda (word) 
    (not 
     (format t "word collected: ~s~&" word))) 
"a1  b1  c1  d1  e1 
a2  b2  c2  d2  e2") 

;; word collected: "a1" 
;; word collected: "b1" 
;; word collected: "c1" 
;; word collected: "d1" 
;; word collected: "e1" 
;; word collected: "a2" 
;; word collected: "b2" 
;; word collected: "c2" 
;; word collected: "d2" 
;; word collected: "e2"
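
Since the question asks for a list of strings rather than printed output, here is a rough sketch of a collecting variant. It is not the answer's code: `split-on-whitespace` and its optional `whites` parameter are names made up for this illustration, and it uses `position-if` instead of a character-by-character loop.

```lisp
(defun split-on-whitespace (string &optional (whites '(#\Space #\Tab #\Newline)))
  "Hypothetical helper: returns the whitespace-delimited words of STRING
as a list of fresh strings. Runs of whitespace count as one separator."
  (flet ((whitep (c) (member c whites)))
    (let ((words '())
          ;; START is the index of the next word, or NIL when none is left.
          (start (position-if-not #'whitep string)))
      (loop while start
            do (let ((end (or (position-if #'whitep string :start start)
                              (length string))))
                 (push (subseq string start end) words)
                 (setf start (position-if-not #'whitep string :start end))))
      (nreverse words))))

(print (split-on-whitespace (format nil "a1~Cb1~Cc1~%a2~Cb2" #\Tab #\Tab #\Tab)))
```

Note that this treats consecutive separators as one, so it cannot represent empty fields; whether that is acceptable depends on the data.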

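Here is an example macro you can use if you want to modify the string as you read it, but I'm not entirely happy with it, so maybe someone will come up with a better variant.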

(defmacro with-words-in-string
    ((word start end
      &aux (whites '(#\Space #\Tab #\Newline #\Rubout))
           (str (gensym "STRING"))   ; evaluate S only once
           (len (gensym "LEN")))     ; keep the counter out of BODY's scope
     s
     &body body)
  `(do ((,str ,s)
        (,end 0 (1+ ,end))
        (,start 0)
        (,word)
        (,len 0))
       ((= ,end (1+ (length ,str))))
     (if (or (= ,end (length ,str)) (find (aref ,str ,end) ',whites))
         (if (> ,len 0)
             (progn
               (setf ,word (subseq ,str ,start ,end))
               ,@body
               (setf ,len 0 ,start (1+ ,end)))
             (incf ,start))
         (incf ,len))))

(with-words-in-string (word start end) 
    "a1  b1  c1  d1  e1 
a2  b2  c2  d2  e2" 
(format t "word: ~s, start: ~s, end: ~s~&" word start end))

Source

2012-05-10 07:29:57

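I'm not happy with the design of MAP-WORDS. I would not put an exit feature in. The CL library does not use one, and exiting from a mapping can be done by the supplied function itself (using any CL mechanism: return, throw, condition signalling, etc.). CL also doesn't use the name 'callback'. That implies something slightly different (event-driven or visitor-oriented). The CL standard uses 'function'. 'x' should be 'vector'. 'map-words' should be 'mapc-words' and return the vector. –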
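
To illustrate the point about leaving the exit to the supplied function, here is a small sketch using the standard MAPC, which has no exit feature of its own. The function name `first-word-starting-with` and the sample data are invented for this example.

```lisp
(defun first-word-starting-with (char words)
  "Hypothetical example: returns the first string in WORDS whose first
character is CHAR, or NIL if there is none."
  (block found
    (mapc (lambda (word)
            (when (and (plusp (length word))
                       (char= (char word 0) char))
              ;; Non-local exit: the mapped function leaves MAPC itself.
              (return-from found word)))
          words)
    nil))

(print (first-word-starting-with #\c '("a1" "b1" "c1" "d1" "c2")))
```

The same pattern would work with a word-mapping function that simply calls its argument for every word and ignores the return value.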

(remove-if #'consp some-list :count 10), REMOVE-IF also has a KEY parameter. –
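
For reference, a short sketch of the two REMOVE-IF keyword arguments mentioned in this comment. The sample lists `*mixed*` and `*pairs*` are made up for illustration.

```lisp
;; :count limits how many matching elements are removed,
;; counting from the start of the sequence by default.
(defparameter *mixed* (list '(1) 2 '(3) 4 '(5)))
(print (remove-if #'consp *mixed* :count 2))

;; :key selects the part of each element the predicate is applied to;
;; here the predicate sees the CAR of each pair.
(defparameter *pairs* (list (cons 1 "a") (cons 2 "b") (cons 3 "c")))
(print (remove-if #'evenp *pairs* :key #'car))
```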

Assuming the fields are separated by tabs (not spaces), this will create a list:

(defun tokenize-tabbed-line (line)
  (loop
    for start = 0 then (+ space 1)
    for space = (position #\Tab line :start start)
    for token = (subseq line start space)
    collect token until (not space)))

This gives the following result (the fields in the input string are separated by tab characters):

CL-USER> (tokenize-tabbed-line "a1 b1 c1 d1 e1") 
("a1" "b1" "c1" "d1" "e1")
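
The question also asks about getting five separate variables. One possible way, sketched here under the assumption that the line always has exactly five tab-separated fields, is DESTRUCTURING-BIND. The names `split-by-tab`, `first-line-fields` and `*sample-line*` are hypothetical, and the string stream stands in for the real file.

```lisp
(defun split-by-tab (line)
  "Hypothetical helper: splits LINE on #\\Tab and returns a list of strings.
Empty fields are kept as empty strings."
  (loop for start = 0 then (1+ end)
        for end = (position #\Tab line :start start)
        collect (subseq line start end)
        while end))

(defun first-line-fields (stream)
  "Reads one line from STREAM and returns its tab-separated fields."
  (split-by-tab (read-line stream)))

;; Build a sample line containing real tab characters.
(defparameter *sample-line*
  (format nil "a1~Cb1~Cc1~Cd1~Ce1" #\Tab #\Tab #\Tab #\Tab))

;; DESTRUCTURING-BIND signals an error if the field count is not five.
(with-input-from-string (in *sample-line*)
  (destructuring-bind (a b c d e) (first-line-fields in)
    (format t "a=~a b=~a c=~a d=~a e=~a~%" a b c d e)))
```

With a real file, `with-open-file` would take the place of `with-input-from-string`.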

Source

2014-10-22 19:57:00 beoliver
