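How to delete duplicate lines in Emacs
I have a text with many lines. My question is how to delete the duplicate lines in Emacs, using a command in Emacs or an elisp package, without external utilities.
For example: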
this is line a
this is line b
this is line a
Removing the 3rd line (same as the 1st line) should give:
this is line a
this is line b
Put this code in your .emacs:
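(defun uniq-lines (beg end)
  "Unique lines in region.
Called from a program, there are two arguments:
BEG and END (region to deduplicate)."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (while (not (eobp))
        ;; Kill the current line (with its newline) and put it straight
        ;; back, so its text is now at the head of the kill ring.
        (kill-line 1)
        (yank)
        (let ((next-line (point))
              ;; Compare case-sensitively, so "Foo" and "foo" stay distinct.
              (case-fold-search nil))
          ;; Delete every later copy of that line.
          (while
              (re-search-forward
               (format "^%s" (regexp-quote (car kill-ring))) nil t)
            (replace-match "" nil nil))
          (goto-char next-line))))))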
Usage:
M-x uniq-lines
(require 'cl-lib)                       ; for `cl-position-if' and `cl-incf'

(defun unique-lines (start end)
  "This will remove all duplicating lines in the region.
Note empty lines count as duplicates of the empty line!  All empty lines are
removed sans the first one, which may be confusing!"
  (interactive "r")
  (let ((hash (make-hash-table :test #'equal)) (i -1))
    (dolist (s (split-string (buffer-substring-no-properties start end) "$" t)
               ;; Result form: rebuild the region from the hash table,
               ;; placing each line at the index where it first appeared.
               (let ((lines (make-vector (1+ i) nil)))
                 (maphash
                  (lambda (key value) (setf (aref lines value) key))
                  hash)
                 (kill-region start end)
                 (insert (mapconcat #'identity lines "\n"))))
      (setq s                           ; because Emacs can't properly
                                        ; split lines :/
            (substring
             s (cl-position-if
                (lambda (x)
                  (not (or (char-equal ?\n x) (char-equal ?\r x)))) s)))
      (unless (gethash s hash)
        (setf (gethash s hash) (cl-incf i))))))
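An alternative:
The result consistently uses \n (UNIX-style) line endings. Depending on your situation, this may be a bonus or a disadvantage.
You could make it a little better (faster) by re-implementing split-string so that it accepts characters instead of a regular expression.
A somewhat longer, but perhaps more efficient variant: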
(require 'cl-lib)                       ; for `cl-incf'

(defun split-string-chars (string chars &optional omit-nulls)
  "Split STRING at every character found in the list CHARS.
If OMIT-NULLS is non-nil, empty substrings are left out of the result."
  (let ((separators (make-hash-table))
        (last 0)
        current
        result)
    (dolist (c chars) (setf (gethash c separators) t))
    (dotimes (i (length string)
                ;; Result form: I equals the string length here, so this
                ;; collects the piece after the last separator.
                (progn
                  (when (or (not omit-nulls) (< last i))
                    (push (substring string last i) result))
                  (reverse result)))
      (setq current (aref string i))
      (when (gethash current separators)
        (when (or (not omit-nulls) (/= last i))
          (push (substring string last i) result))
        (setq last (1+ i))))))

(defun unique-lines (start end)
  "This will remove all duplicating lines in the region.
Note empty lines count as duplicates of the empty line!  All empty lines are
removed sans the first one, which may be confusing!"
  (interactive "r")
  (let ((hash (make-hash-table :test #'equal)) (i -1))
    (dolist (s (split-string-chars
                (buffer-substring-no-properties start end) '(?\n) t)
               ;; Result form: write the unique lines back in first-seen order.
               (let ((lines (make-vector (1+ i) nil)))
                 (maphash
                  (lambda (key value) (setf (aref lines value) key))
                  hash)
                 (kill-region start end)
                 (insert (mapconcat #'identity lines "\n"))))
      (unless (gethash s hash)
        (setf (gethash s hash) (cl-incf i))))))
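Lines in an Emacs buffer are always separated by \n (regardless of which separator is used in the corresponding file). \r is only used for the old `selective-display', which was made obsolete many years ago by overlays and the `invisible' text property. – Stefan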
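Following the comment above, the region can simply be split on a literal "\n". Below is a minimal sketch of that idea, not the answer author's code; the name my-unique-lines-simple is hypothetical. It relies on the built-in delete-dups, which keeps the first of several equal elements, and on string-suffix-p, which needs Emacs 24.4 or newer.

```elisp
(defun my-unique-lines-simple (start end)
  "Remove duplicate lines between START and END, keeping first occurrences.
Hypothetical helper for illustration; blank lines are treated like any
other line, so only the first blank line survives."
  (interactive "r")
  (let* ((text (buffer-substring-no-properties start end))
         ;; Remember whether the region ended with a newline so that it
         ;; can be restored after the lines are joined again.
         (trailing-newline (string-suffix-p "\n" text))
         (body (if trailing-newline (substring text 0 -1) text))
         ;; `delete-dups' keeps the first copy of each element, in order.
         (lines (delete-dups (split-string body "\n"))))
    (save-excursion
      (goto-char start)
      (delete-region start end)
      (insert (mapconcat #'identity lines "\n")
              (if trailing-newline "\n" "")))))
```

Unlike the versions above, this one uses delete-region, so the kill ring is left untouched.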
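If you have Emacs 24.4 or newer, the simplest way is to use the new delete-duplicate-lines function. Note that it keeps the first occurrence of each line and preserves the relative order of the remaining lines.
For example, if your input is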
test
dup
dup
one
two
one
three
one
test
five
M-x delete-duplicate-lines
would turn it into
test
dup
one
two
three
five
You may opt to search backwards by prefixing the command with the universal argument (C-u). The result would then be
dup
two
three
one
test
five
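The same command can also be called from Lisp, where the optional REVERSE argument plays the role of the C-u prefix. The sketch below assumes Emacs 24.4 or newer; my-dedup-string is a hypothetical helper written for this illustration, and the short sample input is made up.

```elisp
(defun my-dedup-string (text &optional reverse)
  "Return TEXT with duplicate lines removed by `delete-duplicate-lines'.
MY-DEDUP-STRING is a hypothetical helper used only for illustration.
A non-nil REVERSE has the same effect as the C-u prefix."
  (with-temp-buffer
    (insert text)
    (delete-duplicate-lines (point-min) (point-max) reverse)
    (buffer-string)))

;; Forward search keeps the first "b".
(my-dedup-string "a\nb\nc\nb\n")        ; => "a\nb\nc\n"
;; Backward search keeps the last "b".
(my-dedup-string "a\nb\nc\nb\n" t)      ; => "a\nc\nb\n"
```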
Credit goes to emacsredux.com.
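Other roundabout options, which do not give exactly the same result, are available via Eshell:
sort -u; does not maintain the relative order of the originals
uniq; worse, it needs its input to be sorted
'sort -u' may not be a stable sort, but 'sort -u -s' is. – Squidly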
Yes, indeed. Fixed now! Running it from eshell seems a less clean solution than using a built-in function, though. – legends2k
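@Squid I think I didn't verify your last comment properly. Try feeding the input data to 'sort -u' and 'sort -us'; you will get results that differ from delete-duplicate-lines. More importantly, we are not talking about stable sorting here, which means maintaining the relative order of equal elements. Since we remove the duplicates, the equal elements are lost anyway. 'delete-duplicate-lines' keeps the originals in order, without the duplicates, so the same result cannot be obtained with 'sort'. – legends2k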
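To illustrate the ordering point without leaving Emacs, here is a small sketch that compares the two behaviours on a made-up input. Both helper names are hypothetical. The sorted variant only approximates 'sort -u' (it uses sort-lines and then the ADJACENT mode of delete-duplicate-lines, which acts like uniq), so locale-dependent collation may differ from the external tool.

```elisp
(defun my-dedup-keep-order (text)
  "Return TEXT without duplicate lines, keeping the original order.
Hypothetical helper for illustration."
  (with-temp-buffer
    (insert text)
    (delete-duplicate-lines (point-min) (point-max))
    (buffer-string)))

(defun my-dedup-sorted (text)
  "Return TEXT sorted, then stripped of adjacent duplicate lines.
Hypothetical helper that roughly mimics sorting followed by uniq."
  (with-temp-buffer
    (insert text)
    (sort-lines nil (point-min) (point-max))
    ;; ADJACENT non-nil: only neighbouring duplicates are removed.
    (delete-duplicate-lines (point-min) (point-max) nil t)
    (buffer-string)))

(my-dedup-keep-order "b\na\nb\n")       ; => "b\na\n"
(my-dedup-sorted "b\na\nb\n")           ; => "a\nb\n"
```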
Instead of using the kill-ring, you can keep the contents in a 'let'-bound variable. –
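A minimal sketch of that suggestion, assuming a line-by-line comparison in place of the regexp search; the name my-uniq-lines-no-kill-ring is hypothetical and this is not the answer author's code. The current line is held in a let binding and later copies are removed with delete-region, so the kill ring is never touched.

```elisp
(defun my-uniq-lines-no-kill-ring (beg end)
  "Remove duplicate lines in the region between BEG and END.
Hypothetical variant of `uniq-lines' that keeps the current line in a
`let' binding instead of going through the kill ring."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (while (not (eobp))
        (let ((line (buffer-substring-no-properties
                     (line-beginning-position) (line-end-position))))
          (forward-line 1)
          ;; Scan the remaining lines and delete each one equal to LINE.
          (save-excursion
            (while (not (eobp))
              (if (string= line (buffer-substring-no-properties
                                 (line-beginning-position)
                                 (line-end-position)))
                  (delete-region (line-beginning-position)
                                 (progn (forward-line 1) (point)))
                (forward-line 1)))))))))
```

Like the original, this compares every line against all later lines, so it is quadratic in the number of lines; for large regions the hash-table approach or delete-duplicate-lines is likely the better choice.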
Thanks for your advice! – ymn
Thanks for your help! – toolchainX