Bash - difference between two lists

OK, I have two related lists in text files on my Linux system:

/tmp/oldList 
/tmp/newList

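I need to compare these lists to see which lines were added and which lines were removed. I then need to loop over those lines and perform an action on each one, depending on whether it was added or removed. How can I do this in bash?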

Source

2012-06-22 exvance

The same question was asked 4 days ago: http://stackoverflow.com/questions/11099894/comparing-2-unsorted-lists-in-linux-listing-the-unique-in-the-second-file/11101143#11101143

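Use the comm(1) command to compare the two files. Both files need to be sorted; if they are large you can sort them beforehand, or you can sort them inline using bash process substitution.

comm accepts any combination of the flags -1, -2 and -3, each indicating a column of output to suppress: lines unique to file 1, lines unique to file 2, or lines common to both.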
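
To make the column layout concrete, here is a small sketch using two throwaway sample files. The file names and contents are made up for illustration, and the files are written already sorted so that no sort step is needed.

```shell
#!/bin/bash
# Hypothetical sample data in a temporary directory (illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/old"
printf '%s\n' banana cherry date  > "$tmpdir/new"

# No flags: three columns separated by leading tabs.
#   column 1 = only in old, column 2 = only in new, column 3 = in both
all_columns=$(comm "$tmpdir/old" "$tmpdir/new")

# -23 suppresses columns 2 and 3, leaving lines found only in old.
only_old=$(comm -23 "$tmpdir/old" "$tmpdir/new")

# -13 suppresses columns 1 and 3, leaving lines found only in new.
only_new=$(comm -13 "$tmpdir/old" "$tmpdir/new")

# -12 suppresses columns 1 and 2, leaving lines common to both.
in_both=$(comm -12 "$tmpdir/old" "$tmpdir/new")

printf 'all columns:\n%s\n' "$all_columns"
printf 'only in old: %s\n' "$only_old"
printf 'only in new: %s\n' "$only_new"
printf 'in both:\n%s\n' "$in_both"

rm -rf "$tmpdir"
```

With this sample data, apple is the only removed line and date is the only added line.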

To get the lines that are only in the old file:

comm -23 <(sort /tmp/oldList) <(sort /tmp/newList)

To get the lines that are only in the new file:

comm -13 <(sort /tmp/oldList) <(sort /tmp/newList)

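You can feed that into a while read loop to process each line:

while IFS= read -r old ; do
    echo "removed: $old"   # placeholder: do stuff with "$old" here
done < <(comm -23 <(sort /tmp/oldList) <(sort /tmp/newList))

And similarly for the new lines.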
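
If running comm twice is unappealing, both kinds of lines can be handled in one pass. The following is only a sketch: the handler names on_added and on_removed and the sample files are hypothetical, and it assumes that no line in either list itself begins with a tab, since comm -3 marks lines unique to the second file with a leading tab.

```shell
#!/bin/bash
# Hypothetical handlers; here they only collect the lines into arrays.
removed=()
added=()
on_removed() { removed+=("$1"); }
on_added()   { added+=("$1"); }

# Hypothetical sample data (deliberately unsorted).
tmpdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$tmpdir/oldList"
printf '%s\n' date banana cherry  > "$tmpdir/newList"

tab=$'\t'

# comm -3 drops the common column. Lines only in the new list are
# prefixed with one tab; lines only in the old list have no prefix.
while IFS= read -r line; do
    case $line in
        "$tab"*) on_added "${line#"$tab"}" ;;
        *)       on_removed "$line" ;;
    esac
done < <(comm -3 <(sort "$tmpdir/oldList") <(sort "$tmpdir/newList"))

printf 'removed: %s\n' "${removed[@]}"
printf 'added: %s\n' "${added[@]}"

rm -rf "$tmpdir"
```

Because the loop reads from a process substitution rather than a pipe, it runs in the current shell, so the arrays filled by the handlers are still available after the loop.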

Source

2012-06-23 00:08:40 camh

The diff command will do the comparison for you.

For example,

$ diff /tmp/oldList /tmp/newList

See the man page linked above for more information. This should take care of the first part of your question.

Source

2012-06-22 22:58:51 Levon

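I'd just emphasize that the 'diff' command has many options for formatting its output, which can provide convenient input for a program that processes the differences. – chepner

@chepner Good point.. it is definitely worth checking the linked man page. – Levon

Have you tried diff?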

$ diff /tmp/oldList /tmp/newList 

$ man diff

Source

2012-06-22 22:58:52 ssedano

If your script needs to be readable, consider using Ruby.

To get the lines that are only in the old file:

ruby -e "puts File.readlines('/tmp/oldList') - File.readlines('/tmp/newList')"

To get the lines that are only in the new file:

ruby -e "puts File.readlines('/tmp/newList') - File.readlines('/tmp/oldList')"

You can feed this into a while read loop to process each line:

while IFS= read -r old ; do
    echo "removed: $old"   # placeholder: do stuff with "$old" here
done < <(ruby -e "puts File.readlines('/tmp/oldList') - File.readlines('/tmp/newList')")

Source

2013-11-07 00:10:09 Nowaker

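This is old, but for completeness we should say that if you have a really large set, the fastest solution is to use diff to generate a script and then source it, like this: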

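#!/bin/bash

line_added() {
    # code to be run for all lines added
    # $* is the line
    :   # no-op placeholder; bash rejects an empty function body
}

line_removed() {
    # code to be run for all lines removed
    # $* is the line
    :
}

line_same() {
    # code to be run for all lines that are the same
    # $* is the line
    :
}

# Sort both lists so diff compares them in a consistent order
sort /tmp/oldList >/tmp/oldList.sorted
sort /tmp/newList >/tmp/newList.sorted

# Emit one function call per line (the --*-line-format options are GNU diff).
# Each line is pasted into the script unquoted, so the shell will interpret
# any special characters it contains; only use this on trusted input.
diff >/tmp/diff_script.sh \
    --new-line-format="line_added %L" \
    --old-line-format="line_removed %L" \
    --unchanged-line-format="line_same %L" \
    /tmp/oldList.sorted /tmp/newList.sorted

source /tmp/diff_script.sh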

Changed lines will show up as removed and added. If you don't like this, you can use --changed-group-format. Check the diff man page.
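
One caveat with sourcing a generated script is that the content of every line gets interpreted as shell code. The sketch below is an alternative under stated assumptions, not the answer's method: it keeps the same GNU diff line-format options but reads the output in a loop and passes each line to a handler as a single quoted argument. The sample files and the array-collecting handler bodies are hypothetical.

```shell
#!/bin/bash
# Handlers named as in the script above; the bodies are placeholders
# that just record the line they were given in "$1".
added=()
removed=()
line_added()   { added+=("$1"); }
line_removed() { removed+=("$1"); }

# Hypothetical sample data containing shell metacharacters.
tmpdir=$(mktemp -d)
printf '%s\n' 'keep me' 'old line; echo oops' > "$tmpdir/oldList"
printf '%s\n' 'keep me' 'new line *'          > "$tmpdir/newList"

# Prefix added lines with "+" and removed lines with "-", and print
# nothing for unchanged lines (these format options require GNU diff).
while IFS= read -r entry; do
    case $entry in
        +*) line_added   "${entry#+}" ;;
        -*) line_removed "${entry#-}" ;;
    esac
done < <(diff \
    --new-line-format='+%L' \
    --old-line-format='-%L' \
    --unchanged-line-format='' \
    <(sort "$tmpdir/oldList") <(sort "$tmpdir/newList"))

printf 'added: %s\n' "${added[@]}"
printf 'removed: %s\n' "${removed[@]}"

rm -rf "$tmpdir"
```

Since the lines are never evaluated, the semicolon and the asterisk in the sample data reach the handlers as plain text. The trade-off is a bash-level loop per differing line, which is likely slower than sourcing on very large inputs.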

Source

2015-02-03 19:57:39
