2010-04-30

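Generating a revision history of text in Python

I have two versions of a piece of text, and I want to produce an HTML view of the revisions between them, similar to what Google Docs or Stack Overflow displays. I need to do this in Python. I don't know what this technique is called, but I assume it has a name, and I'm hoping there is a Python library that can do it.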

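Version 1:

William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen.

Version 2:

William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.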

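Desired output:

William Henry "Bill" Gates III (born October 28, 1955)[2] is <del>an American</del> <ins>a</ins> business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. <ins>He is American.</ins>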

Using the diff command doesn't work, because it only tells me that the line is different, not which columns or words within it differ.

$ echo 'William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen.' > oldfile 
$ echo 'William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.' > newfile 
$ diff -u oldfile newfile 
--- oldfile 2010-04-30 13:32:43.000000000 -0700 
+++ newfile 2010-04-30 13:33:09.000000000 -0700 
@@ -1 +1 @@ 
-William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. 
+William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.

Answers

You can use wdiff. I don't know whether there is a Python implementation:

$ wdiff oldfile newfile 
William Henry "Bill" Gates III (born October 28, 1955)[2] is [-an American-] {+a+} business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. {+He is American.+}
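
On the Python side, the standard library's difflib module can do a comparable word-level comparison if you feed it lists of words instead of lines. Below is a minimal sketch, not a drop-in replacement for wdiff. It assumes that whitespace-separated words are the right unit and that <del>/<ins> tags are acceptable markup; the function name word_diff_html is made up for illustration.

```python
import difflib
from html import escape


def word_diff_html(old, new):
    """Return HTML marking word-level deletions (<del>) and insertions (<ins>).

    Hypothetical helper for illustration: splits on whitespace, so the
    original spacing and line breaks are not preserved.
    """
    old_words = old.split()
    new_words = new.split()
    # autojunk=False turns off difflib's popularity heuristic, which can
    # otherwise treat frequent words as junk in long inputs.
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = escape(" ".join(old_words[i1:i2]))
        added = escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(removed)
            continue
        # 'replace' has both sides, 'delete' only removed, 'insert' only added.
        if removed:
            parts.append("<del>%s</del>" % removed)
        if added:
            parts.append("<ins>%s</ins>" % added)
    return " ".join(parts)


if __name__ == "__main__":
    with open("oldfile") as f_old, open("newfile") as f_new:
        print(word_diff_html(f_old.read(), f_new.read()))
```

Run against the oldfile and newfile from the question, this should wrap "an American" in <del>, "a" in <ins>, and the trailing "He is American." in <ins>, mirroring the wdiff output above. Text is passed through html.escape, so quotes and angle brackets in the input come out as HTML entities.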