Splitting a diff file with regular expressions in Python

I want to use the re module in Python to split a diff (unified format) into its individual sections, one per file. The diff looks like this...

diff --git a/src/core.js b/src/core.js 
index 9c8314c..4242903 100644 
--- a/src/core.js 
+++ b/src/core.js 
@@ -801,7 +801,7 @@ jQuery.extend({ 
     return proxy; 
    }, 

- // Mutifunctional method to get and set values to a collection 
+ // Multifunctional method to get and set values of a collection 
    // The value/s can optionally be executed if it's a function 
    access: function(elems, fn, key, value, chainable, emptyGet, pass) { 
     var exec, 
diff --git a/src/sizzle b/src/sizzle 
index fe2f618..feebbd7 160000 
--- a/src/sizzle 
+++ b/src/sizzle 
@@ -1 +1 @@ 
-Subproject commit fe2f618106bb76857b229113d6d11653707d0b22 
+Subproject commit feebbd7e053bff426444c7b348c776c99c7490ee 
diff --git a/test/unit/manipulation.js b/test/unit/manipulation.js 
index 18e1b8d..ff31c4d 100644 
--- a/test/unit/manipulation.js 
+++ b/test/unit/manipulation.js 
@@ -7,7 +7,7 @@ var bareObj = function(value) { return value; }; 
var functionReturningObj = function(value) { return (function() { return value; }); }; 

test("text()", function() { 
- expect(4); 
+ expect(5); 
    var expected = "This link has class=\"blog\": Simon Willison's Weblog"; 
    equal(jQuery("#sap").text(), expected, "Check for merged text of more then one element."); 

@@ -20,6 +20,10 @@ test("text()", function() { 
     frag.appendChild(document.createTextNode("foo")); 

    equal(jQuery(frag).text(), "foo", "Document Fragment Text node was retreived from .text()."); 
+ 
+ var $newLineTest = jQuery("<div>test<br/>testy</div>").appendTo("#moretests"); 
+ $newLineTest.find("br").replaceWith("\n"); 
+ equal($newLineTest.text(), "test\ntesty", "text() does not remove new lines (#11153)"); 
}); 

test("text(undefined)", function() { 
diff --git a/version.txt b/version.txt 
index 0a182f2..0330b0e 100644 
--- a/version.txt 
+++ b/version.txt 
@@ -1 +1 @@ 
-1.7.2 
\ No newline at end of file 
+1.7.3pre 
\ No newline at end of file

I have tried several pattern combinations but can't get it quite right. This is the closest I have come so far...

re.compile(r'(diff.*?[^\rdiff])', flags=re.S|re.M)

But this produces

['diff ', 'diff ', 'diff ', 'diff ']

How can I match every section of this diff?

2012-05-06 Kevin

This should do it:

import re

# s is assumed to hold the full text of the diff
r = re.compile(r'^(diff.*?)(?=^diff|\Z)', re.M | re.S)
for m in r.findall(s):
    print('====')
    print(m)
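
To see the pattern above at work, here is a minimal sketch run against a shortened, made-up two-file diff. The `sample` text and the `split_diff` helper name are assumptions for illustration only; they are not part of the original answer.

```python
import re

# Hypothetical, shortened sample; not the diff from the question.
sample = (
    "diff --git a/a.txt b/a.txt\n"
    "--- a/a.txt\n"
    "+++ b/a.txt\n"
    "@@ -1 +1 @@\n"
    "-old\n"
    "+new\n"
    "diff --git a/b.txt b/b.txt\n"
    "--- a/b.txt\n"
    "+++ b/b.txt\n"
    "@@ -1 +1 @@\n"
    "-1.7.2\n"
    "+1.7.3pre\n"
)

# re.M lets ^ match at the start of every line; re.S lets . cross newlines.
# The lazy .*? stops right before the next line that begins with "diff",
# or at the very end of the string (\Z).
SECTION = re.compile(r'^(diff.*?)(?=^diff|\Z)', re.M | re.S)


def split_diff(text):
    """Return one string per 'diff ...' section found in text."""
    return SECTION.findall(text)


sections = split_diff(sample)
for section in sections:
    print('====')
    print(section, end='')
```

Because the lookahead consumes nothing, joining the returned sections reproduces the input exactly, which makes for an easy sanity check.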

2012-05-06 17:00:56

Yes, this works perfectly for me, thanks. – Kevin

Why use a regular expression at all? Why not iterate over the lines and start a new section whenever a line starts with diff?

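list_of_diffs = []
temp_diff = []
for line in patch:
    # A line starting with 'diff' closes the section collected so far
    if line.startswith('diff') and temp_diff:
        list_of_diffs.append(''.join(temp_diff))
        temp_diff = []
    temp_diff.append(line)
# Don't forget the last section
if temp_diff:
    list_of_diffs.append(''.join(temp_diff))

Disclaimer: the code above is only meant as an illustration. It assumes patch is an iterable of lines, such as an open file.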

Regular expressions are a hammer, but your problem is not a nail.
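
The line-by-line idea can also be packaged as a small function that works on a string. This is a sketch under stated assumptions: the `split_by_lines` name and the `sample` text are hypothetical, and the check for `'diff '` (with a trailing space) is a slightly stricter choice than the answer's `'diff'`.

```python
def split_by_lines(text):
    """Split diff text into sections, one per line that starts with 'diff '."""
    sections = []
    current = []
    # keepends=True preserves the original line endings in each section
    for line in text.splitlines(keepends=True):
        if line.startswith('diff ') and current:
            sections.append(''.join(current))
            current = []
        current.append(line)
    if current:
        sections.append(''.join(current))
    return sections


# Hypothetical sample input, not the diff from the question.
sample = "diff --git a/x b/x\n-old\n+new\ndiff --git a/y b/y\n-1\n+2\n"
parts = split_by_lines(sample)
print(len(parts))
```

Note that any text before the first diff line ends up as its own leading section, so a caller may want to filter that out.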

2012-05-06 16:48:06 JosefAssad

You don't need a regular expression; just split the file contents:

with open('diff.txt', 'r') as diff_file:
    diff_str = diff_file.read()
# str.split drops the separator, so put 'diff --git' back on each piece
diff_split = ['diff --git%s' % x for x in diff_str.split('diff --git')
              if x.strip()]
print(diff_split)

2012-05-06 16:55:36 mVChr

Just split on any newline that is followed by the word diff:

result = re.split(r"\n(?=diff\b)", subject)

To be safe, though, you should probably also match \r and \r\n line endings:

result = re.split(r"(?:\r\n|[\r\n])(?=diff\b)", subject)
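
As a rough illustration of why the second pattern matters, the sketch below runs it on the same made-up diff with Unix and Windows line endings. The `split_on_diff_lines` helper and both sample strings are assumptions for demonstration purposes.

```python
import re

# Split at a line break (\r\n, \r or \n) only when the next line
# begins with the whole word "diff"; the lookahead keeps "diff" in place.
NEWLINE_BEFORE_DIFF = re.compile(r"(?:\r\n|[\r\n])(?=diff\b)")


def split_on_diff_lines(subject):
    """Split subject at every line break that precedes a 'diff' line."""
    return NEWLINE_BEFORE_DIFF.split(subject)


# Hypothetical sample input with two sections.
unix = "diff --git a/x b/x\n+one\ndiff --git a/y b/y\n+two\n"
windows = unix.replace("\n", "\r\n")

unix_parts = split_on_diff_lines(unix)
windows_parts = split_on_diff_lines(windows)
print(len(unix_parts), len(windows_parts))
```

Unlike the lookahead-only findall approach, re.split consumes the line break it splits on, so every section except the last loses its trailing newline.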

2012-05-06 17:17:12
