2014-01-25 61 views
2

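C#: computing a Git commit hash

I'm trying to compute the SHA-1 hash of a Git commit by hand, but something isn't working.

First, we have a standard commit message that looks like this: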

tree f594b3f6d9ae291c83902f3992aa36872aa70d68 

parent 0000004bf6d464667df5150b4526083886947d92 

author User <[email protected]> 1390620460.46263 +0000 
committer User <[email protected]> 1390620460.46263 +0000 

Commit Message 

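We'll call this the commitMessage.

The specification says that to get the commit hash, we must take the SHA-1 of:

  • the string "commit"
  • plus a space " "
  • plus the number of bytes in commitMessage
  • plus a null byte
  • plus commitMessage

So (pseudocode, obviously):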

SHA1("commit" + " " + numBytes(commitMessage) + '\0' + commitMessage);
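As a rough illustration of the pseudocode above, here is a minimal C# sketch. The class and method names (GitObjectHash, Compute) are made up for this example and are not part of any library. It assumes the content is UTF-8 and that the header is plain ASCII. It only shows the hashing step and does not try to explain why the code in the question produces a different hash.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, named for this example only.
public static class GitObjectHash
{
    // Hashes "<type> <byteCount>\0<content>" and returns lowercase hex.
    public static string Compute(string type, byte[] content)
    {
        // Header: object type, a space, the decimal byte count, then a NUL byte.
        byte[] header = Encoding.ASCII.GetBytes(type + " " + content.Length + "\0");

        // Concatenate header and content into one buffer.
        byte[] buffer = new byte[header.Length + content.Length];
        Buffer.BlockCopy(header, 0, buffer, 0, header.Length);
        Buffer.BlockCopy(content, 0, buffer, header.Length, content.Length);

        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] digest = sha1.ComputeHash(buffer);
            // BitConverter.ToString yields "AB-CD-..."; Git prints lowercase hex without dashes.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }

    // Convenience overload; assumes the text should be encoded as UTF-8.
    public static string Compute(string type, string content)
    {
        return Compute(type, Encoding.UTF8.GetBytes(content));
    }
}
```

A quick sanity check that does not need a commit: GitObjectHash.Compute("blob", "") should give e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the well-known ID of the empty blob. For a commit, pass "commit" as the type and the exact commit text as the content.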

And here is my C# implementation (I know it's rather clumsy):

var commitBody = "tree " + treeHash + "\n\n" + 
        "parent " + parentHash + "\n\n" + 
        "author User <[email protected]> " + date + "\n" + 
        "committer User <[email protected]> " + date + "\n\n" + 
        "My Commit Message\n"; 

    var blob = "commit " + Encoding.UTF8.GetByteCount(commitBody); 

    // This is the string "commit " (with a space) + byte count 
    var first = Encoding.UTF8.GetBytes(blob); 

    // This is just the null byte 
    var second = new byte[1]; 
    second[0] = (byte)0; 

    // This is the commitMessage 
    var third = Encoding.UTF8.GetBytes(commitBody); 

    // Merge first, second, third into bytez as a byte array 
    var bytez = new byte[first.Length + second.Length + third.Length]; 
    Buffer.BlockCopy(first, 0, bytez, 0, first.Length); 
    Buffer.BlockCopy(second, 0, bytez, first.Length, second.Length); 
    Buffer.BlockCopy(third, 0, bytez, first.Length + second.Length, third.Length); 

    // Debug Print 
    Console.WriteLine(Encoding.UTF8.GetString(bytez)); 

    // Compute the hash and print it 
    var sss = SHA1.Create(); 
    var myssh = GetString(sss.ComputeHash(bytez)); 
    Console.WriteLine(myssh); 

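The hash this returns is different from the one Git returns. I don't really expect anyone to know how to do this, since it's not something people normally do, but I thought I'd ask.

Thanks for your help :D

Answers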

1

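The hash of every object is actually the hash of "length + '\0' + content" (preceded by the object type and a space, as described in the question). Including the length helps guard against SHA-1 collisions, because you would now need a collision in both the SHA-1 and the length, which is less likely.

+0

Nitpick: I think you mean object, not blob, since this is about commits – alternative

+0

Fixed, you're right!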

1

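If the string contains non-ASCII characters, don't use string.Length to size the byte array. That works when the string contains only ASCII characters, but if it contains characters that take more than one byte in UTF-8, .Length will be smaller than the actual size in bytes.

Since you are using .Length to allocate the array, the array may end up too small, and not all of the string data may be copied.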
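To make the difference concrete, here is a small sketch comparing the two counts. The class name ByteCountDemo and its methods are hypothetical, introduced only for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical demo class, named for this example only.
public static class ByteCountDemo
{
    // Number of UTF-16 code units, which is what string.Length reports.
    public static int Chars(string s)
    {
        return s.Length;
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    public static int Utf8Bytes(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }
}
```

For "caf\u00e9" (the word "café" with a precomposed é), Chars returns 4 while Utf8Bytes returns 5, so a buffer sized from .Length would be one byte short. For pure ASCII text the two values are equal.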

I suggest using a StringBuilder to build your string, and then calling System.Text.Encoding.UTF8.GetBytes(stringbuilder.ToString()) to get the byte data.

StringBuilder sb = new StringBuilder(); 
sb.Append("commit "+ Encoding.UTF8.GetByteCount(commitBody)); 
sb.Append("\0"); 
sb.Append(commitBody); 

var sss = SHA1.Create(); 
var bytez = Encoding.UTF8.GetBytes(sb.ToString()); 
var myssh = GetString(sss.ComputeHash(bytez)); 
Console.WriteLine(myssh); 
0

There should be no blank lines after the tree and parent lines, i.e. the commit body should be:

tree f594b3f6d9ae291c83902f3992aa36872aa70d68 
parent 0000004bf6d464667df5150b4526083886947d92 
author User <[email protected]> 1390620460.46263 +0000 
committer User <[email protected]> 1390620460.46263 +0000 

Commit Message 

See the original C implementation: commit_tree_extended() in commit.c.
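The layout above could be assembled in C# roughly as follows. This is only a sketch under several assumptions: the class and method names (CommitBodySketch, Build) are invented for this example, the identity string is passed in already formatted as "Name <email>", the timestamp is given in whole seconds (which is how Git normally records it, unlike the fractional value shown in the question), and the message is made to end with a single newline.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for this example only.
public static class CommitBodySketch
{
    public static string Build(string treeHash, string parentHash, string identity,
                               long unixSeconds, string timezone, string message)
    {
        var sb = new StringBuilder();

        // Use explicit '\n': AppendLine would emit "\r\n" on Windows and change the hash.
        sb.Append("tree ").Append(treeHash).Append('\n');

        // A root commit has no parent line, so it is optional here.
        if (!string.IsNullOrEmpty(parentHash))
        {
            sb.Append("parent ").Append(parentHash).Append('\n');
        }

        sb.Append("author ").Append(identity).Append(' ')
          .Append(unixSeconds).Append(' ').Append(timezone).Append('\n');
        sb.Append("committer ").Append(identity).Append(' ')
          .Append(unixSeconds).Append(' ').Append(timezone).Append('\n');

        // Exactly one blank line separates the headers from the message.
        sb.Append('\n');
        sb.Append(message);

        // Assumption: the stored message ends with a newline.
        if (!message.EndsWith("\n", StringComparison.Ordinal))
        {
            sb.Append('\n');
        }

        return sb.ToString();
    }
}
```

The surest way to check the result is to compare it byte for byte with the output of git cat-file commit for the commit in question, since any difference in whitespace, timestamp, or identity changes the hash.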

0

Not C#, but here is how to compute a Git commit hash from a bash prompt:

commit_len=$(git cat-file commit HEAD | wc -c) 
(echo -ne "commit $commit_len\0"; git cat-file commit HEAD) | sha1sum 

Check that the hash is correct:

git show HEAD | grep commit