Finding out the true file type

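It seems reasonable to rely on users to identify their file types correctly. Adding a lot of code (which also has to be tested) just to double-check the user seems like a big step. If I say a file is a .pdf2, will you rename it to .pdf? If this is in a corporate environment, it is reasonable to expect users to have the right extension on their files. I would track who uploaded what. If it is public, scanning for the file type might be worthwhile, but I would definitely run a virus scan as well.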

2009-01-16 16:42:25 jcollum

In other words, if trojan.exe is renamed to harmless.pdf and uploaded, the application must be able to detect that the uploaded file is not a PDF file.

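This isn't really a problem. If an .exe is uploaded as a .pdf and you correctly serve it back to downloaders as application/pdf, all the downloader gets is a broken PDF. They would have to rename it to .exe by hand to be harmed.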

The real problems are:

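Some browsers sniff the contents of files and decide they know better than you what type of file it is. IE is particularly bad at this, tending to render the file as HTML if it sees any HTML tags lurking near the start of the file. This is particularly unhelpful because it means script can be injected into your site, potentially compromising any application-level security (cookie stealing and so on). Workarounds include always serving the file as an attachment using Content-Disposition, and/or serving files from a different hostname, so that they cannot cross-site-script back onto your main site.
PDF files are not safe anyway! They can be full of scripting, and they have had significant security holes. Exploiting a hole in the PDF reader browser plugin is currently one of the most common ways of installing trojans on the web. And there is almost nothing you can do to detect the exploits, because they can be highly obfuscated.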
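
The Content-Disposition workaround can be sketched as a small header builder. This is only an illustration: sanitizeFilename and buildDownloadHeaders are hypothetical names, application/octet-stream is an assumed generic content type, and the X-Content-Type-Options: nosniff line is an extra hint that is not mentioned in the answer above and that only some browsers honor.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: keep only characters that are safe inside a quoted
// filename parameter, so user input cannot break out of the header line.
std::string sanitizeFilename(const std::string& name) {
    std::string out;
    for (unsigned char c : name) {
        const bool safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                          (c >= '0' && c <= '9') ||
                          c == '.' || c == '-' || c == '_';
        out += safe ? static_cast<char>(c) : '_';
    }
    return out.empty() ? std::string("download") : out;
}

// Hypothetical helper: build response headers that ask the browser to save
// the upload as a file instead of rendering it inline.
std::string buildDownloadHeaders(const std::string& userFilename,
                                 std::size_t contentLength) {
    std::string h;
    h += "Content-Type: application/octet-stream\r\n";  // assumed generic type
    h += "Content-Disposition: attachment; filename=\"" +
         sanitizeFilename(userFilename) + "\"\r\n";
    h += "X-Content-Type-Options: nosniff\r\n";         // assumption, see above
    h += "Content-Length: " + std::to_string(contentLength) + "\r\n";
    return h;
}
```

Replacing every character outside a small whitelist is deliberately blunt: it keeps quotes and line breaks in a user-supplied name from injecting additional headers.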

2009-01-16 17:06:27 bobince

On *NIX systems we have a utility called file(1). Try to find something similar for Windows; the file utility itself has been ported.
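
file(1) identifies files largely by "magic" byte sequences at known offsets. A minimal sketch of that idea, assuming only a handful of well-known signatures and a hypothetical function name sniffType, might look like this. It is far less thorough than the real utility and says nothing about whether a file is safe.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns a coarse type label based on the leading "magic" bytes only.
// The table is a tiny, illustrative subset of what file(1) knows.
std::string sniffType(const unsigned char* data, std::size_t size) {
    struct Magic {
        const char* bytes;
        std::size_t len;
        const char* label;
    };
    static const Magic table[] = {
        {"%PDF-", 5, "pdf"},                                // PDF header
        {"MZ", 2, "exe"},                                   // DOS/Windows executable
        {"PK\x03\x04", 4, "zip"},                           // ZIP (also Office 2007+)
        {"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", 8, "ole2"},    // older Office container
    };
    for (const Magic& m : table) {
        if (size >= m.len && std::memcmp(data, m.bytes, m.len) == 0) {
            return m.label;
        }
    }
    return "unknown";
}
```

With this sketch, a trojan.exe renamed to harmless.pdf would still be labeled "exe", because its first two bytes are MZ rather than %PDF-.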

2009-01-16 17:08:14 daniel

The following C++ code may help you:

#include <windows.h>
#include <cstring>

// Return values:
// -1 : file does not exist or cannot be read
//  0 : not an Office document
//  1 : (general) MS Office 2007
//  2 : (general) MS Office older than 2007
//  3 : MS Office 2003 PowerPoint presentation
//  4 : MS Office 2003 Excel spreadsheet
//  5 : MS Office applications or others
int IsOffice2007OrOlder(const wchar_t* fileName)
{
    // Signatures expected at offset 0.
    static const BYTE msgFormatChk2007[8]    = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00}; // Office 2007 documents
    static const BYTE possibleMSOldOffice[8] = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}; // possible Office 2003 documents

    // Signatures expected at offset 512.
    static const BYTE msgFormatChkXLSPPT[4]  = {0xFD, 0xFF, 0xFF, 0xFF};                         // xls, ppt
    static const BYTE msgFormatChkOnlyPPT[4] = {0x00, 0x6E, 0x1E, 0xF0};                         // another ppt signature
    static const BYTE msgFormatChkOnlyDOC[4] = {0xEC, 0xA5, 0xC1, 0x00};                         // doc
    static const BYTE msgFormatChkOnlyXLS[8] = {0x09, 0x08, 0x10, 0x00, 0x00, 0x06, 0x05, 0x00}; // xls

    HANDLE fileHandle = CreateFileW(fileName, GENERIC_READ,
        FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_READONLY, NULL);
    if (INVALID_HANDLE_VALUE == fileHandle)
    {
        return -1;
    }

    int iRet = 0;
    BYTE buff[8] = {0};
    DWORD bytesRead = 0;
    if (!ReadFile(fileHandle, buff, sizeof(buff), &bytesRead, NULL))
    {
        CloseHandle(fileHandle); // do not leak the handle on failure
        return -1;
    }

    // A file shorter than 8 bytes cannot carry either header signature.
    if (bytesRead == sizeof(buff))
    {
        if (memcmp(buff, msgFormatChk2007, sizeof(msgFormatChk2007)) == 0)
        {
            iRet = 1; // Office 2007 file format
        }
        else if (memcmp(buff, possibleMSOldOffice, sizeof(possibleMSOldOffice)) == 0)
        {
            // Old Office container: check offset 512 to narrow down the real format.
            iRet = 5;
            if (SetFilePointer(fileHandle, 512, NULL, FILE_BEGIN) == INVALID_SET_FILE_POINTER ||
                !ReadFile(fileHandle, buff, sizeof(buff), &bytesRead, NULL))
            {
                iRet = 0; // a failed seek or read here is reported as "not an Office document"
            }
            else if (bytesRead >= 4 && memcmp(buff, msgFormatChkXLSPPT, 4) == 0)
            {
                iRet = 2;
            }
            else if (bytesRead >= 4 && memcmp(buff, msgFormatChkOnlyDOC, 4) == 0)
            {
                iRet = 2;
            }
            else if (bytesRead >= 4 && memcmp(buff, msgFormatChkOnlyPPT, 4) == 0)
            {
                iRet = 3;
            }
            else if (bytesRead == 8 && memcmp(buff, msgFormatChkOnlyXLS, 8) == 0)
            {
                iRet = 4;
            }
        }
    }

    CloseHandle(fileHandle);

    return iRet;
}

2011-03-15 07:29:06 user660066
