找到第一個可用位的最佳方法

給定一個未整理整數（每個4個八位字節）的數組，尋找具有至少一個'0'位的第一個元素的最佳方法是什麼，它是從LSB開始的索引。找到第一個可用位的最佳方法

e.g：其中n = 9

unsinged int uIntArray[] = { 
    0xffffffff, 
    0xffffffff, 
    0xffffffff, 
    0xffffffff, 
    0xffffff9f, 
    0x00000000, 
    0x00000000, 
    0x00000000, 
    0x00000000, 
};

答：

element's index = 4 
bit's index = 4

我只能想到：

int main (void) 
{ 
    bool found_f = false; 
    int n = 9; //our test case value 
    unsigned int uIntArray[] = { 
     0xffffffff, 
     0xffffffff, 
     0xffffffff, 
     0xffffffff, 
     0xffffff8f, 
     0x00000000, 
     0x00000000, 
     0x00000000, 
     0x00000000, 
    }; 

    unsigned int uIntBits [32] = { 
     1, 2, 4, 8, 16, 32, 64, 128, 
      256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 
      65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608, 
      16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824, 2147483648 
    };  

    unsigned int idx, jdx; 
    int ele_idx = -1; 
    int bit_idx = -1; 

    for (idx =0; idx < n; idx ++) { 
     if (uIntArray[idx] < UINT_MAX) { /* our candidate */ 
      for (jdx =0; jdx < 32; jdx ++) { 
       if ((uIntBits[jdx] & uIntArray[idx])) { 
        ele_idx = idx; 
        bit_idx = jdx; 
        found_f = true; 
        break; 
       } 
      } 
     } 
     if(found_f) { 
      break; 
     } 
    } 
    fprintf (stderr, "\nEleIdx[%d] BitIdx[%d]\n", ele_idx, bit_idx); 
    return 0; 
}

有沒有更好的辦法做到這一點？

來源

2011-09-19 SparKot

你的邏輯可以收緊一點。 'found_f'是不必要的。數組也不是必需的。 – Skizz

首先，你聲明的第一個數組的正確答案是「（4,5）」而不是「（4，4）」。其次，您的實際代碼示例在數組中使用「0xffffff8f」值而不是原來的「0xffffff9f」，在這種情況下，正確的答案確實是「（4，4）」。不要混淆示例來混淆人們。 – AnT

x中最低有效0的索引是~x中最低有效1的索引。爲了找到後者，您只需要計算~x中的尾隨零。有不少方法可以做到這一點，看到這裏http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightLinear

使用的最後一個方法（基於DeBruijn序列），搜索將如下

static const unsigned MultiplyDeBruijnBitPosition[32] = 
{ 
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9 
}; 

for (idx = 0; idx < n; idx ++) 
    if (uIntArray[idx] < UINT_MAX) 
    break; 

if (idx < n) { 
    unsigned v = ~uIntArray[idx]; 
    int bit_idx = MultiplyDeBruijnBitPosition[((v & -v) * 0x077CB531u) >> 27]; 
    fprintf(stderr, "\nEleIdx[%d] BitIdx[%d]\n", idx, bit_idx); 
}

來源

2011-09-19 08:22:55 AnT

定時我的桌面上的一個測試程序： ARRAYSIZE 100所需的元件是在索引95 1 - MyOwn \t總時間採取[65535]迭代\t [0.027451512]秒\t \t 2 - 基於DeBruijn序列（ IMPRESSIVE） \t總時間採取[65535]迭代\t [0.019844761]秒\t \t - 下面一次一個4字節;找到候選組後在4內縮小範圍 \t總共花費的時間[65535]迭代\t [0.055722110]秒\t 非常感謝。 – SparKot

通過使用更大的數據類型可以使其更快。因此，不是測試每個int是否爲0xffffffff，您可以使用64位整數並對0xffffffffffffffff進行測試。

如果您願意進行矢量化，您一次可以做128位（SSE）或256位（AVX）。

在所有情況下，請注意您的數據對齊。如果你不對齊，它可能不起作用，或使它變慢。

最後一步，您可以實際展開循環並同時測試多個單詞/向量。這會給你更好的IPC。只有當你發現任何零點時，你纔會經歷縮小這一點的混亂局面。

編輯：

爲了說明這最後一點，你可以這樣做：（我省略了清理代碼的情況下，idx % 4 != 0）

for (idx =0; idx < n; idx += 4) { 
    unsigned int test = uIntArray[idx]; 
    test &= uIntArray[idx + 1]; 
    test &= uIntArray[idx + 2]; 
    test &= uIntArray[idx + 3]; 

    if (test < UINT_MAX){ 
     // Find which bit it is. 

    } 
}

除非你可以做到這一點對大數據類型。（像SSE/AVX載體）

這將使找到前0的區域快得多，但縮小確切位會更貴一些。所以如果你的數據量很大，這種方法會更好。

來源

2011-09-19 08:11:26 Mysticial

@Mystical：雖然矢量化和循環展開是使代碼運行更快的好方法，但不是「手動」執行代碼是錯誤的，但是您認爲在這種情況下它必須手動完成嗎？在大多數現代編譯器中，我們有可用於矢量化的選項，它們將在這些情況下展開循環展開（可能展開大於4的展開循環）：當他們生成機器代碼時...只是想知道你對此的看法 –

@ another.anon.coward：實際上大多數編譯器不會展開這個循環，因爲它有條件的副作用（'ele_idx = idx'）。 – Mysticial

啊錯過了！對不起......第一次弄錯了你的名字，非常感謝你的澄清！ –

要找到你可以觀察到的第一個元素，如果數量沒有有0位，那麼它必須是0xff..ff所以不是明確地檢查每一位，你可以簡單地把它比作0xff..ff。

要找到這個數字的最低有效位，我相信你仍然需要檢查每一位。

來源

2011-09-19 08:18:11 quasiverse

該公式產生最右邊的1位，而不是最右邊的0位。 – cnicutar

@cnicutar我正在編輯它，但我意識到它獲得了最右邊的位而不是位置的值。儘管如此，這大大簡化了用'0'位查找第一個元素。 – quasiverse

下面的代碼似乎很好地工作，用~x & (x + 1)測試另一個（現已刪除）答案並對其進行擴展。不知道爲什麼其他答案被刪除。

/* Return the position of the first clear bit in the array, 
* or -1 if none found. 
* arr: array of uint32_t to search 
* sz: number of elements in arr 
*/ 
int findClearBit(uint32_t *arr, int sz) 
{ 
    int i; 
    for (i = 0; i < sz; i++) { 
    if (~arr[i]) { 
     switch (~arr[i] & (arr[i] + 1)) { 
     case 1 << 31: return (i * 32) + 31; 
     case 1 << 30: return (i * 32) + 30; 
     case 1 << 29: return (i * 32) + 29; 
     case 1 << 28: return (i * 32) + 28; 
     case 1 << 27: return (i * 32) + 27; 
     case 1 << 26: return (i * 32) + 26; 
     case 1 << 25: return (i * 32) + 25; 
     case 1 << 24: return (i * 32) + 24; 
     case 1 << 23: return (i * 32) + 23; 
     case 1 << 22: return (i * 32) + 22; 
     case 1 << 21: return (i * 32) + 21; 
     case 1 << 20: return (i * 32) + 20; 
     case 1 << 19: return (i * 32) + 19; 
     case 1 << 18: return (i * 32) + 18; 
     case 1 << 17: return (i * 32) + 17; 
     case 1 << 16: return (i * 32) + 16; 
     case 1 << 15: return (i * 32) + 15; 
     case 1 << 14: return (i * 32) + 14; 
     case 1 << 13: return (i * 32) + 13; 
     case 1 << 12: return (i * 32) + 12; 
     case 1 << 11: return (i * 32) + 11; 
     case 1 << 10: return (i * 32) + 10; 
     case 1 << 9: return (i * 32) + 9; 
     case 1 << 8: return (i * 32) + 8; 
     case 1 << 7: return (i * 32) + 7; 
     case 1 << 6: return (i * 32) + 6; 
     case 1 << 5: return (i * 32) + 5; 
     case 1 << 4: return (i * 32) + 4; 
     case 1 << 3: return (i * 32) + 3; 
     case 1 << 2: return (i * 32) + 2; 
     case 1 << 1: return (i * 32) + 1; 
     case 1:  return (i * 32); 
     default:  return -1; 
     } 
    } 
    } 
    return -1; 
}

來源

2011-09-19 17:24:41 Dmitri

非常接近De Bruijn的序列;根據我的測試（在我的桌面[2097120]迭代中）上面的函數在最大時滯後只有0.4％。以上是[0.636051089]秒，而De Bruijn的序列是在[0.633582442]中完成的。這兩個相同的algos在僞裝！非常感謝德米特里。 – SparKot

不能加票;我需要15個聲望！ – SparKot

找到第一個可用位的最佳方法

回答

相關問題