grep or splice on a large array

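I have a large array of hashes, about 0.5 GB, held in memory, and I need to remove some of its elements: roughly 10%, scattered throughout the array.

Which is more efficient: running a grep, or identifying the elements to delete and splicing them out?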

Thanks,

Simone

2011-09-28 simone

The memory used is irrelevant; how many elements are in the array? – ysth

Please have a look at [this post](http://stackoverflow.com/questions/4415287/how-can-i-delete-an-element-of-a-referenced-array/4415420#4415420), it may be helpful. – Dallaylaen

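Under the conditions you describe, splice may degrade to O(n^2), because every call shifts the array contents that follow the removed element. grep/slice, on the other hand, allocates O(n) extra memory (probably far less than 500 MB, but still...).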
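
To make the two options concrete, here is a minimal sketch of both on a tiny made-up data set. The `drop` flag and the record layout are assumptions for illustration only; the question does not say how the elements to remove are identified.

```perl
use strict;
use warnings;

# Hypothetical data: 20 hash references, every tenth one flagged for removal.
my @records = map { +{ id => $_, drop => ($_ % 10 == 0) } } 1 .. 20;

# Option 1: grep builds a new list of the survivors, which then
# replaces the array contents.
my @by_grep = @records;
@by_grep = grep { !$_->{drop} } @by_grep;

# Option 2: splice each unwanted element out individually.
# Walking from the end keeps the remaining indices valid, but every
# splice still shifts the elements that follow it.
my @by_splice = @records;
for my $i (reverse 0 .. $#by_splice) {
    splice @by_splice, $i, 1 if $by_splice[$i]{drop};
}

print scalar(@by_grep), " ", scalar(@by_splice), "\n";
```

Both variants leave the same 18 records; they differ only in how much copying and shifting happens along the way.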

There is, however, a linear solution that needs no extra memory, although it looks more like C than Perl:

sub inplace_grep {
    my ($code, $array) = @_;
    # $to is the next slot to fill; declared here so it outlives the loop
    my $to = 0;
    # move the elements to keep toward the front
    for (my $from = 0; $from < @$array; $from++) {
        $code->($array->[$from]) or next;
        $array->[$to++] = $array->[$from];
    }
    # remove the tail
    splice @$array, $to;
}
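
A possible way to call it is sketched below. The sub is repeated so that the snippet runs on its own, written here with a foreach loop over the indices; the sample records and the `drop` flag are hypothetical. The predicate receives each element as its first argument, `$_[0]`, since the function passes the element as a parameter rather than setting `$_`.

```perl
use strict;
use warnings;

# Same technique as above: $to is declared outside the loop so that it
# is still in scope for the final splice.
sub inplace_grep {
    my ($code, $array) = @_;
    my $to = 0;
    for my $from (0 .. $#$array) {
        next unless $code->($array->[$from]);
        $array->[$to++] = $array->[$from];
    }
    splice @$array, $to;    # drop the leftover tail
    return;
}

# Hypothetical sample data: every tenth record is flagged for removal.
my @records = map { +{ id => $_, drop => ($_ % 10 == 0) } } 1 .. 20;

# Keep only the records that are not flagged.
inplace_grep(sub { !$_[0]{drop} }, \@records);

print scalar(@records), "\n";    # 18
```

The array is passed by reference, so the caller's array is modified directly and no second list of survivors is built.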

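Update: regarding grep's memory usage, you can quickly check whether extra memory gets allocated by running it on a large amount of data and watching for brk system calls. On my system (Linux, perl 5.10) it does allocate.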

strace -e trace=brk perl -MTime::HiRes=time -wle \
'print "start ".time; my @array = 1..10**7; print "alloc ".time;
@array = grep { $_ %2 } @array; print "grep ".time'


2011-09-30 09:11:32 Dallaylaen