2013-05-02 119 views
2

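perl: persistent set of strings with commit support

I have a set of strings that is modified inside a loop of 25k iterations. It is empty at the beginning, but in each cycle 0-200 strings are randomly added to or removed from it. At the end, the set contains about 80k strings.
I want to make this resumable. The set should be saved to disk after each cycle and loaded again on resume.
What library can I use? The amount of raw data is about 16M, but the changes are usually small. I don't want it to rewrite the whole store on every iteration.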

Since the strings are paths, I am thinking of storing them in a log file like this:

+a 
+b 
commit 
-b 
+d 
commit 

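At startup the file is loaded into a hash and then compacted. If there is no commit line at the end, the last block is not taken into account.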
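The loading rule described above can be sketched roughly as follows. This is only an illustration under assumptions: the helper name `load_log` is hypothetical, the log uses exactly the `+path` / `-path` / `commit` lines shown in the sample, and paths contain no newlines. Operations are buffered per block and applied only when the block's `commit` line is reached, so an unfinished last block is simply dropped.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: replay a "+path / -path / commit" log and return the
# committed set as a hash reference. Lines after the last "commit" are
# buffered but never applied.
sub load_log {
    my ($file) = @_;
    my (%set, @pending);
    open(my $in, "<", $file) or die "cannot open $file: $!";
    while (my $line = <$in>) {
        chomp $line;
        if ($line eq "commit") {
            # The block is complete: apply its operations in order.
            for my $op (@pending) {
                my ($sign, $path) = @$op;
                if ($sign eq "+") { $set{$path} = 1 }
                else              { delete $set{$path} }
            }
            @pending = ();
        } elsif ($line =~ /\A([+-])(.+)\z/) {
            push @pending, [$1, $2];
        }
    }
    close($in);
    return \%set;
}

# Demo: two committed blocks followed by an uncommitted "+e".
my ($out, $name) = tempfile(UNLINK => 1);
print {$out} "+a\n+b\ncommit\n-b\n+d\ncommit\n+e\n";
close($out) or die "cannot close $name: $!";

my $set = load_log($name);
print join(",", sort keys %$set), "\n";   # prints "a,d"
```

Buffering each block before applying it avoids having to undo operations later, at the cost of holding one block (at most a few hundred lines here) in memory.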

Answers

1

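The Storable package brings persistence to your Perl data structures (SCALAR, ARRAY, HASH or REF objects), i.e. anything that can be conveniently stored to disk and retrieved at a later time.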
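For reference, a minimal sketch of what this looks like with Storable's `store` and `retrieve`. The file name and the sample paths are made up for illustration. Note that every call to `store` serializes the whole hash again, which matters for the "only write the changes" requirement.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Illustrative file location inside a throwaway directory.
my $file = tempdir(CLEANUP => 1) . "/set.sto";

# The set is represented as a hash whose keys are the strings.
my %set = map { $_ => 1 } qw(/tmp/a /tmp/b);
store(\%set, $file);        # writes the complete hash

$set{"/tmp/c"} = 1;
store(\%set, $file);        # writes the complete hash again, not just the change

my $loaded = retrieve($file);
print scalar(keys %$loaded), "\n";   # prints 3
```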

+0

According to the description, it cannot write only the changes. I need something more DB-like, with INSERT, DELETE and COMMIT – basin 2013-05-02 09:06:14

+0

How about using a database? Or why don't you want to use one? http://search.cpan.org/~rurban/DBD-SQLite2-0.36/lib/DBD/SQLite2.pm is self-contained. – Matthias 2013-05-02 09:39:14

0

I decided to put the heavy artillery away and write something very simple:

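package LoL::IMadeADb;

use strict;
use warnings;
use File::Basename qw(dirname);   # dirname(), used when compacting
use File::Copy qw(move);          # move(), used when compacting
use File::Temp qw(tempfile);      # tempfile(), used when compacting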

sub new {
    my ($class, $dbname) = @_;
    my $self = { dbname => $dbname, paths => {}, nlines => 0, changed => 0 };

    # Open for reading and appending; create the file if it does not exist.
    open(my $fd, "+>>", $dbname) or die "cannot open $dbname: $!";
    seek($fd, 0, 0) or die "cannot seek in $dbname: $!";
    $self->{fd} = $fd;
    my $href = $self->{paths};

    # Replay the log. Operations are buffered and applied only once their
    # commit line ("c") is seen, so an uncommitted tail never reaches the set.
    my $lastcommit = 0;    # byte offset just past the last commit line
    my @pending;           # [sign, path] pairs read since the last commit
    while (defined(my $line = <$fd>)) {
        last unless $line =~ s/\n\z//;    # unterminated last line: torn write
        my ($c, $rest) = $line =~ /\A(.)(.*)\z/s or next;
        if ($c eq "c") {
            for my $op (@pending) {
                my ($sign, $path) = @$op;
                if ($sign eq "+") { $href->{$path} = undef }
                else              { delete $href->{$path} }
            }
            $self->{nlines} += @pending + 1;
            @pending = ();
            $lastcommit = tell($fd);
        } elsif ($c eq "+" || $c eq "-") {
            push @pending, [$c, $rest];
        }
    }

    # Anything after the last commit is an incomplete block: cut it off.
    if ((-s $fd) > $lastcommit) {
        print STDERR "rolling back incomplete file: $dbname\n";
        truncate($fd, $lastcommit) or die "cannot truncate $dbname: $!";
        print STDERR "rolling back incomplete file; done\n";
    }

    # Some systems need a seek when switching from reading to writing.
    seek($fd, 0, 2) or die "cannot seek in $dbname: $!";
    return bless $self, $class;
}

# Add a path to the set and log it, unless it is already present.
sub add {
    my ($self, $path) = @_;
    if (!exists $self->{paths}{$path}) {
        $self->{paths}{$path} = undef;
        print { $self->{fd} } "+" . $path . "\n";
        $self->{nlines}++;
        $self->{changed} = 1;
    }
    undef
}

# Remove a path from the set and log it, if it is present.
sub remove {
    my ($self, $path) = @_;
    if (exists $self->{paths}{$path}) {
        delete $self->{paths}{$path};
        print { $self->{fd} } "-" . $path . "\n";
        $self->{nlines}++;
        $self->{changed} = 1;
    }
    undef
}

# Commit the changes made since the last save.
sub save {
    my ($self) = @_;
    return undef unless $self->{changed};
    my $fd = $self->{fd};
    my @keys = keys %{ $self->{paths} };
    if ($self->{nlines} - @keys > 5000) {
        # Compact: the log has over 5000 more lines than live entries, so
        # write a fresh log to a temporary file in the same directory and
        # rename it over the old one.
        close($fd);
        my $bkpdir = dirname($self->{dbname});
        my ($tmp, $bkpname) = tempfile(DIR => $bkpdir, SUFFIX => ".tmp");
        for (@keys) {
            print {$tmp} "+" . $_ . "\n" or die "cannot write $bkpname: $!";
        }
        print {$tmp} "c\n" or die "cannot write $bkpname: $!";
        close($tmp) or die "cannot close $bkpname: $!";
        move($bkpname, $self->{dbname})
            or die "cannot rename $bkpname => $self->{dbname}: $!";
        open(my $new, "+>>", $self->{dbname})
            or die "cannot reopen $self->{dbname}: $!";
        $self->{fd} = $new;
        $self->{nlines} = @keys + 1;    # one "+" line per entry plus the commit
    } else {
        print {$fd} "c\n" or die "cannot write $self->{dbname}: $!";
        $self->{nlines}++;

        # Flush Perl's buffer to the OS: enabling autoflush on the selected
        # handle flushes it immediately. This is not an fsync.
        my $previous_default = select($fd);
        $| = 1;
        $| = 0;
        select($previous_default);
    }
    $self->{changed} = 0;
    undef
}
1;