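Perhaps this example will give you some ideas for alternative strategies. In particular, you might be able to combine the idea in index_file
with Zoul's suggestion of seeking to a position before passing the file handle to XML::Twig.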
use strict;
use warnings;
# Index the XML file, storing the start and end byte offsets
# of each class in the document. You pay this cost only once.
# Assumes every <class ...> opening tag begins a line.
sub index_file {
    local @ARGV = (shift);
    my (%index, $prev);
    while (<>) {
        if (/^<class name=(\w+)>/) {
            my $start = tell() - length();
            $index{$1} = { start => $start, end => undef };
            # A section ends where the next one starts (exclusive offset).
            $index{$prev}{end} = $start if defined $prev;
            $prev = $1;
        }
        # The last section runs to the end of the file.
        $index{$prev}{end} = tell if defined $prev and eof;
    }
    return \%index;
}
# Use the index to retrieve the XML for a particular class.
# This allows you to jump quickly to any section of interest.
# It assumes that the sections of the XML document are small enough
# to be held in memory.
sub get_section {
    my ($file_name, $class_name, $index) = @_;
    my $ind = $index->{$class_name}
        or die "No class named '$class_name' in index";
    open(my $fh, '<', $file_name) or die "Cannot open $file_name: $!";
    seek($fh, $ind->{start}, 0) or die "Cannot seek in $file_name: $!";
    read($fh, my $xml_section, $ind->{end} - $ind->{start});
    close $fh;
    return $xml_section;
}
# Example usage.
sub main {
    my ($file_name) = @_;
    my $index = index_file($file_name);
    for my $cn (keys %$index) {
        # Process only sections of interest.
        next unless $cn eq 'math' or $cn eq 'english';
        my $xml = get_section($file_name, $cn, $index);
        # Pass off to XML::Twig or whatever.
        print $xml;
    }
}
main(@ARGV);
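If you would rather hand the parser a file handle than a string, as in Zoul's suggestion, one option is to wrap the slice in an in-memory file handle. The following is only a rough sketch: section_handle is a hypothetical helper name (it is not part of the code above), and it assumes $start and $end are byte offsets with $end exclusive.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): read the bytes
# [$start, $end) of $file_name and return a read-only in-memory
# file handle positioned at the beginning of that slice.
sub section_handle {
    my ($file_name, $start, $end) = @_;
    open(my $fh, '<', $file_name) or die "Cannot open $file_name: $!";
    seek($fh, $start, 0) or die "Cannot seek in $file_name: $!";
    my $want = $end - $start;
    my $got  = read($fh, my $buf, $want);
    die "Short read from $file_name"
        unless defined $got and $got == $want;
    close $fh;
    # The handle keeps its own reference to $buf, so the data
    # stays alive after this sub returns.
    open(my $mem, '<', \$buf) or die "Cannot open in-memory handle: $!";
    return $mem;
}
```

You would call it with the offsets stored in the index, for example section_handle($file_name, $ind->{start}, $ind->{end}), and then pass the result to the parser. I have not checked how XML::Twig behaves with in-memory handles, so test that before relying on it. If it gives trouble, passing the string from get_section works just as well.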
2010-08-02 13:17:15
FMc
Very clever solution. I'll try this approach and see how it performs. – user399517 2010-08-02 20:14:38