2012-12-15

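How to merge columns from two different files in Perl

I have written the following Perl code to read a text file (a1.txt) and average the timestamps. I would like to read two files (a1.txt and a2.txt) at the same time and merge all the columns from both files.

The code below can only read one file at a time. Please help me modify it so that it prints the results in the format shown below.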

a1.txt

PERFORMANCE TESTING 


------------------------------------------------------------------- 
PERF_SMK_OCUS_50 Version P-20-17 
------------------------------------------------------------------- 
300_wireframe_view_redraws_(GR) 00:01:56 

80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:51 

3_hidden_view_redraws_(GR) 00:01:35 

6_Fast_HLR_activations_(CP) 00:01:10 

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:42 

2_shaded_mouse_spins_(GR) 00:00:21 

270_shaded_view_redraws_(GR) 00:01:39 
------------------------------------------------------------------- 

**************************************************** 
**************************************************** 
------------------------------------------------------------------- 
PERF_SMK_OCUS_50 Version P-20-17 
------------------------------------------------------------------- 
300_wireframe_view_redraws_(GR) 00:01:56 

80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:51 

3_hidden_view_redraws_(GR) 00:01:35 

6_Fast_HLR_activations_(CP) 00:01:09 

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:42 

2_shaded_mouse_spins_(GR) 00:00:20 

270_shaded_view_redraws_(GR) 00:01:39 
------------------------------------------------------------------- 

**************************************************** 
**************************************************** 
------------------------------------------------------------------- 
PERF_SMK_OCUS_50 Version P-20-17 
------------------------------------------------------------------- 
300_wireframe_view_redraws_(GR) 00:01:55 

80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50 

3_hidden_view_redraws_(GR) 00:01:34 

6_Fast_HLR_activations_(CP) 00:01:09 

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:40 

2_shaded_mouse_spins_(GR) 00:00:21 

270_shaded_view_redraws_(GR) 00:01:35 
------------------------------------------------------------------- 

**************************************************** 
**************************************************** 

a2.txt

PERFORMANCE TESTING 

------------------------------------------------------------------- 
PERF_SMK_OCUS_50 Version P-20-17 
------------------------------------------------------------------- 
80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50 

3_hidden_view_redraws_(GR) 00:01:37 

6_Fast_HLR_activations_(CP) 00:01:12 

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:43 

2_shaded_mouse_spins_(GR) 00:00:21 

270_shaded_view_redraws_(GR) 00:01:35 

240_realtime_rendered_redraws_(GR)_1 00:13:16 
------------------------------------------------------------------- 

**************************************************** 
**************************************************** 
------------------------------------------------------------------- 
PERF_SMK_OCUS_50 Version P-20-17 
------------------------------------------------------------------- 
80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50 

3_hidden_view_redraws_(GR) 00:01:37 

6_Fast_HLR_activations_(CP) 00:01:12 

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:42 

2_shaded_mouse_spins_(GR) 00:00:20 

270_shaded_view_redraws_(GR) 00:01:40 

240_realtime_rendered_redraws_(GR)_1 00:13:14 
------------------------------------------------------------------- 

**************************************************** 
**************************************************** 
------------------------------------------------------------------- 
PERF_SMK_OCUS_50 Version P-20-17 
------------------------------------------------------------------- 
80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50 

3_hidden_view_redraws_(GR) 00:01:37 

6_Fast_HLR_activations_(CP) 00:01:12 

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:44 

2_shaded_mouse_spins_(GR) 00:00:20 

270_shaded_view_redraws_(GR) 00:01:40 

240_realtime_rendered_redraws_(GR)_1 00:13:24 
------------------------------------------------------------------- 

**************************************************** 
**************************************************** 

Desired output:

> Test Cases         a1.txt timestamp (hh:mm:ss)  a2.txt(hh:mm:ss)  delta (a1 -a2)(hh:mm:ss) 
>---------------------------------------------------------------------------------------------------------------- 
>240_realtime_rendered_redraws_(GR)_1   N/A       00:13:18    N/A 

> 3_hidden_view_redraws_(GR)      00:01:34      00:01:37   -00:00:03 

> 270_shaded_view_redraws_(GR)     00:01:37      00:01:38   -00:00:01 

> 120_hidden_view_redraws_with_Fast_HLR_(GR)  00:00:41      00:00:43   -00:00:02 

> 300_wireframe_view_redraws_(GR)    00:01:55      N/A     N/A 

> 2_shaded_mouse_spins_(GR)      00:00:20      00:00:20   00:00:00 

> 6_Fast_HLR_activations_(CP)     00:01:09      00:01:12   -00:00:03 

> 80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50      00:00:50   00:00:00 

My code:

my %retrieve; 
my $count = 0; 

my $file1 = 'a1.txt'; 

open (R, $file1) or die ("Could not open $file1!"); 

while (<R>) { 

    next unless /^*Retrieve_generic_/ || 
       /^*Retrieve_assembly_1_/ || 
       /^*Retrieve_assembly_2_/ || 
       /^*300_wireframe_view_/ || 
       /^*80_wireframe_view_/ || 
       /^*3_hidden_view_/ || 
       /^*Fast_HLR_/ || 
       /^*120_hidden_view_/ || 
       /^*shaded_view_/ || 
       /^*shaded_mouse_/ || 
       /^*realtime_rendered_/; 
    $count++; 
    my ($retrieve, $time) = split; 
    my ($h, $m, $s) = split ':', $time; 
    $retrieve{$retrieve} += $h * 3600 + $m * 60 + $s; 

} 
close(R); 

for my $retrieve (keys %retrieve) { 

    my $hms = secondsToHMS($retrieve{$retrieve}/(3)); 
    print "$retrieve\t$hms\n" if defined $hms; 
} 

# For seconds < 86400, else undef returned 

sub secondsToHMS { 
    my $seconds = $_[0]; 
    return undef if $seconds >= 86400; 

    my $h = int $seconds/3600; 
    my $m = int($seconds - $h * 3600)/60; 
    my $s = $seconds % 60; 

    return sprintf('%02d:%02d:%02d', $h, $m, $s); 
} 
+0

Cross-posted to PerlMonks: http://www.perlmonks.org/?node_id=1008972 – DavidO

Answers

0

Try...

#!/usr/bin/perl -w 

use strict; 

sub t2i { 
    my @v=split(":",$_[0]); 
    return $v[0]*3600+$v[1]*60+$v[2]; 
}; 
sub i2t { 
    return sprintf "%02d:%02d:%02d", $_[0]/3600,$_[0]/60%60,$_[0]%60; 
}; 
my %hash; 

foreach my $file (qw|a1 a2|) { 
    open my $fh,"<".$file.".txt" or die; 
    while (<$fh>) { 
    $hash{$1}{$file}=t2i($2) if 
     /^(\d+_\S+_\S+_\S+)\s(\d+:\d+:\d+)/; 
    }; 
    close $fh; 
}; 
map { 
    printf "%-50s %s %s %s\n", $_, 
     i2t($hash{$_}{'a1'}), i2t($hash{$_}{'a2'}), 
     i2t($hash{$_}{'a1'} - $hash{$_}{'a2'}) if 
     defined($hash{$_}{'a1'}) && defined($hash{$_}{'a2'}); 
} keys %hash; 

which gives:

80_wireframe_view_redraws_with_DATUMS_on_(GR)  00:00:50 00:00:50 00:00:00 
2_shaded_mouse_spins_(GR)       00:00:21 00:00:20 00:00:01 
270_shaded_view_redraws_(GR)      00:01:35 00:01:40 00:00:55 
3_hidden_view_redraws_(GR)       00:01:34 00:01:37 00:00:57 
120_hidden_view_redraws_with_Fast_HLR_(GR)   00:00:40 00:00:44 00:00:56 
6_Fast_HLR_activations_(CP)      00:01:09 00:01:12 00:00:57 

Or, sorted and better split up:

#!/usr/bin/perl -w 

use strict; 
my %joinHash; 
my %files=('a'=>'a1.txt','b'=>'a2.txt'); 

sub readFile { 
    open my $fh,"<".$files{$_[0]} or die; 
    while (my $line=<$fh>) { 
    $joinHash{$1}{$_[0]}=timeToInteger($2) if 
     $line =~ /^(\d+_\S+_\S+_\S+)\s(\d+:\d+:\d+)/; 
    }; 
    close $fh; 
}; 
sub timeToInteger { 
    my ($hour,$mins,$secs)=split(":",$_[0]); 
    return $hour*3600+$mins*60+$secs; 
}; 
sub integerToTime { 
    return sprintf "%02d:%02d:%02d", $_[0]/3600,$_[0]/60%60,$_[0]%60; 
}; 

foreach my $fileKey (keys %files) { readFile $fileKey }; 

map { 
    my ($aVal,$bVal)=(0,0); 
    $aVal=$joinHash{$_}{'a'} if defined $joinHash{$_}{'a'}; 
    $bVal=$joinHash{$_}{'b'} if defined $joinHash{$_}{'b'}; 
    printf "%-50s %s %s %s\n", $_, 
     integerToTime($aVal), integerToTime($bVal), 
     integerToTime($aVal-$bVal); 
} sort { 
    (my $x=$a)=~s/_.*$//g; 
    (my $y=$b)=~s/_.*$//g; 
    $x<=>$y 
} keys %joinHash; 

which gives numerically sorted output (missing values are filled with zero):

2_shaded_mouse_spins_(GR)       00:00:21 00:00:20 00:00:01 
3_hidden_view_redraws_(GR)       00:01:34 00:01:37 00:00:57 
6_Fast_HLR_activations_(CP)      00:01:09 00:01:12 00:00:57 
80_wireframe_view_redraws_with_DATUMS_on_(GR)  00:00:50 00:00:50 00:00:00 
120_hidden_view_redraws_with_Fast_HLR_(GR)   00:00:40 00:00:44 00:00:56 
240_realtime_rendered_redraws_(GR)_1    00:00:00 00:13:24 00:47:36 
270_shaded_view_redraws_(GR)      00:01:35 00:01:40 00:00:55 
300_wireframe_view_redraws_(GR)     00:01:55 00:00:00 00:01:55 
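
A note on the delta column above: rows such as 3_hidden_view_redraws_(GR) show 00:00:57 for 00:01:34 minus 00:01:37 because a negative number of seconds is passed straight to the formatter, and Perl's `%` with a positive right operand returns a non-negative result (`-3 % 60` is 57). The following is a minimal sketch of a sign-aware formatter; the name `format_delta` is hypothetical and not part of the answer.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer): format a possibly negative
# number of seconds as [-]hh:mm:ss by formatting the absolute value and
# prepending the sign.
sub format_delta {
    my ($seconds) = @_;
    my $sign = $seconds < 0 ? '-' : '';
    my $abs  = abs int $seconds;
    return sprintf '%s%02d:%02d:%02d',
        $sign, int($abs / 3600), int($abs / 60) % 60, $abs % 60;
}

# 00:01:34 minus 00:01:37 is -3 seconds.
print format_delta(94 - 97), "\n";   # -00:00:03
print format_delta(5), "\n";         # 00:00:05
```

The same approach should apply to any formatter that otherwise feeds a negative value to `%`.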

Edit 3: a fully usable tool!

It can now be run with the files as arguments, plus a few switches to control sorting:

#!/usr/bin/perl -w 
# Demo of parsing via hash variable 
# using Getopt and different sort methods 
# (C) 2012 F-Hauri.ch - Use, copy , distribute or modify via License LGPL V3. 

use strict; 
use Getopt::Std; 

my $formatString="> %-45s%-20s%-20s%s\n"; 
my @files=qw|a1.txt a2.txt|; 
my %opt; 
my %joinHash; 

sub usage { 
    print <<eousage ; 
Usage: $0 [-a|-b|-r|-c|-n] [file1] [file2] 
    -a Sort by file A times 
    -b Sort by file B times 
    -r Sort by result times 
    -c Sort alphabeticaly by case name 
    -C Sort alphabeticaly by case name (Case insensitive) 
    -n Sort numericaly by case num (default) 
    -R Reverse sort order 
    file1 and file2 are by default: '$files[0]' and '$files[1]'. 
eousage 
exit 0; 
} 
sub mydie { 
    printf STDERR "Error: %s\n",$_[0]; 
    usage(); 
} 
sub readFile { 
    open my $fh,"<".$files[$_[0]] or mydie "Can't open '$files[$_[0]]'."; 
    while (my $line=<$fh>) { 
    $joinHash{$1}[$_[0]]=timeToInt($2) if 
     $line =~ /^(\d+_\S+_\S+_\S+)\s(\d+:\d+:\d+)/; 
    }; 
    close $fh; 
}; 
sub timeToInt { 
    my ($hour,$mins,$secs)=split(":",$_[0]); 
    return $hour*3600+$mins*60+$secs; 
}; 
sub intToTime { 
    my $sign=' '; 
    $sign='-' if $_[0] < 0; 
    return sprintf "%s%02d:%02d:%02d", $sign, $_[0]/3600,$_[0]/60%60,$_[0]%60; 
}; 
sub getJoined { 
    # $_0 = caseName, $_1 = filenr (0,1) or result (2), $_2 = flag: toNumber 
    my $asNumber=$_[2]; 
    my $default=do{$asNumber ? 9e9 : ' N/A' }; 
    return map { getJoined($_[0],$_,$asNumber) } (0..2) unless defined $_[1]; 
    my $index =$_[1]; 
    my @crtLine=@{$joinHash{$_[0]}}; 
    return do { defined $crtLine[$index] ? 
       do { $asNumber ? 
         $crtLine[$index] : intToTime($crtLine[$index]) } 
      : $default } if $index < 2; 
    return $default unless defined($crtLine[0]) && defined($crtLine[1]); 
    return do { $asNumber ? $crtLine[0] - $crtLine[1] : 
       intToTime($crtLine[0] - $crtLine[1]) }; 
} 
sub sortByOpt { 
    my ($x,$y)=@_; 
    if ($opt{'c'} || $opt{'C'}) {  # sort by Case name 
    $x =~ s/^\d+_//g; $y =~ s/^\d+_//g; 
    if ($opt{'C'}) { 
     $x=~tr|a-z|A-Z|; 
     $y=~tr|a-z|A-Z|; 
    }; 
    ($y,$x)=($x,$y) if $opt{'R'}; 
    return $x cmp $y; 
    } elsif ($opt{'a'}||$opt{'b'}||$opt{'r'}) { # sort by times 
    my $abr=0;        # default to `a` 
    $abr=1 if $opt{'b'}; 
    $abr=2 if $opt{'r'}; 
    $x = getJoined($x,$abr,1); 
    $y = getJoined($y,$abr,1); 
    } else {    # sort numericaly by case number 
    $x =~ s/_.*$//g; $y =~ s/_.*$//g; 
    }; 
    ($y,$x)=($x,$y) if $opt{'R'}; 
    return $x<=>$y; 
} 

getopts('abCchnRr',\%opt) or mydie 'Unknow option.'; 
usage if ($opt{'h'}); 

foreach my $fileKey (0..1) { 
    if (defined($ARGV[$fileKey])) { 
    mydie 'Arg "'.$ARGV[$fileKey].'" is not a file.' unless 
     -f $ARGV[$fileKey]; 
    $files[$fileKey]=$ARGV[$fileKey]; 
    }; 
    readFile $fileKey 
}; 

my @fileNames=map { (my $name=$_) =~ s/\.txt$//; $name } @files; 
my $headLine=sprintf $formatString, 'Test Cases', 
    map {' '.$_.'(hh:mm:ss)'} @fileNames, 'delta ('.join("-",@fileNames).')'; 
print $headLine.('-' x (length($headLine) - 1))."\n"; 

map { 
    printf $formatString, $_, getJoined($_); 
} sort { sortByOpt($a,$b) } keys %joinHash; 

This gives a tool with the following usage:

Usage: ./mycode.pl [-a|-b|-r|-c|-n] [file1] [file2] 
    -a Sort by file A times 
    -b Sort by file B times 
    -r Sort by result times 
    -c Sort alphabeticaly by case name 
    -C Sort alphabeticaly by case name (Case insensitive) 
    -n Sort numericaly by case num (default) 
    -R Reverse sort order 
    file1 and file2 are by default: 'a1.txt' and 'a2.txt'. 

For example:

./mycode.pl -RC d1.txt d2.txt 
> Test Cases         d1(hh:mm:ss)  d2(hh:mm:ss)  delta (d1-d2)(hh:mm:ss) 
--------------------------------------------------------------------------------------------------------------- 
> 80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50   00:00:50   00:00:00 
> 300_wireframe_view_redraws_(GR)    N/A     00:01:55   N/A 
> 270_shaded_view_redraws_(GR)     00:01:40   00:01:35   00:00:05 
> 2_shaded_mouse_spins_(GR)      00:00:20   00:00:21   -00:00:59 
> 240_realtime_rendered_redraws_(GR)_1   00:13:24   N/A     N/A 
> 6_Last_HLR_activations_(CP)     00:01:12   00:01:09   00:00:03 
> 120_hidden_view_redraws_with_Last_HLR_(GR) 00:00:44   00:00:40   00:00:04 
> 3_hidden_view_redraws_(GR)     00:01:37   00:01:34   00:00:03 

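Note: I copied a1.txt to d2.txt and a2.txt to d1.txt, and (with sed) applied s/Fast/Last/ so that some case names start with an uppercase letter that comes later in the alphabet than the lowercase first letter of another name...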

+3

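For clarity and maintainability, it does not take much longer to give variables and subroutines meaningful names, especially when you are handing code to an OP and are not golfing. Also, why are you [using `map` in void context](http://stackoverflow.com/questions/4174492/in-perl-is-it-appropriate-to-use-map-in-void-context-instead-of-a-foreach-loop/4174644#4174644)? – Kenosis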

+2

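It doesn't have to be *teaching code*. Self-explanatory code is easier to maintain, whether by someone else or by you, and if you need help with it, it pays to make it as readable as possible. One of the best ways to do that is to use good identifiers. – Borodin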
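
To illustrate the point raised in these comments: a `map` that is called only for its side effects can be written as a plain `for` loop, which states the intent directly and needs no discarded return list. The sketch below uses made-up data and a hypothetical helper `report_lines`; it is not taken from the answer.

```perl
use strict;
use warnings;

# Made-up sample data: test case name => seconds.
my %seconds_for = (
    '2_shaded_mouse_spins_(GR)'  => 21,
    '3_hidden_view_redraws_(GR)' => 94,
);

# Hypothetical helper: build the report lines with an explicit loop
# rather than calling map only for its side effects.
sub report_lines {
    my ($data) = @_;
    my @lines;
    for my $case (sort keys %$data) {
        push @lines, sprintf '%-30s %d', $case, $data->{$case};
    }
    return @lines;
}

print "$_\n" for report_lines(\%seconds_for);
```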


2

Here is how I would go about it.

#!/usr/bin/perl -Tw 

use strict; 
use warnings; 
use English qw(-no_match_vars $OS_ERROR); 

die 'expecting two filenames as arguments' 
    if @ARGV != 2; 

my @ids; 

my %time_for; 

for my $filename (@ARGV) { 

    my $id; 

    if ($filename =~ m{\A (.+? /)?([^/.]+?)([.] \w+) \z}xms) { 
     my $path = $1 || ""; 
     my $name = $2; 
     my $ext = $3 || ""; 
     $id  = $name; 
     $filename = "$path$name$ext"; 
     push @ids, $id; 
    } 

    die "cant parse file ID from $filename" 
     if !$id; 

    die "cant find $filename" 
     if !stat $filename; 

    open my $fh, '<', "$filename" 
     or die "open $filename: $OS_ERROR"; 

    while (my $line = <$fh>) { 

     if ($line =~ m{\A (\w+ \(\w+ \) \w*) \s+ (\d+:\d+:\d+) }xms) { 

      my ($subject, $hms) = ($1, $2); 

      my $seconds = hms_to_sec($hms); 

      $time_for{$subject}->{$id} ||= $seconds; 

      $time_for{$subject}->{$id} 
       = ($seconds + $time_for{$subject}->{$id})/2; 
     } 
    } 

    close $fh 
     or die "close $filename: $OS_ERROR"; 
} 

print <<"HEAD"; 
> Test Cases          $ids[0] timestamp (hh:mm:ss)   $ids[1] (hh:mm:ss)   delta ($ids[0]-$ids[1])(hh:mm:ss) 
> ------------------------------------------------------------------------------------------------------------------------------ 
HEAD 

for my $subject (sort keys %time_for) { 

    my ($a1, $a2) = @{ $time_for{$subject} }{@ids}; 

    my $delta = defined $a1 && defined $a2 ? $a1 - $a2 : undef; 

    printf "> % -46s % -32s % -21s %s\n\n", 
     $subject, 
     sec_to_hms($a1), 
     sec_to_hms($a2), 
     sec_to_hms($delta); 
} 

sub hms_to_sec { 
    my ($h, $m, $s) = map { int $_ } map { $_ ? $_ : 0 } split /:/, $_[0]; 
    return $h * 3_600 + $m * 60 + $s; 
} 

sub sec_to_hms { 
    my ($s) = @_; 

    return 'N/A' 
     if !defined $s || $s > 86_400; 

    my $sign = ' '; 

    if ($s < 0) { 
     $sign = '-'; 
     $s *= -1; 
    } 

    my $h = int($s / 3_600); 
    my $m = int(($s - $h * 3_600) / 60); 

    return sprintf '%s%02d:%02d:%02d', $sign, $h, $m, $s % 60; 
} 

The output comes out like this.

> Test Cases          a1.txt timestamp (hh:mm:ss)  a2.txt(hh:mm:ss)  delta (a1 -a2)(hh:mm:ss) 
> ------------------------------------------------------------------------------------------------------------------------------ 
> 120_hidden_view_redraws_with_Fast_HLR_(GR)  00:00:41       00:00:43    -00:00:02 

> 240_realtime_rendered_redraws_(GR)_1   N/A        00:13:19    -00:13:19 

> 270_shaded_view_redraws_(GR)     00:01:37       00:01:38    -00:00:01 

> 2_shaded_mouse_spins_(GR)      00:00:20       00:00:20    00:00:00 

> 300_wireframe_view_redraws_(GR)     00:01:55      N/A     00:01:55 

> 3_hidden_view_redraws_(GR)      00:01:34       00:01:37    -00:00:02 

> 6_Fast_HLR_activations_(CP)      00:01:09       00:01:12    -00:00:02 

> 80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50       00:00:50    00:00:00 

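This assumes that filenames use / as the path separator. (A properly portable implementation could be the subject of another question.)

You can invoke it like this: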

./merge_columns.pl /some/path/a1.txt /another/path/a2.txt 

I hope this helps.
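
A remark on the averaging in the loop above: each new reading is folded into the stored value with (new + old)/2, which weights later runs more heavily than a plain mean. For the three a2.txt readings of 240_realtime_rendered_redraws_(GR)_1 (00:13:16, 00:13:14, 00:13:24) this yields 00:13:19, whereas the desired output in the question lists 00:13:18. Below is a minimal sketch of an arithmetic mean that keeps a sum and a count; the helper name `mean_seconds` and its input shape are assumptions made for illustration, not part of the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: accumulate a sum and a count per test case,
# then divide once at the end to get the arithmetic mean.
sub mean_seconds {
    my (@readings) = @_;          # list of [name, seconds] pairs
    my (%sum, %count);
    for my $reading (@readings) {
        my ($name, $seconds) = @$reading;
        $sum{$name}   += $seconds;
        $count{$name} += 1;
    }
    my %mean;
    $mean{$_} = $sum{$_} / $count{$_} for keys %sum;
    return \%mean;
}

# The three a2.txt readings for one test case: 13:16, 13:14, 13:24.
my $name = '240_realtime_rendered_redraws_(GR)_1';
my $mean = mean_seconds([$name, 796], [$name, 794], [$name, 804]);
printf "%s %d\n", $name, $mean->{$name};   # 798 seconds, i.e. 00:13:18
```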

+1

I want to pass a1.txt and a2.txt (the two files are in different directories) to the code above from the command line. – Amit

+0

I also need to change the following line, my ($a1, $a2) = @{ $time_for{$subject} }{qw(a1 a2)};, to get a1.txt and a2.txt from the command line. – Amit

+0

I mean that if I want to process files other than a1.txt or a2.txt, this will not work. – Amit
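
One possible way to address these comments, sketched under the assumption that the column labels should simply follow whatever paths are given on the command line: the core module File::Basename can split each path without hard-coding / as the separator, and the resulting names can then be used as hash keys in place of the literal a1 and a2. The helper name `id_for` and the example paths are hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: derive a short column label from any path by
# dropping the directory part and the last extension.
sub id_for {
    my ($path) = @_;
    my ($name, $dirs, $suffix) = fileparse($path, qr/\.[^.]*/);
    return $name;
}

my @ids = map { id_for($_) } '/some/path/a1.txt', '/another/path/b7.log';
print "@ids\n";   # a1 b7
```

With labels derived this way, a slice such as @{ $time_for{$subject} }{@ids} no longer depends on the files being named a1.txt and a2.txt.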
