2011-06-10 62 views
1

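I have a file that contains a list of file names, shown below. Some sequence numbers are missing from the list, and I want to select the lines on either side of each missing number.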

TEST_4002_sample11_1_20110531.TXT 
TEST_4002_sample11_2_20110531.TXT 
TEST_4002_sample11_4_20110531.TXT 
TEST_4002_sample11_5_20110531.TXT 
TEST_4002_sample11_6_20110531.TXT 
TEST_4002_sample10_1_20110531.TXT 
TEST_4002_sample10_2_20110531.TXT 
TEST_4002_sample10_4_20110531.TXT 
TEST_4002_sample10_5_20110531.TXT 

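If a number is missing from the sequence in the 4th field, I want to print the file name just before the gap and the file name just after it. For the input above, the desired output is: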

TEST_4002_sample11_2_20110531.TXT 
TEST_4002_sample11_4_20110531.TXT 
TEST_4002_sample10_2_20110531.TXT 
TEST_4002_sample10_4_20110531.TXT 
+1

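Hmm. I really don't understand why you closed this question. It is a real-world programming problem with concrete solutions (as you can see in the answers). Sample input and the desired output are provided. – jm666 2011-06-14 07:19:24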

Answers

1

This awk variant seems to produce the desired output:

awk -F_ '$4>c+1{print p"\n"$0}{p=$0;c=$4}' 
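
The one-liner above compares only the 4th field, so it would also report a pair when a new group happens to start at a number higher than the previous group's last number plus one. A minimal sketch of a group-aware variant follows, assuming the names always look like PREFIX_ID_GROUP_SEQ_DATE.TXT (so field 3 is the group); the function name find_gaps is made up for illustration:

```shell
# find_gaps: hypothetical helper name; reads file names on stdin.
# Assumes field 3 is the group (e.g. sample10) and field 4 is the sequence number.
find_gaps() {
    awk -F_ '
        # Same group as the previous line and the sequence jumped by more than 1
        $3 == g && $4 > c + 1 { print p; print $0 }
        # Remember this line for the next comparison
        { p = $0; g = $3; c = $4 }
    '
}

# Made-up sample input: the second group starts at 8, which must not count as a gap.
find_gaps <<'EOF'
TEST_4002_sample11_6_20110531.TXT
TEST_4002_sample10_8_20110531.TXT
TEST_4002_sample10_9_20110531.TXT
TEST_4002_sample10_11_20110531.TXT
EOF
```

With this input only the sample10_9 / sample10_11 pair should be reported; the original one-liner would also report the sample11_6 / sample10_8 pair.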
+0

That's a very simple Perl script. Thanks for your response. – gyrous 2011-06-14 04:52:01

+0

That's not Perl, it's awk. – Qtax 2011-06-14 05:30:51

0

In Perl, you could do something like this:

use strict; 
use warnings; 

my $prev_line; 
my $prev_val; 

while(<>){ 
    # get the 4th value 
    my $val = (split '_')[3]; 

    # skip if invalid line 
    next if !defined $val; 

    # print if missed sequence 
    if(defined($prev_val) && $val > $prev_val + 1){ 
        print $prev_line . $_;
    } 

    # save for next iteration 
    $prev_line = $_; 
    $prev_val = $val; 
} 

Save it as foo.pl and run it with something like:

cat file.txt | perl foo.pl 

I'm sure this could be shortened a lot. If all the lines are valid, you could use something like this:

perl -n -e '$v=(/[^_]+/g)[3];print"$l$_"if$l&&$v>$p+1;$p=$v;$l=$_' file.txt 

perl -naF_ -e '$v=$F[3];print"$l$_"if$l&&$v>$p+1;$p=$v;$l=$_' file.txt 
+0

Thanks for your reply. – gyrous 2011-06-14 04:51:03

0

As far as I understand what you need, here is a Perl script that does the job:

#!/usr/local/bin/perl 
use strict; 
use warnings; 

my $prev = ''; 
my %seq1; 
while(<DATA>) { 
    chomp; 
    my ($seq1, $seq2) = $_ =~ /^.*?(\d+)_(\d+)_\d+\.TXT$/; 
    $seq1{$seq1} = $seq2 - 1 unless exists $seq1{$seq1}; 
    if ($seq1{$seq1}+1 != $seq2) { 
        print $prev,"\n",$_,"\n"; 
    } 
    $prev = $_; 
    $seq1{$seq1} = $seq2; 
} 


__DATA__ 
TEST_4002_sample11_1_20110531.TXT 
TEST_4002_sample11_2_20110531.TXT 
TEST_4002_sample11_4_20110531.TXT 
TEST_4002_sample11_5_20110531.TXT 
TEST_4002_sample11_6_20110531.TXT 
TEST_4002_sample10_1_20110531.TXT 
TEST_4002_sample10_2_20110531.TXT 
TEST_4002_sample10_4_20110531.TXT 
TEST_4002_sample10_5_20110531.TXT 

Output:

TEST_4002_sample11_2_20110531.TXT 
TEST_4002_sample11_4_20110531.TXT 
TEST_4002_sample10_2_20110531.TXT 
TEST_4002_sample10_4_20110531.TXT 
0

I used glob to get the files I needed (it may be as simple as <TEST_*.TXT>).

use strict; 
use warnings; 

my %last = (name => '', group => '', seq => 0); 

foreach my $file (sort glob('TEST_[0-9][0-9][0-9][0-9]_sample[0-9][0-9]_[0-9]_*.TXT')) { 
    my ($group, $seq) = $file =~ m/(\d{4,}_sample\d+)_(\d+)/; 
    if ($group eq $last{group} && $seq - $last{seq} > 1) { 
        print join("\n", $last{name}, $file, ''); 
    } 
    @last{ qw<name group seq> } = ($file, $group, $seq); 
} 
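
Note that the glob pattern above matches only one-digit sequence numbers, and Perl's default sort compares strings, so if the sequence ever reached 10 the names would sort with 10 before 2. A minimal sketch of ordering the names by group and then numerically by sequence number with the standard sort utility, assuming the same underscore-separated layout (the function name order_names and the sample names are made up for illustration):

```shell
# order_names: hypothetical helper name; reads file names on stdin.
# Sorts by group (field 3) as text, then by sequence number (field 4) as a number,
# using "_" as the field separator.
order_names() {
    sort -t_ -k3,3 -k4,4n
}

# Made-up sample input with a two-digit sequence number.
printf '%s\n' \
    TEST_4002_sample10_10_20110531.TXT \
    TEST_4002_sample10_2_20110531.TXT \
    TEST_4002_sample10_1_20110531.TXT | order_names
```

The sorted list can then be piped into any of the gap checks in this thread that read from standard input.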
1

The simple Perl way:

perl -F_ -lane 'print "$o\n$_" if $F[3]-$n>1;$o=$_;$n=$F[3]' < file 
+0

Thanks for the Perl command. – gyrous 2011-06-14 04:50:48