2015-08-15 93 views

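Calling awk from within a Perl script

I want to read in a file and call awk with the elements of each line (one awk command per line). Right now I use a Perl script to split each line into an array and print the awk command line I want to run. However, I'm sure there is a better way to actually run the awk commands from inside the Perl script.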

The file looks like this:

1 rs78641116 8374297 3374297 13374297 
1 rs34269918 8424984 3424984 13424984 
1 rs533123 29141155 24141155 34141155 
1 rs1498232 30433951 25433951 35433951 

Code:

#! perl -w 

open(my $file, "<", "sim.snps") or die $!; 
while (<$file>) { 
    my @snps=split; 
    print "awk \'\$2>=$snps[3]\&\&\$2<=$snps[4]\{print\$1,\$2,\$3,\$4\}\' \..\/phasing_and_imputation\/1000GP_Phase3_chr$snps[0].legend > $snps[1]\.legend\n" 
    } 

The awk commands it prints:

awk '$2>=3374297&&$2<=13374297{print$1,$2,$3,$4}' ../phasing_and_imputation/1000GP_Phase3_chr1.legend > rs78641116.legend 
awk '$2>=3424984&&$2<=13424984{print$1,$2,$3,$4}' ../phasing_and_imputation/1000GP_Phase3_chr1.legend > rs34269918.legend 
awk '$2>=24141155&&$2<=34141155{print$1,$2,$3,$4}' ../phasing_and_imputation/1000GP_Phase3_chr1.legend > rs533123.legend 
awk '$2>=25433951&&$2<=35433951{print$1,$2,$3,$4}' ../phasing_and_imputation/1000GP_Phase3_chr1.legend > rs1498232.legend 

Does anyone have a solution that actually runs awk, rather than just printing the awk commands?

+5

...you're already using Perl. Why do you want to call awk? Perl can do everything awk can do, in about the same amount of code. –

+0

I'm a Perl newbie. I don't know how to do this simply. – theo4786

+1

First translate each printed 'awk' command into Perl, then fold the result into the 'while' loop. –
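The suggestion in the comment above can be sketched roughly as follows: one awk command of the form `$2>=MIN && $2<=MAX {print $1,$2,$3,$4}` becomes a small Perl filter. The helper name `filter_range` and the sample rows are made up for illustration and are not from the thread. Unlike awk, this sketch explicitly skips lines whose second field is not an integer (such as a header row), which is an assumption about the input.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the first four fields of
# every line whose second field lies in [$min, $max], roughly what
#   awk '$2>=MIN && $2<=MAX {print $1,$2,$3,$4}'
# would print.
sub filter_range {
    my ($in, $min, $max) = @_;
    my @out;
    while (my $line = <$in>) {
        my @f = split ' ', $line;
        # Assumption: skip short lines and non-numeric positions (e.g. a header)
        next unless @f >= 2 && $f[1] =~ /^\d+$/;
        next unless $f[1] >= $min && $f[1] <= $max;
        # Keep at most the first four fields, space-separated
        push @out, join ' ', grep { defined } @f[0 .. 3];
    }
    return @out;
}

# Made-up sample rows standing in for a .legend file
my $sample = <<'END';
id position a0 a1
snpA 3374296 A G
snpB 3374297 C T
snpC 13374297 G A
snpD 13374298 T C
END

# Read the sample through an in-memory filehandle
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
my @kept = filter_range($in, 3374297, 13374297);
close $in;

print "$_\n" for @kept;
```

Once a filter like this works for a single range, the same test can be moved inside the 'while' loop that reads sim.snps, with the output sent to a per-SNP file instead of an array.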

Answers

4

Here is a Perl solution to your problem, as far as I understand it.

#!/usr/bin/env perl 

use strict; 
use warnings; 
use autodie; # avoid a bunch of `or die` clauses 

# First, load the criteria for splitting into the output files 
my %files =(); 

# extra block level wrapping the $ranges file access; file 
# is automatically closed at the end of the block 
{ 
    open my $ranges, '<', 'sim.snps'; 
    while (<$ranges>) {
        (undef, my $key, undef, my ($min, $max)) = split;
        $files{$key} = { min => $min, max => $max };

        # go ahead and open the output file while we're here
        # (keyed by $key, the SNP id read from this line)
        open $files{$key}{fh}, '>', "$key.legend";
    }
} # $ranges filehandle closed here 

# another file-access block 
{ 
    # open the data file 
    open my $data, '<', '../phasing_and_imputation/1000GP_Phase3_chr1.legend'; 

    while (<$data>) {
        # split the data into fields
        my @f = split;

        # loop over the output files and write the relevant parts of this line
        # to the ones that want it
        while (my ($key, $spec) = each %files) {
            if ($f[1] >= $spec->{min} && $f[1] <= $spec->{max}) {
                print { $spec->{fh} } join(' ', @f[0..3]), "\n";
            }
        }
    }
} # data file closed here 

# close the output files 
foreach my $data (values %files) { 
    close $data->{fh}; 
}
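
If you would rather keep awk and have Perl run it instead of printing the commands, one possible approach is to pass the command as a list and read its output through a pipe, so no shell quoting is involved. This is only a sketch: the helper names `awk_args` and `run_to_file` are hypothetical, the paths are copied from the question, and it assumes a Unix-like system with awk on the PATH. The range limits are passed with awk's `-v` option rather than being interpolated into the awk program.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build the awk argument list for one line of sim.snps.
# The legend path follows the pattern used in the question.
sub awk_args {
    my ($chr, $min, $max) = @_;
    return (
        'awk',
        '-v', "min=$min",
        '-v', "max=$max",
        '$2 >= min && $2 <= max { print $1, $2, $3, $4 }',
        "../phasing_and_imputation/1000GP_Phase3_chr$chr.legend",
    );
}

# Hypothetical helper: run a command without a shell and copy its standard
# output into $outfile.
sub run_to_file {
    my ($outfile, @cmd) = @_;
    open my $pipe, '-|', @cmd    or die "Cannot run $cmd[0]: $!";
    open my $out,  '>', $outfile or die "Cannot write $outfile: $!";
    print {$out} $_ while <$pipe>;
    close $out  or die "Cannot close $outfile: $!";
    close $pipe or die "$cmd[0] failed (wait status $?)";
}

# Only run when the input file from the question is present
if (-e 'sim.snps') {
    open my $snps, '<', 'sim.snps' or die "Cannot read sim.snps: $!";
    while (my $line = <$snps>) {
        # Columns: chromosome, SNP id, position (unused), range min, range max
        my ($chr, $id, undef, $min, $max) = split ' ', $line;
        next unless defined $max;    # skip blank or short lines
        run_to_file("$id.legend", awk_args($chr, $min, $max));
    }
    close $snps;
}
```

Note that this starts one awk process per SNP and rereads the legend file each time, whereas the pure-Perl version above reads the data file once.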