2012-09-11 57 views
-3

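Generating sets of arrays in Perl

The Perl script below cuts the input sequence at each "E", skips the specific "E" positions listed in @nobreak, and produces an array of fragments as output. What I want instead is a script that produces a set of such arrays, one for each position in @nobreak that is skipped: set 1 contains the fragments obtained after skipping the "E" at 37, set 2 those obtained after skipping the "E" at 45, and so on. The script I wrote, shown below, does not work correctly. I want the output to contain 4 different arrays, each generated by taking one position from @nobreak at a time. Please help!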

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN'; 

print "Results of 1-Missed Cleavage:\n\n"; 

my @nobreak = (37, 45, 57, 59); 
{ 
    @nobreak = map { $_ - 1 } @nobreak; 

    foreach (@nobreak) { 

     substr($s, $_, 1) = "\0"; 
    } 
    my @a = split /E(?!P)/, $s; 
    $_  =~ s/\0/E/g foreach (@a); 
    $result = join "E,", @a; 
    @final = split /,/, $result; 
    print "@final\n"; 
} 
+1

Adding the output you expect would help in deciphering your request. @ysth might be right, but I'm not sure... – pmakholm

Answers

1

Loop over @nobreak?

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN'; 
print "Results of 1-Missed Cleavage:\n\n"; 
my @nobreak = (37,45,57,59); 
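for my $nobreak (@nobreak) {
    substr($s, $nobreak-1, 1) = "\0";   # mask the one E that must not be cut
    my @a = split(/E(?!P)/, $s);        # split consumes each E it cuts at
    substr($s, $nobreak-1, 1) = 'E';    # restore $s for the next iteration
    $_ =~ s/\0/E/g foreach (@a);        # unmask the E inside the fragments
    my $result = join("E,", @a);        # put the consumed E back at each cut
    my @final = split(/,/, $result);
    print "@final\n";
}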
2

To split the string at each "E" without consuming it in the process, use a lookbehind:

my @final = split /(?<=E)/, $str; 

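To assert finer control over which "E" to split on (something you left unspecified), you would change the regex accordingly.

If you need a variable-length lookbehind, you can use \K ...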
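
To make the difference concrete, here is a minimal sketch comparing a consuming split with the lookbehind form. The short input strings and the variable names `@consumed`, `@kept`, and `@no_p` are made up for illustration, and the `(?!P)` lookahead is only one possible way to restrict which "E" is used as a cut point.

```perl
use strict;
use warnings;

my $str = 'AEBEC';    # hypothetical short input, not the sequence from the question

# Plain split: each matched E is consumed and disappears from the fragments.
my @consumed = split /E/, $str;

# Lookbehind split: cut at the zero-width point after each E, so the E is kept.
my @kept = split /(?<=E)/, $str;

# Adding a negative lookahead skips any cut point that is followed by P.
my @no_p = split /(?<=E)(?!P)/, 'AEPBEC';

print "@consumed\n";    # A B C
print "@kept\n";        # AE BE C
print "@no_p\n";        # AEPBE C
```

Because both assertions are zero-width, no character of the input is lost, and joining the fragments gives back the original string.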

0

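It looks like you want to split the string after every E character, but not before any P character.

This code will do what you want. It works by changing the E at each offset in @nobreak to an e (which is far better for debugging than "\0") and splitting on /(?<=E)(?!P)/, that is, after an E and not before a P. The e is changed back to an E afterwards using tr/e/E/.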

use strict; 
use warnings; 

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN'; 

print "Results of 1-Missed Cleavage:\n\n"; 

my @nobreak = (37, 45, 57, 59); 

for my $index (@nobreak) { 
    my $ss = $s; 
    substr($ss, $index-1, 1) = 'e'; 
    my @final = split /(?<=E)(?!P)/, $ss; 
    tr/e/E/ for @final; 
    print "$_\n" for @final; 
    print "\n"; 
} 

Output

Results of 1-Missed Cleavage: 

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGE 
RGFFYTPKTRRE 
AE 
DLQVGQVE 
LGGGPGAGSLQPLALE 
GSLQKRGIVE 
QCCTSICSLYQLE 
NYCN 

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVE 
ALYLVCGERGFFYTPKTRRE 
AE 
DLQVGQVE 
LGGGPGAGSLQPLALE 
GSLQKRGIVE 
QCCTSICSLYQLE 
NYCN 

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVE 
ALYLVCGE 
RGFFYTPKTRREAE 
DLQVGQVE 
LGGGPGAGSLQPLALE 
GSLQKRGIVE 
QCCTSICSLYQLE 
NYCN 

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVE 
ALYLVCGE 
RGFFYTPKTRRE 
AEDLQVGQVE 
LGGGPGAGSLQPLALE 
GSLQKRGIVE 
QCCTSICSLYQLE 
NYCN
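
If the four fragment sets need to be kept as data rather than printed straight away, the same loop can store each one under its skipped position. This is only a sketch under that assumption: the `%sets` hash and the `@fragments` name are hypothetical and do not appear in the code above.

```perl
use strict;
use warnings;

my $s = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN';
my @nobreak = (37, 45, 57, 59);

# Hypothetical container: skipped position => reference to that set's fragments.
my %sets;

for my $index (@nobreak) {
    my $ss = $s;                          # work on a copy so $s stays intact
    substr($ss, $index - 1, 1) = 'e';     # mask the one E that must not be cut
    my @fragments = split /(?<=E)(?!P)/, $ss;
    tr/e/E/ for @fragments;               # unmask the E inside its fragment
    $sets{$index} = \@fragments;          # a fresh array each iteration
}

# Report each set in the original order of @nobreak.
for my $index (@nobreak) {
    printf "skip %d: %d fragments, first = %s\n",
        $index, scalar @{ $sets{$index} }, $sets{$index}[0];
}
```

Each value in `%sets` is an independent array reference, so a single set can be read later with, for example, `@{ $sets{45} }`.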