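Perl: find similar elements in two arrays
I want to retrieve the elements of the @amplicon_exon array that contain an element similar to one in the @failedamplicons array. Each element of @failedamplicons is unique and can match only a single element of @amplicon_exon. I have tried two nested for loops, but I get duplicate values. Is there a better way to find and retrieve the similar values in the two arrays?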
@failedamplicons: example:
OCP1_FGFR3_8.87
OCP1_AR_14.89
@amplicon_exon: example:
TEST_Focus_ERBB2_2:22:ERBB2:GENE_ID=ERBB2;PURPOSE=CNV,Hotspot;CNV_ID=ERBB2;CNV_HS=1
OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1
OCP1_CDK6_14:intron:CDK6:GENE_ID=CDK6;PURPOSE=CNV;CNV_ID=CDK6;CNV_HS=1
Here is the code with the two for loops:
my $i = 0;
my $j = 0;
for ($i = 0; $i < @amplicon_exon; $i++) {
    for ($j = 0; $j < @failedamplicons; $j++) {
        my $fail_amp = (split /\./, $failedamplicons[$j])[0];
        #print "the failed amp before match is $fail_amp\n";
        if (index($amplicon_exon[$i], $fail_amp) != -1) {
            #print "the amplicon exon that matches $amplicon_exon[$i] and sample is $sample_id\n";
            print "the failed amp that matches $fail_amp and sample is $sample_id\n";
            my @parts = split /:/, $amplicon_exon[$i];
            my $exon_amp = $parts[1];
            next unless $parts[3] =~ /Hotspot/; #includes only Hotspot amplicons
            my $gene_res = $parts[2];
            my $depth = (split /\./, $failedamplicons[$j])[1];
            my @total_amps = (
                $run_name, $sample_id, $gene_res, $depth, $fail_amp, $run_date, $matrix_status
            );
            my $lines = join "\t", @total_amps;
            push(@finallines, $lines);
        }
    }
}
Can you give a precise criterion for what "_similar_" means? – zdim
The amplicon_exon element must contain the complete string that precedes the "." in the failedamplicons element. For example: 'OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1' contains 'OCP1_FGFR3_8'. Thanks – user3781528
@user3781528: I have tidied up your Perl code so that I can read it. Please post readable code in future. – Borodin
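Given the criterion stated in the comments, one possible approach is a hash lookup instead of nested loops. The following is only a minimal sketch, not a confirmed fix. It assumes that the amplicon name is always the first colon-separated field of each @amplicon_exon element and that an exact match on that name is what is wanted (a substring test with index would, for instance, also match a hypothetical 'OCP1_FGFR3_80'). The names %exon_for and @matches are made up for illustration, and the Hotspot filter and the run/sample fields from the question are left out.

    use strict;
    use warnings;

    # Sample data copied from the question.
    my @failedamplicons = ('OCP1_FGFR3_8.87', 'OCP1_AR_14.89');
    my @amplicon_exon = (
        'TEST_Focus_ERBB2_2:22:ERBB2:GENE_ID=ERBB2;PURPOSE=CNV,Hotspot;CNV_ID=ERBB2;CNV_HS=1',
        'OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1',
        'OCP1_CDK6_14:intron:CDK6:GENE_ID=CDK6;PURPOSE=CNV;CNV_ID=CDK6;CNV_HS=1',
    );

    # Assumption: the amplicon name is the first colon-separated field.
    # Index every amplicon_exon line by that name (one pass over the array).
    my %exon_for;
    for my $line (@amplicon_exon) {
        my ($name) = split /:/, $line, 2;
        $exon_for{$name} = $line;
    }

    # Look up each failed amplicon by the text before its first ".".
    my @matches;
    for my $failed (@failedamplicons) {
        my ($name, $depth) = split /\./, $failed, 2;
        next unless exists $exon_for{$name};    # no matching amplicon_exon line
        push @matches, [ $name, $depth, $exon_for{$name} ];
    }

    # One tab-separated line per match: name, depth, full amplicon_exon line.
    print join("\t", @$_), "\n" for @matches;

Because each failed amplicon is looked up exactly once, it can contribute at most one row to @matches. With the sample data only OCP1_FGFR3_8 matches; OCP1_AR_14 has no corresponding @amplicon_exon entry.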