我想循環遍歷一個數組,遞增200.我在該200中創建一個子數組,然後將該子數組上的計算移動到下一組200並重復。但是,每當我進行第二次計算時,新的子數組就包含第一個子數組的值。這就像我不能重置或undef
子陣列在for
循環的下一次迭代中重新使用它。Perl用於循環和作用域變量
完整的代碼可以在GitHub:https://github.com/bsima/yeast-TRX
運行的代碼,你需要以下基因組文件在同一目錄的腳本:https://raw.github.com/bsima/yeast-TRX/master/data/bayanus/genome.csv
下面是相關代碼:
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use Statistics::Descriptive;
my $species = "bayanus";
open(SPECIES, "<./genome.csv") || die "Cannot open file: $!\n";
my @text = <SPECIES>;
my $geneNameRe = qr/(eY\w{5}[CW])/;
my $geneRe = qr/,([atgcATGC-]+)/;
foreach my $line (@text) {
chomp($line);
if (defined $line && $line =~ m/e.{4,},[acgtACGT-]+/) {
my $gene = match($geneRe,$line);
my $geneName = match($geneNameRe,$line);
# Smoothing function
#
# This moves through the gene data and counts the position until it arrives
# at the end of the smoothing window (e.g. 200). Then it calculates the average
# of the selected data set and outputs it into the respecive `smooth.csv` file
# in the following format, to be read later by R's graphing functions:
# gene,position,trx.mean,energy.mean
open(my $smooth, ">>./smooth.csv") || die "Cannot open file $!";
my $smoothingWindow = 200;
# Loop through every 200 characters
for (my $smoothing = 0; $smoothing < length($gene); $smoothing=$smoothing+$smoothingWindow) {
my @trxValues;
my @energyScores;
# Loop through every character
for (my $position = 0; $position <= $smoothingWindow; $position++) {
my $dinucleotide = substr($gene,$position,2);
if ($dinucleotide =~ m/[actgACTG]{2}/) {
my $trxValue = trxScore($dinucleotide);
my $energyScore = energyScore($dinucleotide);
#print "trxValue = $trxValue\n";
push @trxValues, $trxValue;
push @energyScores, $energyScore;
}
}
# Now run the calculations and print to SMOOTH
my $trxStat = Statistics::Descriptive::Full->new();
$trxStat->add_data(@trxValues);
my $trxMean = $trxStat->mean();
my $energyStat = Statistics::Descriptive::Full->new();
$energyStat->add_data(@energyScores);
my $energyMean = $energyStat->mean();
print "$geneName $smoothing: TRX Values @trxValues\n";
print "$geneName $smoothing: TRX Mean is $trxMean.\n";
print $smooth $geneName . "," . $smoothing . "," . $trxMean . "," . $energyMean . "\n";
}
close $smooth;
}
}
# This just makes it easy to do regex matches
sub match {
my ($re, $text) = @_;
if ($text =~ $re) {
return $1;
}
}
# @name trxScore
# @description Calculates and returns the TRX value of a given phosphate linkage
# @param $dinucleotide {string} The nucleotide to be checked
# @return {integer} The TRX value
sub trxScore {
my ($dinucleotide) = @_;
my %trxScores = (
qr/(CG)/ => 43,
qr/(CA)/ => 42,
qr/(TG)/ => 42,
qr/(GG)/ => 42,
qr/(CC)/ => 42,
qr/(GC)/ => 25,
qr/(GA)/ => 22,
qr/(TC)/ => 22,
qr/(TA)/ => 14,
qr/(AG)/ => 9,
qr/(CT)/ => 9,
qr/(AA)/ => 5,
qr/(TT)/ => 5,
qr/(AC)/ => 4,
qr/(GT)/ => 4,
qr/(AT)/ => 0,
qr/(.-)/ => 0,
qr/(-.)/ => 0 # These last two are necessary for dealing with missing values in the genomes
);
foreach my $re (keys %trxScores) {
if (match($re,$dinucleotide)) {
return $trxScores{$re};
}
}
return 0;
}
# @name energyScore
# @description Calculates and returns the delta-E value of a given phosphate linkage
# @param $dinucleotide {string} The nucleotide to be checked
# @return {integer} The delta-E value
sub energyScore {
my ($dinucleotide) = @_;
my %energyScores = (
qr/(AA)/ => -18.5,
qr/(AC)/ => -19.0,
qr/(AG)/ => -23.6,
qr/(AU)/ => -15.7,
qr/(CA)/ => -20.0,
qr/(CC)/ => -21.4,
qr/(CG)/ => -26.9,
qr/(CU)/ => -17.2,
qr/(GA)/ => -23.7,
qr/(GC)/ => -22.9,
qr/(GG)/ => -24.3,
qr/(GU)/ => -18.9,
qr/(UA)/ => -19.6,
qr/(UC)/ => -28.2,
qr/(UG)/ => -23.3,
qr/(UU)/ => -15.8,
qr/(.-)/ => 0,
qr/(-.)/ => 0 # These last two are necessary for dealing with missing values in the genomes
);
foreach my $re (keys %energyScores) {
if (match($re,$dinucleotide)) {
return $energyScores{$re};
}
}
return 0;
}
如果運行大型腳本代碼塊的一部分,將打印的@trxValues
內容到終端,你會看到$smoothing
變量遞增正確,所以for
循環的步驟部分沒有任何問題,但@trxValues
的內容不會更改。因此,在第一個for
循環結束和第二個循環開始之間的某個位置,數組@trxValues
未被正確清除。同樣的事情發生在@energyScores
。
對此有何意見?我嘗試在for
循環的末尾使用undef
,並在循環結束時將@trxValues
設置爲零,但兩者均無效。我完全被難住了。
編輯:將代碼塊更改爲可自行運行。
你需要顯示最小的工作示例,我不認爲有人想從github運行你的代碼 – Suic
這遠遠超過相關的代碼。你能否把你發佈的內容翻譯成實際相關的內容?這是調試你不明白的問題的第一步,所以你應該在任何情況下這樣做! – ikegami
你發佈的內容甚至不是可運行的,這意味着你必須格外清楚,首先要說明爲什麼你認爲它沒有被清除。你發佈的內容都沒有表明你甚至檢查過。 – ikegami