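Feeding multiple regexes in conditional form?

I got some very good input here on searching a string of nucleotides for a 3-nucleotide repeat pattern that has to occur 7 times in a row, by building a regex for it: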
my $regex1 = qr/(([ACGT]{3}) \2{6,})/x;
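I know how to extend this to search for 2 nucleotides repeated 10 times in a row and 4 nucleotides repeated 7 times in a row. What I want is to extend the code so that the user can point it at their input file and it will check the file against the regex above as well as two more regexes, which I still need to create for those other two searches.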
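The three searches described above differ only in the unit length and the minimum repeat count, so the patterns could be generated rather than typed by hand. The following is a minimal sketch under that assumption; `build_repeat_regex` and the variable names are hypothetical and do not appear in the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex that matches a unit of $unit_len
# nucleotides repeated at least $min_repeats times in a row.
sub build_repeat_regex {
    my ($unit_len, $min_repeats) = @_;
    # The captured unit is the first copy, so the back-reference only
    # needs to match the remaining copies.
    my $extra = $min_repeats - 1;
    return qr/(([ACGT]{$unit_len})\2{$extra,})/;
}

my $regex_di    = build_repeat_regex(2, 10);  # 2 nucleotides, 10+ times
my $regex_tri   = build_repeat_regex(3, 7);   # 3 nucleotides, 7+ times
my $regex_tetra = build_repeat_regex(4, 7);   # 4 nucleotides, 7+ times

# Small made-up sequence for illustration only.
my $sample = 'TT' . ('CAG' x 8) . 'GG';
if ($sample =~ $regex_tri) {
    printf "MATCHED %s %d times\n", $2, length($1) / length($2);
}
```

Dividing by `length($2)` instead of a hard-coded 3 keeps the repeat count correct for any unit length.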
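EDIT: How do I subject my input file to multiple regexes like the one above? My attempts are in the code (commented out with hash signs).
Here is my current code: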
print "Please specify the file location (DO NOT DRAG/DROP files!) then press ENTER:\n";
$seq = <STDIN>;
#Remove the newline from the filename
chomp $seq;
#open the file or exit
open (my $seqfile, '<', $seq) or die "Can't open '$seq': $!";
#read the dna sequence from the file and store it into the array variable @seq1
@seq1 = <$seqfile>;
#Close the file
close $seqfile;
#Put the sequence into a single string as it is easier to search for the motif
$seq1 = join('', @seq1);
#Remove whitespace
$seq1 =~s/\s//g;
#Count the number of nucleotides in the sequence
$number = length $seq1;
#Use regex to say "Find 3 nucleotides repeated at least 7 times in a row" (the first copy plus 6 or more back-references)
# qr(quotes and compiles)/(([nucs]{number of nucs in pattern}) \2{number of additional repeats,})/x(permits whitespace within the pattern)
my $regex1 = qr/(([ACGT]{3}) \2{6,})/x;
#my $regex = qr/(([ACGT]{2}) \2{9,})/x;
#my $regex2 = qr/(([ACGT]{4}) \2{6,})/x;
#Tell program to use $regex1 on variable that holds the file, and only report when it matched
if ($seq1 =~ $regex1) {
    #Now print the results to screen
    #This will need to change to printing to a file (WHAT KIND OF FILE?) in the following manner: site, nucleotide match, # of times, length of full sequence
    printf "MATCHED %s exactly %d times\n", $2, length($1)/3;
}
print "Length of sequence: $number\n";
exit;
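One way the script above might be extended to run several regexes over the same sequence is to keep the compiled patterns in a hash and loop over them, recording every match rather than only the first. This is only a sketch under assumptions: `find_repeats`, the hash keys, and the sample sequence are hypothetical, and the sequence is hard-coded here instead of being read from the user's file.

```perl
use strict;
use warnings;

# Hypothetical helper: run each named regex over $dna and collect every hit.
# Each regex is expected to capture the whole run in $1 and the unit in $2.
sub find_repeats {
    my ($dna, $patterns) = @_;
    my @hits;
    for my $name (sort keys %$patterns) {
        my $regex = $patterns->{$name};
        # /g in scalar context walks through successive matches.
        while ($dna =~ /$regex/g) {
            push @hits, {
                pattern => $name,
                site    => $-[1] + 1,                 # 1-based start of the run
                unit    => $2,
                repeats => length($1) / length($2),
            };
        }
    }
    return @hits;
}

my %patterns = (
    di    => qr/(([ACGT]{2}) \2{9,})/x,   # 2 nucleotides, 10+ times
    tri   => qr/(([ACGT]{3}) \2{6,})/x,   # 3 nucleotides, 7+ times
    tetra => qr/(([ACGT]{4}) \2{6,})/x,   # 4 nucleotides, 7+ times
);

# Made-up sequence standing in for the contents of the input file.
my $dna = ('AC' x 10) . 'GGT' . ('CAG' x 7);

for my $hit (find_repeats($dna, \%patterns)) {
    printf "%s\tsite %d\t%s\t%d repeats\tsequence length %d\n",
        $hit->{pattern}, $hit->{site}, $hit->{unit}, $hit->{repeats}, length $dna;
}
```

Note that `[ACGT]{4}` also matches units such as `ACAC`, so a long enough dinucleotide run would be reported by the tetranucleotide pattern as well; whether that counts as a match is a decision the script would need to make.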
What is your question? – Borodin 2013-02-23 03:12:38
Sorry, I wasn't clear. I would like to subject my input file to multiple regexes. – Citizin 2013-02-23 03:32:54