2013-07-10 58 views
-3

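I have a Perl script, but it only calculates the molecular weight when the sequence is typed in. I want to calculate the molecular weight of the protein sequences in a FASTA file instead.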

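use strict;
use warnings;

print "Enter the amino acid sequence:\n";
my $seq = <STDIN>;
chomp $seq;
$seq = uc $seq;                      # the table below uses upper-case codes
my $x = length $seq;
print "Length of sequence is : $x";
my @residues = split //, $seq;
my $sum = 0;
# Average residue masses in daltons (amino acid minus one water).
my %data = (
    A=>71.09, R=>156.19, D=>115.09, N=>114.11,
    C=>103.15, E=>129.12, Q=>128.14, G=>57.05,
    H=>137.14, I=>113.16, L=>113.16, K=>128.17,
    M=>131.19, F=>147.18, P=>97.12, S=>87.08,
    T=>101.11, W=>186.21, Y=>163.18, V=>99.14
);
foreach my $i (@residues) {
    die "Unknown residue '$i'\n" unless exists $data{$i};
    $sum += $data{$i};
}
# Residue masses already exclude water, so add one water for the free termini.
my $c = $sum + 18.02;
print "\nThe molecular weight of the sequence is $c\n";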
+1

You may want to format your code better to help us read it. See how here: http://stackoverflow.com/help/formatting –

+0

Does this help as well? http://stackoverflow.com/questions/9748858/ –

+1

What exactly is your question? – ruakh

Answers

1

First you have to tell us what format your .fasta files have. As far as I know, they look like this:

>seq_ID_1 descriptions etc 
ASDGDSAHSAHASDFRHGSDHSDGEWTSHSDHDSHFSDGSGASGADGHHAH 
ASDSADGDASHDASHSAREWAWGDASHASGASGASGSDGASDGDSAHSHAS 
SFASGDASGDSSDFDSFSDFSD 

>seq_ID_2 descriptions etc 
ASDGDSAHSAHASDFRHGSDHSDGEWTSHSDHDSHFSDGSGASGADGHHAH 
ASDSADGDASHDASHSAREWAWGDASHASGASGASG 
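
A minimal sketch of how records in this layout could be collected, assuming each record starts with a `>` header line and the sequence may wrap across several lines. The helper name `parse_fasta` and the sample data are made up for illustration; it reads from any open filehandle, here an in-memory string.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [header, sequence] pairs
# read from an already opened filehandle.
sub parse_fasta {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;            # strip newline and trailing blanks
        next if $line eq q{};          # skip blank separator lines
        if ($line =~ /\A>(.*)/) {
            push @records, [$1, q{}];  # new record, empty sequence so far
        }
        elsif (@records) {
            $records[-1][1] .= $line;  # append a wrapped sequence line
        }
    }
    return @records;
}

# Illustrative input held in a string instead of a file.
my $sample = ">seq_ID_1 demo\nASDG\nDSA\n\n>seq_ID_2 demo\nHSA\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @records = parse_fasta($fh);
close $fh;

printf "%s: %d residues\n", $_->[0], length $_->[1] for @records;
```

Reading line by line like this avoids loading the whole file into memory, which may matter for large FASTA files.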

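If we assume your code works correctly and computes the molecular weight, all we need to do is read the FASTA files, parse them, and compute the weights with your code. Sounds easy.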

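#!/usr/bin/perl

use strict;
use warnings;

# Average residue masses in daltons (amino acid minus one water).
my %data = (
    A=>71.09, R=>156.19, D=>115.09, N=>114.11,
    C=>103.15, E=>129.12, Q=>128.14, G=>57.05,
    H=>137.14, I=>113.16, L=>113.16, K=>128.17,
    M=>131.19, F=>147.18, P=>97.12, S=>87.08,
    T=>101.11, W=>186.21, Y=>163.18, V=>99.14
);

for my $file (@ARGV) {
    open my $fh, '<:encoding(UTF-8)', $file
        or die "Cannot open $file: $!";
    my $input = join q{}, <$fh>;    # slurp the whole file
    close $fh;

    # Each record is a ">" header line followed by everything up to the next ">".
    while ($input =~ /^(>.*?)$([^>]*)/smxg) {
        my $name = $1;
        my $seq = $2;
        $name =~ s/\s+\z//;         # drop trailing blanks or a stray \r
        $seq =~ s/\s+//g;           # join wrapped lines, drop blank lines
        next if $seq eq q{};        # header without a sequence
        my $mass = calc_mass($seq);
        printf "%s has mass %.2f\n", $name, $mass;
    }
}

sub calc_mass {
    my ($seq) = @_;
    my $sum = 0;
    for my $residue (split //, uc $seq) {
        if (exists $data{$residue}) {
            $sum += $data{$residue};
        }
        else {
            warn "Skipping unknown residue '$residue'\n";
        }
    }
    # Residue masses already exclude water, so add one water for the free termini.
    return $sum + 18.02;
}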
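
A quick way to sanity-check the mass arithmetic is to run it on a peptide whose weight is easy to verify by hand. The sketch below assumes the table values are average residue masses (each amino acid minus one water), so a chain weighs the sum of its residues plus one water of about 18.02 Da. The helper name `peptide_mass` and the two-entry table are illustrative only.

```perl
use strict;
use warnings;

# Assumed average residue masses in daltons (subset for the demo).
my %residue_mass = (G => 57.05, A => 71.09);
my $WATER = 18.02;    # one water for the free N- and C-termini

# Hypothetical helper: sum of residue masses plus one water.
sub peptide_mass {
    my ($seq) = @_;
    my $sum = 0;
    for my $residue (split //, uc $seq) {
        die "Unknown residue '$residue'\n"
            unless exists $residue_mass{$residue};
        $sum += $residue_mass{$residue};
    }
    return $sum + $WATER;
}

# Diglycine: 2 * 57.05 + 18.02
printf "GG  %.2f\n", peptide_mass('GG');
# Gly-Ala-Gly: 2 * 57.05 + 71.09 + 18.02
printf "GAG %.2f\n", peptide_mass('GAG');
```

Under that assumption the value for GG comes out at 132.12, which agrees with the molecular weight of glycylglycine.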
+0

So if I include your code in my Perl script, it will do what I need? – user2503701

+0

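One thing... my Perl script takes the amino acid sequence as typed input, but I need the script to read it from a file – user2503701

+0

This code reads the sequences from files. Usage: 'perl /path/to/script.pl /path/to/input.file /path/to/other.file' – Suic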