2015-01-09 82 views

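Perl: split one column into two columns

I have a spreadsheet in which one column contains a numeric ID followed by an annotation. For example: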

529120 30S ribosomal protein S3 

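I would like to split this column into two, where the first column contains the numeric ID (529120) and the second column contains the annotation (30S ribosomal protein S3).

My code so far only prints the numeric ID from the first column and then terminates.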

#!/usr/bin/perl 
use strict; 
use warnings; 

my $annotationsFile = "/Users/mycomputer/Desktop/AnnotationsSplit.tsv"; 

open(ANNOTATIONS, "<", $annotationsFile)
    or die "Cannot open file $!";

while (my $line = <ANNOTATIONS>) {
    chomp $line;
    my @column = split(/\t/, $line);
    my $annotationFull = $column[3];
    my ($annotationNumber) = $annotationFull =~ (/^(\d+)/);
    print $annotationNumber, "\n";
}

Answer

Use split with LIMIT = 2:

use warnings; 
use strict; 

while (my $line = <DATA>) { 
    chomp $line; 
    my ($id, $annot) = split /\s+/, $line, 2; 
    print "id = $id\n"; 
    print "annot = $annot\n"; 
} 

__DATA__ 
529120 30S ribosomal protein S3 

Output:

id = 529120 
annot = 30S ribosomal protein S3 

Thanks! How can I write this to a new TSV file? – Bex 2015-01-09 21:58:30

@Bex: You're welcome. Open a file for output, and print to the filehandle. – toolic 2015-01-10 00:29:31
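
A minimal sketch of what that last comment describes, assuming each input line looks like the sample above (an ID, whitespace, then the annotation). The helper name `split_annotation` and the output filename `annotations_two_columns.tsv` are hypothetical, and the sample line is read from an in-memory filehandle only to keep the example self-contained; in practice you would open your own input file instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split "ID annotation" into two fields at the first run of whitespace.
# LIMIT = 2 keeps the spaces inside the annotation intact.
sub split_annotation {
    my ($line) = @_;
    chomp $line;
    my ($id, $annot) = split /\s+/, $line, 2;
    $annot = '' unless defined $annot;   # line contained only an ID
    $annot =~ s/\s+$//;                  # drop trailing whitespace
    return ($id, $annot);
}

# Sample input held in a string (assumption: stands in for the real file).
my $sample = "529120 30S ribosomal protein S3 \n";
open(my $in, "<", \$sample) or die "Cannot open sample input: $!";

# Hypothetical output path; change it to wherever the new file should go.
my $out_file = "annotations_two_columns.tsv";
open(my $out, ">", $out_file) or die "Cannot open $out_file for writing: $!";

while (my $line = <$in>) {
    next unless $line =~ /\S/;           # skip blank lines
    my ($id, $annot) = split_annotation($line);
    # Print to the output filehandle, joining the two columns with a tab.
    print {$out} join("\t", $id, $annot), "\n";
}

close($in);
close($out) or die "Cannot close $out_file: $!";
```

To read from a real file, replace the in-memory open with something like `open(my $in, "<", $annotationsFile)`. If the ID sits in a later tab-separated column, as the `split(/\t/, ...)` in the question suggests, pass that column's value to the helper rather than the whole line.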