2016-06-27 58 views
0

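"No such file" when opening multiple files in a directory, but no error when opening a single file

I can open a single file in a directory and run the code below on it. However, when I try to use the same code on multiple files in a directory, I get a "No such file or directory" error.

I have tried to make sure that I named the files correctly, that they are in the right format, that they are in my current working directory, and that everything is referenced correctly.

I know many people have had this error before and have posted similar questions, but any help would be appreciated.

Working code: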

#!/usr/bin/perl 

use warnings; 
use strict; 
use diagnostics; 

use List::Util qw(min max); 

my $RawSequence = loadSequence("LDTest.fasta"); 
my $windowSize = 38; 
my $stepSize = 1; 
my %hash; 
my $s1; 
my $s2; 
my $dist; 

for (my $windowStart = 0; $windowStart <= 140; $windowStart += $stepSize) { 

    my $s1 = substr($$RawSequence, $windowStart, $windowSize); 
    my $s2 = 'CGGAGCTTTACGAGCCGTAGCCCAAACAGTTAATGTAG'; 
      # the 28 nt forward primer after the barcode plus the first 10 nt of the mtDNA sequence 

    my $dist = levdist($s1, $s2); 

    $hash{$dist} = $s1; 

    #print "Distance between '$s1' and '$s2' is $dist\n"; 

    sub levdist { 
     my ($seq1, $seq2) = (@_)[ 0, 1 ]; 

     my $l1 = length($seq1); 
     my $l2 = length($seq2); 
     my @s1 = split '', $seq1; 
     my @s2 = split '', $seq2; 
     my $distances; 

     for (my $i = 0; $i <= $l1; $i++) { 
      $distances->[$i]->[0] = $i; 
     } 

     for (my $j = 0; $j <= $l2; $j++) { 
      $distances->[0]->[$j] = $j; 
     } 

     for (my $i = 1; $i <= $l1; $i++) { 

      for (my $j = 1; $j <= $l2; $j++) { 
       my $cost; 

       if ($s1[ $i - 1 ] eq $s2[ $j - 1 ]) { 
        $cost = 0; 
       } 
       else { 
        $cost = 1; 
       } 

       $distances->[$i]->[$j] = minimum(
        $distances->[ $i - 1 ]->[ $j - 1 ] + $cost, 
        $distances->[$i]->[ $j - 1 ] + 1, 
        $distances->[ $i - 1 ]->[$j] + 1, 
       ); 
      } 
     } 

     my $min_distance = $distances->[$l1]->[$l2]; 

     for (my $i = 0; $i <= $l1; $i++) { 
      $min_distance = minimum($min_distance, $distances->[$i]->[$l2]); 
     } 

     for (my $j = 0; $j <= $l2; $j++) { 
      $min_distance = minimum($min_distance, $distances->[$l1]->[$j]); 
     } 

     return $min_distance; 
    } 
} 

sub minimum { 
    my $min = shift @_; 

    foreach (@_) { 
     if ($_ < $min) { 
      $min = $_; 
     } 
    } 

    return $min; 
} 

sub loadSequence { 
    my ($sequenceFile) = @_; 
    my $sequence = ""; 

    unless (open(FASTA, "<", $sequenceFile)) { 
     die $!; 
    } 

    while (<FASTA>) { 
     my $line = $_; 
     chomp($line); 

     if ($line !~ /^>/) { 
      $sequence .= $line; #if the line doesn't start with > it is the sequence 
     } 
    } 

    return \$sequence; 
} 

my @keys = sort { $a <=> $b } keys %hash; 
my $BestMatch = $hash{ $keys[0] }; 

if ($keys[0] < 8) { 
    $$RawSequence =~ s/\Q$BestMatch\E/CGGAGCTTTACGAGCCGTAGCCCAAACAGTTAATGTAG/g; 
    print ">|Forward|Distance_of_Best_Match: $keys[0] |Sequence_of_Best_Match: $BestMatch", "\n", 
      "$$RawSequence", "\n"; 
} 

Here is an abbreviated version of my non-working code. The parts I have left out are unchanged:

Headers and globals:

my $dir   = ("/Users/roblogan/Documents/FakeFastaFiles"); 
my @ArrayofFiles = glob "$dir/*.fasta"; 

foreach my $file (@ArrayofFiles) { 

    open(my $Opened, $file) or die "can't open file: $!"; 

    while (my $OpenedFile = <$Opened>) { 

     my $RawSequence = loadSequence($OpenedFile); 

     for (...) { 

      ...; 

      print 
        ">|Forward|Distance_of_Best_Match: $keys[0] |Sequence_of_Best_Match: $BestMatch", 
        "\n", "$$RawSequence", "\n"; 
     } 
    } 
} 

The exact error is:

Uncaught exception from user code: 
     No such file or directory at ./levenshtein_for_directory.pl line 93, <$Opened> line 1. 
    main::loadSequence('{\rtf1\ansi\ansicpg1252\cocoartf1404\cocoasubrtf470\x{a}') called at ./levenshtein_for_directory.pl line 22 

Line 93:

    89 sub loadSequence{ 
    90   my ($sequenceFile) = @_; 
    91   my $sequence = ""; 
    92   unless (open(FASTA, "<", $sequenceFile)){ 
    93     die $!; 
    94   } 

Line 22:

    18   foreach my $file (@ArrayofFiles) { 
    19    open (my $Opened, $file) or die "can't open file: $!"; 
    20    while (my $OpenedFile = <$Opened>) { 
    21 
    22     my $RawSequence = loadSequence($OpenedFile); 
    23 
+0

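Your loop seems confused: for each filename in @ArrayofFiles, you open that file and then read each line from it. You then pass each line (complete with its trailing newline, if any) to 'loadSequence', which treats that string as a filename and tries to open it. Is that really what you want to do? – tfb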

+2

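This is unrelated to the problem, but you should change 'loadSequence' to 'return $sequence' instead of 'return \$sequence'. The current form has no advantage, and it means you have to litter your code with '$$RawSequence' everywhere, which is confusing, [as you found out in your previous question](http://stackoverflow.com/questions/38044668/perl-variable-is-print-as-scalar0x7faf2b804240) – Borodin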

+0

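Just one remark: if these FASTA files are more than a handful of lines, you could also make your life much easier by using one of the XS modules on CPAN for the Levenshtein part. I'm a bit partial to [Text::Levenshtein::Flexible](https://metacpan.org/pod/Text::Levenshtein::Flexible), but any of the XS ones will be orders of magnitude faster than pure Perl. – mbethke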

Answers

2

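I just learned that "FASTA file" is an established term. I was not aware of that and previously thought these were some kind of files that contain filenames or something similar. As @zdim said, you are opening these files twice.

The code below gets the list of FASTA files (only the filenames) and then calls loadSequence with each of those filenames. That subroutine then opens the given file, concatenates the lines that do not start with > into one long string, and returns it.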

# input: the NAME of a FASTA file 
# return: all sequences in that file as one very long string 
sub loadSequence 
{ 
    my ($fasta_filename) = @_; 
    my $sequence = ""; 
    open(my $fasta_fh, '<', $fasta_filename) or die "Cannot open $fasta_filename: $!\n"; 
    while (my $line = <$fasta_fh>) { 
     chomp($line); 
     if ($line !~ /^>/) { 
      $sequence .= $line; #if the line doesn't start with > it is the sequence 
     } 
    } 
    close($fasta_fh); 
    return $sequence; 
} 

# ... 

my $dir = '/Users/roblogan/Documents/FakeFastaFiles'; 
my @ArrayofFiles = glob "$dir/*.fasta"; 
foreach my $filename (@ArrayofFiles) { 
    my $RawSequence = loadSequence($filename); 
    # ... 
} 
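
To run the rest of the analysis once per file, the window scan can be wrapped in a subroutine so that no state (such as the global %hash in the question) is shared between files. The following is only a minimal sketch, not the asker's code: best_window and edit_distance are hypothetical names, and edit_distance is a plain full-string Levenshtein distance rather than the question's levdist, which additionally takes the minimum over the last row and column of the matrix.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: plain Levenshtein distance between two strings,
# computed row by row so only two rows are kept in memory.
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j - 1] + $cost, $cur[$j - 1] + 1, $prev[$j] + 1);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical helper: slide a window over $sequence and return the
# (distance, substring) of the window closest to $target.
# Returns (undef, undef) if the sequence is shorter than the window.
sub best_window {
    my ($sequence, $target, $window_size) = @_;
    my ($best_dist, $best_match);
    for my $start (0 .. length($sequence) - $window_size) {
        my $window = substr($sequence, $start, $window_size);
        my $dist   = edit_distance($window, $target);
        if (!defined $best_dist || $dist < $best_dist) {
            ($best_dist, $best_match) = ($dist, $window);
        }
    }
    return ($best_dist, $best_match);
}
```

Inside the foreach loop above, a call such as my ($dist, $match) = best_window($RawSequence, $primer, 38); (with $primer holding the 38 nt primer string from the question) would then give a fresh result for every file.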
1

You appear to be trying to open the files twice. The line

my @ArrayofFiles = glob "$dir/*.fasta"; 

gives you the list of files. Then

foreach my $file (@ArrayofFiles){ 
    open (my $Opened, $file) or die "can't open file: $!"; 
    while (my $OpenedFile = <$Opened>) { 
     my $RawSequence = loadSequence($OpenedFile); 
     # ... 

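does the following, line by line. It iterates over the files, opens each one, reads a line from it, and then submits that line to the function loadSequence().

However, that function then attempts to open a file again: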

sub loadSequence{ 
    my ($sequenceFile) = @_; 
    my $sequence = ""; 
    unless (open(FASTA, "<", $sequenceFile)){ 
    # ... 

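The $sequenceFile variable in the function is what was passed to it as $OpenedFile, which is a line read from the already-opened file, not a filename. While I am not sure about the details of your code, the error you show seems consistent with this.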
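
The failure can be reproduced in isolation. This is a small sketch, assuming a throwaway temporary file created with File::Temp (its content is made up for illustration): it reads the first line of a file and then hands that line to open as though it were a filename.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file with made-up FASTA content.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} ">seq1\nACGT\n";
close($tmp_fh) or die "Cannot close $tmp_name: $!";

# Read the first line of the file: this is file content, not a filename.
open(my $in, '<', $tmp_name) or die "Cannot open $tmp_name: $!";
my $line = <$in>;
close($in);

# Passing that line to open fails, because no file has that name.
my $opened = open(my $fh, '<', $line);
my $error  = $opened ? '' : "$!";
print "open failed: $error\n" unless $opened;
```

The second open fails in the same way as the one inside loadSequence, where the argument shown in the error trace is likewise a line of file content.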

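This may be because you are confusing glob (which gives you the list of files) with opendir, which does need a subsequent readdir to get at the files. Try renaming $OpenedFile to $line (which is what it is) and see how the code looks then.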
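
For comparison, here is a sketch of both ways of listing files, assuming a temporary directory that holds two made-up .fasta files. glob returns paths that can be passed straight to open, while readdir returns bare entry names that have to be filtered and joined with the directory first.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Set up a throwaway directory with two made-up FASTA files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.fasta b.fasta)) {
    open(my $out, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    print {$out} ">$name\nACGT\n";
    close($out) or die "Cannot close $dir/$name: $!";
}

# glob: a single call that returns full paths, ready to be opened.
my @from_glob = sort { $a cmp $b } glob("$dir/*.fasta");

# opendir/readdir: returns bare names (including '.' and '..'),
# so they are filtered and prefixed with the directory by hand.
opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
my @from_readdir = sort { $a cmp $b }
                   map  { $dir . "/$_" }
                   grep { /\.fasta\z/ } readdir($dh);
closedir($dh);

print "$_\n" for @from_glob;
```

In both cases the resulting strings are filenames, so each one needs to be opened exactly once.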
