2015-05-08
5

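Ignoring 'Unclosed Token' in Perl

I have a 2 GB CSV file in which column 1 contains a time and column 2 contains a 10,000-line XML file (stored as a single line).

I want to iterate over every row of this CSV and save the XML from column 2 to its own file. I also use XPath to pull the customer name out of the XML so that I can name the file [CustomerName]-[time from Column 1].xml. However, some of the XML files are not valid XML, and I get an "Unclosed Token on Line ..." error. Is there a way to ignore that message and have the script skip the file? Here is my Perl code: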

my $file = '../FILENAME.csv'; 
open my $info, $file or die "Could not open $file: $!"; 
my $count = 0; 
$| = 1; 

while(my $line = <$info>) { 
    $count++; if($count == 1) {next;} #Ignore headers 
    $line =~ /(\d+),"(.*?)"$/; #Load time into $1, XML file into $2 
    my $time = $1; 
    my $report = $2; 
    $report =~ s/""/"/g; #Replace "" with " 
    my $xp = XML::XPath->new(xml => $report); 
    my $ext = $xp->getNodeText('/report/customer') . "-" . $time . ".xml"; #Generate filename with customer name and time 
    write_file($ext, $report); 
} 
close $info; 

I am also open to suggestions for making this more efficient.

Answers

4

You can try wrapping the troublesome code in an eval. For example:

eval { 
    my $xp = XML::XPath->new(xml => $report); 
    my $ext = $xp->getNodeText('/report/customer') . "-" . $time . ".xml"; #Generate filename with customer name and time 
    write_file($ext, $report); 
}; 
if ($@) { 
    print "ERROR: $@"; 
} 
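
To make the skip-on-error behaviour concrete, here is a minimal sketch of the same eval pattern that also prints only the start of the error, since $@ can contain the whole document. The subs customer_name and filename_for are hypothetical names invented for this illustration; customer_name is only a stand-in for the XML::XPath calls in the question, so the sketch runs with core Perl alone.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the XML::XPath calls in the question:
# returns the customer name, or dies on input it cannot read.
sub customer_name {
    my ($xml) = @_;
    $xml =~ m{<customer>([^<]*)</customer>}
        or die "Unclosed Token on Line 1: $xml\n";
    return $1;
}

# Returns the output filename, or undef if the report should be skipped.
sub filename_for {
    my ($time, $report) = @_;
    my $name = eval { customer_name($report) };
    if (!defined $name) {
        # Keep only the first 60 characters of the error message.
        my $err = substr($@ // 'unknown error', 0, 60);
        $err =~ s/\s+\z//;
        warn "Skipping report at $time: $err\n";
        return;
    }
    return "$name-$time.xml";
}

# One well-formed report and one truncated report.
print filename_for(1431043200, '<report><customer>Acme</customer></report>'), "\n";
print defined filename_for(1431043201, '<report><customer>Acme') ? "kept\n" : "skipped\n";
```

In the real loop, the body of the eval would hold the XML::XPath lines from the question, and a missing filename would be followed by next so that write_file is never reached for a bad report.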

The following code:

$count++; if($count == 1) {next;} #Ignore headers 
$line =~ /(\d+),"(.*?)"$/; #Load time into $1, XML file into $2 
my $time = $1; 
my $report = $2; 

can be shortened to:

next if ++$count == 1; #Ignore headers 
my ($time, $report) = ($line =~ /(\d+),"(.*)"$/); # time, XML file 
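
A list-context match returns an empty list when the pattern fails, so the same assignment can also detect lines that do not fit the expected shape. The sketch below assumes each record sits on one physical line of the form digits,"xml"; split_line is a hypothetical helper name, and the added ^ anchor and trailing \s* (to tolerate CRLF line endings) are assumptions that are not part of the original regex.

```perl
use strict;
use warnings;

# Split one CSV line of the form: digits,"xml with "" for quotes"
# Returns (time, report), or nothing if the line does not match.
sub split_line {
    my ($line) = @_;
    my ($time, $report) = $line =~ /^(\d+),"(.*)"\s*$/s
        or return;
    $report =~ s/""/"/g;    # CSV escapes a double quote by doubling it
    return ($time, $report);
}

# Example line with one doubled-quote attribute value.
my @fields = split_line(qq{1431043200,"<report id=""7""><customer>Acme</customer></report>"\n});
print "$fields[0]\n$fields[1]\n" if @fields;
```

Because a header line does not start with digits, it fails the match and is skipped as well, which may make the separate $count check unnecessary.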
+0

Would it be more efficient to put 'write_file()' after the if statement? My errors usually come from the XML processing, not from writing the file. – Bijan

+1

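It only depends on whether you want to call write_file when the XML processing raises an error. If you do, the declaration of '$ext' needs to go before the 'eval', probably initialized to an empty string. – tivn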

+0

You're right. I removed '$@' from the print because it printed 40,000 characters to show the error, and the error message doesn't matter to me. Thanks! – Bijan