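Perl: Importing text from a file that contains ÅÄÖ

What I ultimately want to do is convert all uppercase characters in a file to lowercase and write them to the terminal. With the text hard-coded in the script, this works: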

use utf8; 
binmode STDOUT, ":utf8"; 

$text = "ABCÅÄÖ\n"; 

$text =~ tr/A-Ö/a-ö/; 
print $text;

Output:

abcåäö

As expected.

However, when I try to import the same text from a file, it goes crazy.

use utf8; 
binmode STDOUT, ":utf8"; 

open FILE, $filename or die "An error occurred while reading the file: $!"; 
$text = join '', <FILE>; 
close FILE or die "An error occurred while closing the file: $!"; 

$text =~ tr/A-Ö/a-ö/; 
print $text;

Output:

ABCÃÃÃ

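I assume the imported text is not being decoded correctly. Does anyone know how to specify the encoding of the text when importing it?

Thanks in advance.

Jack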

2014-11-24 Jack Pettersson

You didn't tell Perl to decode the file.

use strict; 
use warnings; 

use utf8;        # Source code is UTF-8. 
use open ':std', ':encoding(UTF-8)'; # Terminal and files are UTF-8. 

my $qfn = ...; 

open(my $fh, '<', $qfn) 
    or die("Can't open file $qfn: $!\n"); 

my $text = do { local $/; <$fh> }; 
print(lc($text));

2014-11-24 15:12:14 ikegami

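This one worked better, with tr and all :) – 2014-11-24 15:52:21

Well, relying on '$text =~ tr/A-Ö/a-ö/' isn't safe. Use 'lc($text)' or '$text =~ s/([A-ZÅÄÖ])/\L$1/g'. I made some other improvements (lexical instead of global variables, 3-arg open, including the file name in the error message, ...) – ikegami 2014-11-24 16:39:03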
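To illustrate why the range-based tr is fragile, here is a minimal sketch. It assumes the script itself is saved as UTF-8, and the sample string is made up for the demonstration. Under 'use utf8' the range A-Ö spans every code point from U+0041 to U+00D6, so it covers much more than the uppercase letters:

```perl
use strict;
use warnings;
use utf8;                        # string literals and tr ranges below are UTF-8 source
use feature 'unicode_strings';   # Unicode rules for lc on all strings (Perl 5.12+)

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample text with mixed case and punctuation.
my $text = "Hello ÅÄÖ [x]";

# tr/A-Ö/a-ö/ maps every code point in U+0041..U+00D6 to the one 0x20 higher.
# That range also contains "[", "]" and the lowercase letters a-z.
(my $via_tr = $text) =~ tr/A-Ö/a-ö/;

# lc only changes characters that actually have a lowercase form.
my $via_lc = lc $text;

# Show code points, because the tr result contains control characters.
print "tr: ", join(' ', map { sprintf('U+%04X', ord $_) } split //, $via_tr), "\n";
print "lc: $via_lc\n";
```

In this sketch the 'e' (U+0065) ends up as the control character U+0085 and '[' becomes '{', while lc leaves both alone. The tr version only looks right when the input contains nothing but uppercase letters.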

Just tell Perl what encoding the file is in:

open FILE, '<:utf8', $filename or die $!;

Or, if you want the encoding to be checked, use

open FILE, '<:encoding(UTF-8)', $filename or die $!;
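The effect of the decoding layer can be seen in a small self-contained sketch. The temporary file and the variable names are assumptions made for the demonstration; they are not part of the original question. It reads the same file once without a layer and once with ':encoding(UTF-8)':

```perl
use strict;
use warnings;
use utf8;                        # the literal written to the sample file is UTF-8 source
use File::Temp qw(tempfile);

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample file, created only so the sketch is self-contained.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out, ':encoding(UTF-8)';
print {$out} "ABCÅÄÖ\n";
close($out) or die("Can't close $path: $!\n");

# Without a layer, every byte of the file becomes one character.
open(my $raw_fh, '<', $path) or die("Can't open $path: $!\n");
my $raw = do { local $/; <$raw_fh> };
close($raw_fh);

# With a decoding layer, each UTF-8 sequence becomes one character.
open(my $fh, '<:encoding(UTF-8)', $path) or die("Can't open $path: $!\n");
my $decoded = do { local $/; <$fh> };
close($fh);

printf "raw: %d characters, decoded: %d characters\n",
    length($raw), length($decoded);
print lc($decoded);
```

Each of Å, Ä and Ö is stored as two bytes beginning with 0xC3, which is the code point of 'Ã'. That is probably why the undecoded text showed up as 'ABCÃÃÃ' in the question once it was printed through a ':utf8' STDOUT.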

2014-11-24 15:12:02 choroba

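Odd... when I use this method the import is fine (i.e. I can print the text just fine), but when it is translated, Å, Ä and Ö still come out garbled. – 2014-11-24 15:30:48

@JackPettersson: Try 'lc' instead of 'tr'. – choroba 2014-11-24 15:32:53

Solved, thanks! :) – 2014-11-24 15:42:02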
