2012-02-27 47 views
4

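Perl - Loading records from a file into a hash

Is it possible to load records from a file directly into a hash? The records are delimited by /begin and /end, and their contents come in a fixed order.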

What I want is to populate a hash like this, with a "slurped_record" entry for each city:

$hash_city{London}{slurped_record}    = "/begin CITY London\n big\n England\n Sterling\n/end CITY";
$hash_city{Paris}{slurped_record}     = "/begin CITY\n Paris\n big\n France\n Euro\n/end CITY";
$hash_city{Melbourne}{slurped_record} = "/begin CITY\n\n Melbourne\n big\n Australia\n Dollar\n hot\n/end CITY";

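I can then go off and process the records in the hash. (The reason is that later I want to add new keys to, say, London, such as Country = England, and so on.)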

$hash_city{London}{Country} = 'England';

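I have managed to get something working by slurping rather than reading the file line by line: matching on /begin, building up a record ($rec .= $_), then matching an /end and processing. It is a bit messy, and I wondered whether there is a more elegant Perl approach.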

My code attempt so far is as follows:

use strict; 
use warnings; 
use Data::Dumper; 

my $string = do {local $/; <DATA>}; 
my %hash_city = map{$2=>$1} $string =~ /(\/begin\s+CITY\s+(\w+).+\/end\s+CITY)/smg; 
print Dumper(%hash_city); 

__DATA__ 
stuff 
stuff 
/begin CITY London 
    big 
    England 
    Sterling 
/end CITY 

stuff 
stuff 

/begin CITY 
    Paris 
    big 
    France 
    Euro 
/end CITY 
stuff 

/begin CITY 

    Melbourne 
    big 
    Australia 
    Dollar 
    hot 
/end CITY 

stuff 
+0

Your slurp makes two copies of the file contents, and would be better written as 'my $string; { local $/; $string = <DATA>; }'. – Borodin 2012-02-28 02:14:49
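The slurp form suggested in this comment can be written out as a minimal sketch. The sample text and the in-memory filehandle are assumptions for illustration only; in the question's script the handle would be DATA.

```perl
use strict;
use warnings;

# Hypothetical sample input, standing in for the question's DATA section.
my $text = "stuff\n/begin CITY London\n    big\n/end CITY\nstuff\n";

# Open an in-memory filehandle on the sample text.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $string;
{
    # Undefine the input record separator inside this block only,
    # so that a single read returns the whole file.
    local $/;
    $string = <$fh>;
}
close $fh;

print length($string), " characters read\n";
```

Because `local $/` is confined to the bare block, later reads in the same script still see the default line-by-line separator.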

Answers

3

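I wrote a small program to show another way of doing it, one that also takes your processing a step further. I'm not sure whether it's elegant, but I think it gets the job done.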

use strict;
use warnings;
use Data::Dumper;

my %city_record;

## We're going to process the input file in chunks.
## Here we define the chunk start marker and tell Perl to split the input on it.
local $/ = "/begin CITY";

# ignoring anything before the first section starts 
scalar <DATA>; 

while (<DATA>) { 
    # throwing out anything after the section end marker 
    # (might be done with substr-index combo as well, 
    # but regex way was shorter and, for me, more readable as well) 
    my ($section_body) = m{^(.+)/end CITY}ms; 

    # now we're free to parse the section_body as we want. 
    # showing here pulling city name - and the remaining data, by using the split special case 
    my ($city, @city_data) = split ' ', $section_body; 

    # filling out all the fields at once 
    # (may seem a bit unusual, but it's a simple hash slice actually, great Perl idiom) 
    @{ $city_record{$city} }{qw/ size country currency misc /} = @city_data; 
} 

# just for testing; replace this with your own processing
print Dumper \%city_record; 
+0

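Sorry for the late reply, but thank you for taking the time to answer here. It's a great answer and has helped a lot with my Perl scripts, not only in this example but also for other files I parse. Thanks again. – Chris 2012-03-06 22:49:11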

1

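You could probably make use of the flip-flop operator, /FROM/ .. /TO/. You can use different delimiters to make the regexes more readable; I use m#^/begin ...# below. Extracting the city name is simple, assuming there is only whitespace between the header and the city name. I'm using \S (non-whitespace) because you don't want to miss city names that contain non-alphanumeric characters, such as "Foo-Bar" or "St.Tropez".

If you do find city names that contain whitespace, you may need to work out a better regex for finding the city name. I'll leave that as an exercise.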

use strict; 
use warnings; 
use Data::Dumper; 

my %hash; 
my $string; 
while (<DATA>) { 
    if (m#^/begin CITY# .. m#^/end CITY#) {
        $string .= $_;
        if (m#^/end CITY#) {
            my ($city) = $string =~ m#^/begin CITY\s*(\S+)#;
            $hash{$city}{slurp} = $string;
            $string = "";
        }
    }
} 
$Data::Dumper::Useqq=1; 
print Dumper(\%hash); 
+0

*"Odd names that contain whitespace"* such as "Santa Fe", "Salt Lake City", "Baton Rouge"... – Borodin 2012-02-28 02:19:27

+0

@Borodin 'New York', 'Washington D.C.', 'Kuala Lumpur'. Yes, now I can think of plenty of them, but when I wrote the answer I completely blanked. – TLP 2012-02-28 04:39:47
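One possible way to handle names that contain spaces, sketched under the assumption that each field sits on its own line, so that the city name is the first non-blank text after the header. The sample records and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical records: the name may share the /begin line or follow on a later line.
my @records = (
    "/begin CITY Salt Lake City\n    big\n/end CITY\n",
    "/begin CITY\n\n    Kuala Lumpur\n    big\n/end CITY\n",
);

my @names;
for my $record (@records) {
    # Skip whitespace (including newlines) after the header, then capture
    # from the first non-space character up to the end of that line,
    # leaving out any trailing spaces or tabs.
    if ( my ($city) = $record =~ m{^/begin CITY\s*(\S.*?)[ \t]*$}m ) {
        push @names, $city;
    }
}

print "$_\n" for @names;
```

This only works while the name is the first field of the record; if a record has no name at all, the pattern would pick up the next field instead.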

0

This will give you a hash with all the cities and their properties:

my %cities = map { 
    my($name, @data, %props) = (split ' '); 
    @props{qw(Size Country Currency Temperature)} = @data; 
    $name => \%props 
} $string =~ m| 
    ^/begin \s+ CITY 
    (.+?) 
    ^/end \s+ CITY 
|gsmx; 

print Dumper(\%cities);
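For reference, a self-contained sketch that keeps the raw record under slurped_record, as the question asks, and also splits out the fields. The sample input, the %hash_city layout and the field names Size, Country and Currency are assumptions based on the fixed order shown in the question, not part of any answer above.

```perl
use strict;
use warnings;

# Hypothetical sample input in the same layout as the question's data.
my $string = <<'END';
stuff
/begin CITY London
    big
    England
    Sterling
/end CITY
stuff
/begin CITY
    Paris
    big
    France
    Euro
/end CITY
END

my %hash_city;

# The non-greedy (.+?) stops at the nearest /end CITY,
# so each /begin ... /end pair is matched separately.
while ( $string =~ m{^(/begin\s+CITY\s+(.+?)^/end\s+CITY)}gsm ) {
    my ( $record, $body ) = ( $1, $2 );

    # split ' ' drops leading whitespace and splits on any run of whitespace.
    my ( $name, @data ) = split ' ', $body;

    $hash_city{$name}{slurped_record} = $record;

    # Hash slice: assumed field names, filled in the record's fixed order.
    @{ $hash_city{$name} }{qw(Size Country Currency)} = @data;
}

print "$_: $hash_city{$_}{Country}\n" for sort keys %hash_city;
```

Splitting on whitespace breaks for multi-word city names, so this sketch only suits single-word names like those in the sample.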