Unless you enjoy pain, use Text::CSV and its relatives Text::CSV_XS and Text::CSV_PP.
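However, that is probably the easier part of the problem. Once you have read a line and validated that it is complete, you need to add the relevant information to correctly keyed hashes. You will probably have to get very comfortable with references, too.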
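You might create a hash %BranchData keyed by branch. Each element of that hash would be a reference to a hash keyed by job; each element of that would be a reference to a hash keyed by timePeriod; and each element of that would be a reference to an array indexed by day number (using indexes 1..7; this over-allocates space slightly, but the chances of getting it right are far greater; don't mess with $[ though!). Finally, each element of the array would be a reference to a hash keyed by the three period types. Ouch!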
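To make the nesting concrete, here is a minimal sketch of what one branch of that structure could look like when written out by hand. The numbers are borrowed from the output shown further down; the variable $day1_overtime is just an illustrative name.

```perl
use strict;
use warnings;

# Hand-written instance of the nested structure, for illustration only.
my %BranchData = (
    South => {                      # keyed by branch
        Manager => {                # keyed by job
            '12A-9AM' => [          # keyed by time period; array indexed by day
                undef,              # index 0 is deliberately left unused
                { Overtime => 77.75, Regular => 0, Variance => 0 },  # day 1
                { Overtime => 14.75, Regular => 0, Variance => 0 },  # day 2
            ],
        },
    },
);

# The arrows between adjacent subscripts are optional, so this is the
# same as $BranchData{South}->{Manager}->{'12A-9AM'}->[1]->{Overtime}.
my $day1_overtime = $BranchData{South}{Manager}{'12A-9AM'}[1]{Overtime};
print "Day 1 overtime: $day1_overtime\n";
```

Reading a value back is just a matter of following the same chain of keys and indexes that was used to store it.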
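If everything is working well, a prototypical assignment might look something like this:
$BranchData{$row{branch}}->{$row{job}}->{$row{period}}->[1]->{$row{p_type}} += $row{day1};
You would iterate over elements 1..7 and the keys 'day1'..'day7'; there is a bit of design clean-up to do there.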
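One way that iteration could be tidied up is to build the column name from the array index, so a single loop covers both. This is only a sketch: the %row contents below are made up to match the assumed row layout, and the values are taken from the East/Banker output further down.

```perl
use strict;
use warnings;

my %BranchData;

# Hypothetical row, held as a plain hash for this illustration.
my %row = (
    branch => 'East', job => 'Banker', period => '9AM-12PM', p_type => 'Overtime',
    day1 => 4.25, day2 => 0, day3 => 0, day4 => 1.25,
    day5 => 1.5,  day6 => 1.5, day7 => 0,
);

for my $day (1 .. 7) {
    # "day$day" yields 'day1' .. 'day7', matching the array index used here.
    $BranchData{ $row{branch} }{ $row{job} }{ $row{period} }[$day]{ $row{p_type} }
        += $row{"day$day"};
}

print $BranchData{East}{Banker}{'9AM-12PM'}[4]{Overtime}, "\n";   # prints 1.25
```

Because the assignment uses +=, feeding in a second row with the same branch, job, period and type would add to the existing totals rather than overwrite them.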
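You have to worry about initializing things correctly (or maybe you don't, since Perl will do it for you). I'm assuming that the row is returned as a plain hash (rather than a hash reference), with keys for branch, job, period, period type (p_type) and each day ('day1' .. 'day7').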
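The "Perl will do it for you" part is autovivification: assigning through a chain of subscripts creates any missing intermediate hashes and arrays. A small sketch, using a throwaway %data hash and one value from the North/Janitor output below:

```perl
use strict;
use warnings;

my %data;

# No level of %data exists yet; this one statement creates all of them.
# Applying += to an undefined element does not raise a warning.
$data{North}{Janitor}{'5PM-12AM'}[1]{Variance} += -4.25;

my $branch_kind = ref($data{North});                        # 'HASH'
my $period_kind = ref($data{North}{Janitor}{'5PM-12AM'});   # 'ARRAY'

# Period types that never appeared are simply absent, not 0.
my $has_regular = exists($data{North}{Janitor}{'5PM-12AM'}[1]{Regular}) ? 1 : 0;

print "$branch_kind $period_kind $has_regular\n";           # prints: HASH ARRAY 0
```

The last check is the reason you might still initialize explicitly: if you want every day to report all three period types, including those with no data, the zeros have to be put there by hand.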
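If you know in advance which day you need, you can avoid accumulating all of the days. However, more general reporting may be simpler if you always read and accumulate all the data, and then let the printing code deal with whichever subset of the data needs to be processed.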
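As a rough sketch of that "accumulate everything, print a subset" idea, the hypothetical helper below (day_report is not part of any module) walks the structure and produces one comma-separated line per branch/job/period for a single chosen day. The column order Overtime, Regular, Variance is an assumption; adjust it to whatever your report needs.

```perl
use strict;
use warnings;

# Hypothetical helper: one line per branch/job/period for the given day.
sub day_report {
    my ($data, $day) = @_;
    my @types = qw(Overtime Regular Variance);    # assumed column order
    my @lines;
    for my $branch (sort keys %$data) {
        for my $job (sort keys %{ $data->{$branch} }) {
            for my $period (sort keys %{ $data->{$branch}{$job} }) {
                # Fall back to an empty hash if nothing was stored for this day.
                my $totals = $data->{$branch}{$job}{$period}[$day] || {};
                my @values = map { $totals->{$_} // 0 } @types;
                push @lines, join(',', $branch, $job, $period, @values);
            }
        }
    }
    return @lines;
}

my %BranchData = (
    South => { Manager => { '12A-9AM' => [
        undef,
        { Overtime => 77.75, Regular => 0, Variance => 0 },
    ] } },
);

my @report = day_report(\%BranchData, 1);
print "$_\n" for @report;    # prints: South,Manager,12A-9AM,77.75,0,0
```

The plain join is only safe while no field contains a comma or a quote; for real output it would be more robust to hand each row to Text::CSV for writing, just as it is used for reading.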
This was an interesting enough problem that I've hacked together this code. I doubt that it is optimal, but it does work.
#!/usr/bin/env perl
#
# SO 8570488

use strict;
use warnings;
use Text::CSV;
use Data::Dumper;
use constant debug => 0;

my $file = "input.csv";

my $csv = Text::CSV->new({ binary => 1, eol => $/ })
    or die "Cannot use CSV: ".Text::CSV->error_diag();

my @headings = qw(branch job period p_type day1 day2 day3 day4 day5 day6 day7);
my @days     = qw(day0 day1 day2 day3 day4 day5 day6 day7);

my %BranchData;

open my $in, '<', $file or die "Unable to open $file for reading ($!)";

$csv->column_names(@headings);

while (my $row = $csv->getline_hr($in))
{
    print Dumper($row) if debug;
    my %r = %$row;  # Not for efficiency; for notational compactness

    $BranchData{$r{branch}} = { } if !defined $BranchData{$r{branch}};
    my $branch = $BranchData{$r{branch}};

    $branch->{$r{job}} = { } if !defined $branch->{$r{job}};
    my $job = $branch->{$r{job}};

    $job->{$r{period}} = [ ] if !defined $job->{$r{period}};
    my $period = $job->{$r{period}};

    for my $day (1..7)
    {
        # Assume that Overtime, Regular and Variance are the only types
        # Otherwise, you need yet another level of checking whether elements exist...
        $period->[$day] = { Overtime => 0, Regular => 0, Variance => 0} if !defined $period->[$day];
        $period->[$day]->{$r{p_type}} += $r{$days[$day]};
    }
}

print Dumper(\%BranchData);
Given your sample data, the output from this is:
$VAR1 = {
'West' => {
'Electrician' => {
'12PM-5PM' => [
undef,
{
'Regular' => '4.25',
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => '-1.25',
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => '-1.5',
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => '-1.5',
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
}
]
}
},
'South' => {
'Manager' => {
'12A-9AM' => [
undef,
{
'Regular' => 0,
'Overtime' => '77.75',
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => '14.75',
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 10,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 10,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 10,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 10,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 10,
'Variance' => 0
}
]
}
},
'North' => {
'Janitor' => {
'5PM-12AM' => [
undef,
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => '-4.25'
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => '-1.25'
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => '-1.5'
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => '-1.5'
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
}
]
}
},
'East' => {
'Banker' => {
'9AM-12PM' => [
undef,
{
'Regular' => 0,
'Overtime' => '4.25',
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => '1.25',
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => '1.5',
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => '1.5',
'Variance' => 0
},
{
'Regular' => 0,
'Overtime' => 0,
'Variance' => 0
}
]
}
}
};
Have fun taking it from here!
Another module worth considering is Text::CSV::Encoded, which I use to handle UTF-8. – reinierpost 2011-12-20 09:34:12
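I believe this code will do what I need! I just need to output to another CSV file in the following format:
South,Manager,12A-9AM,77.75,14.75,16
In the line above, the last 3 values are the day1 values for the three periodTypes (Overtime, Regular and Variance). – user1107055 2011-12-20 17:06:08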