我在Perl新手和我的功課之一,我想出了這樣一個解決方案:如何這可以在更多的Perl的完成方式
#wordcount.pl FILE
#
#if no filename is given, print help and exit
if (length($ARGV[0]) < 1)
{
print "Usage is : words.pl word filename\n";
exit;
}
my $file = $ARGV[0]; #filename given in commandline
open(FILE, $file); #open the mentioned filename
while(<FILE>) #continue reading until the file ends
{
chomp;
tr/A-Z/a-z/; #convert all upper case words to lower case
tr/.,:;!?"(){}//d; #remove some common punctuation symbols
#We are creating a hash with the word as the key.
#Each time a word is encountered, its hash is incremented by 1.
#If the count for a word is 1, it is a new distinct word.
#We keep track of the number of words parsed so far.
#We also keep track of the no. of words of a particular length.
foreach $wd (split)
{
$count{$wd}++;
if ($count{$wd} == 1)
{
$dcount++;
}
$wcount++;
$lcount{length($wd)}++;
}
}
#To print the distinct words and their frequency,
#we iterate over the hash containing the words and their count.
print "\nThe words and their frequency in the text is:\n";
foreach $w (sort keys%count)
{
print "$w : $count{$w}\n";
}
#For the word length and frequency we use the word length hash
print "The word length and frequency in the given text is:\n";
foreach $w (sort keys%lcount)
{
print "$w : $lcount{$w}\n";
}
print "There are $wcount words in the file.\n";
print "There are $dcount distinct words in the file.\n";
$ttratio = ($dcount/$wcount)*100; #Calculating the type-token ratio.
print "The type-token ratio of the file is $ttratio.\n";
我已經包含了評論提什麼確實。其實我必須從給定的文本文件中找到字數。上述程序的輸出將如下所示:
The words and their frequency in the text is:
1949 : 1
a : 1
adopt : 1
all : 2
among : 1
and : 8
assembly : 1
assuring : 1
belief : 1
citizens : 1
constituent : 1
constitute : 1
.
.
.
The word length and frequency in the given text is:
1 : 1
10 : 5
11 : 2
12 : 2
2 : 15
3 : 18
There are 85 words in the file.
There are 61 distinct words in the file.
The type-token ratio of the file is 71.7647058823529.
即使在Google的幫助下,我也可以找到我作業的解決方案。不過,我認爲使用Perl的真正威力將會有一個小而簡潔的代碼。任何人都可以用更少的代碼行給我一個Perl解決方案嗎?
根據您的使用情況報表,文件名是第二個參數。這與您的代碼相矛盾。 –
建議之一是:不要明確使用open。只需使用<>。 Perl會將ARGV中的每個參數解釋爲一個文件名,並且<>將從中讀取。 –
@WilliamPursell:是文件名是第二個參數.. – sriram