Counting letters in a file with a shell script
I need a shell script or PowerShell script that counts how many times each letter occurs in a file.
Input:
this is the sample of this script.
This script counts similar letters.
Output:
t 9
h 4
i 8
s 10
e 4
a 2
...
This one-liner should do it:
awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)if(tolower($i)~/[a-z]/)a[tolower($i)]++}
END{for(x in a)print x, a[x]}' file
Output for your example:
u 1
h 4
i 8
l 3
m 2
n 1
a 2
o 2
c 3
p 3
r 4
e 4
f 1
s 10
t 9
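A side note on portability: splitting a line into single characters with FS="" works in gawk (and, as far as I know, mawk), but POSIX leaves an empty FS unspecified. Below is a minimal sketch of the same count that walks each line with substr() instead. The function name count_letters_awk is made up for illustration, and the trailing sort is only there to make the order predictable.

```shell
# count_letters_awk [FILE...]
# Hypothetical helper: counts the letters a-z case-insensitively and
# prints "letter count", one letter per line, in alphabetical order.
# Reads standard input when no file is given.
count_letters_awk() {
    awk '
    {
        line = tolower($0)              # fold case once per line
        n = length(line)
        for (i = 1; i <= n; i++) {
            ch = substr(line, i, 1)     # i-th character, no FS tricks needed
            if (ch ~ /^[a-z]$/) count[ch]++
        }
    }
    END { for (ch in count) print ch, count[ch] }
    ' "$@" | sort
}

# Try it on the sample text from the question
printf '%s\n' 'this is the sample of this script.' \
              'This script counts similar letters.' | count_letters_awk
```

For the sample text this should give the same counts as the output above, just in alphabetical order.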
In PowerShell, you can do this with the Group-Object cmdlet:
function Count-Letter {
    param(
        [String]$Path,
        [Switch]$IncludeWhitespace,
        [Switch]$CaseSensitive
    )

    # Read the file, convert to char array, and pipe to group-object
    # Convert input string to lowercase if CaseSensitive is not specified
    $CharacterGroups = if($CaseSensitive){
        (Get-Content $Path -Raw).ToCharArray() | Group-Object -NoElement
    } else {
        (Get-Content $Path -Raw).ToLower().ToCharArray() | Group-Object -NoElement
    }

    # Remove any whitespace character group if IncludeWhitespace parameter is not bound
    if(-not $IncludeWhitespace){
        $CharacterGroups = $CharacterGroups |Where-Object { "$($_.Name)" -match "\S" }
    }

    # Return the groups, letters first and count second in a default format-table
    $CharacterGroups |Select-Object @{Name="Letter";Expression={$_.Name}},Count
}
This is what the output looks like on my machine with the sample input plus a line break.
Thanks, but how would it look as a .ps1 file? I need to call it like this: task.ps1 letters.txt –
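@MolnárBence Remove the 'function Count-Letter {}' block so that the first line of your 'task.ps1' file is the opening 'param(' - then you can invoke it the way you describe –
I don't want the header at the top of the output, so I moved it into 'Format-Table -HideTableHeaders', but it doesn't work. You can see the problem here: http://people.inf.elte.hu/bencehun93/error.jpg –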
A PowerShell one-liner:
"this is the sample of this script".ToCharArray() | group -NoElement | sort Count -Descending | where Name -NE ' '
I would move the filter before the sort (no need to sort anything you are going to discard) –
echo "this is the sample of this script" | \
sed -e 's/ //g' -e 's/\([A-Za-z]\)/\1|/g' | tr '|' '\n' | \
sort | grep -v "^$" | uniq -c | \
awk '{printf "%s %s\n",$2,$1}'
echo "this is the sample of this script. \
This script counts similar letters." | \
grep -o '.' | sort | uniq -c | sort -rg
Output, sorted with the most common letters first:
10 s
10
8 t
8 i
4 r
4 h
4 e
3 p
3 l
3 c
2 o
2 m
2 a
2 .
1 u
1 T
1 n
1 f
Note: no sed or awk is needed; a simple grep -o '.' does all the heavy lifting. To leave out spaces and punctuation, replace '.' with '[[:alpha:]]':
echo "this is the sample of this script. \
This script counts similar letters." | \
grep -o '[[:alpha:]]' | sort | uniq -c | sort -rg
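To count uppercase and lowercase letters as one, use the --ignore-case option of sort (-f) and uniq (-i). Note that sort -i means --ignore-nonprinting, not ignore-case:
echo "this is the sample of this script. \
This script counts similar letters." | \
grep -o '[[:alpha:]]' | sort -f | uniq -ic | sort -rg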
Output:
10 s
9 t
8 i
4 r
4 h
4 e
3 p
3 l
3 c
2 o
2 m
2 a
1 u
1 n
1 f
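Since the question asks for something that takes a file name and prints "letter count", here is a rough sketch of the same grep -o pipeline wrapped in a function. The name letter_freq is hypothetical. It lowercases with tr up front rather than relying on the case options of sort and uniq, because sort order depends on the locale, and it swaps the columns with awk to match the format in the question. It assumes a grep that supports -o (GNU and BSD grep do).

```shell
# letter_freq [FILE...]
# Hypothetical helper: prints "letter count" lines, most frequent first,
# with ties in alphabetical order. Reads standard input when no file is given.
letter_freq() {
    cat -- "$@" |
        tr '[:upper:]' '[:lower:]' |      # treat T and t as the same letter
        grep -o '[[:alpha:]]' |           # one letter per line, drops spaces and punctuation
        sort | uniq -c |                  # "count letter"
        awk '{ print $2, $1 }' |          # swap to "letter count"
        sort -k2,2nr -k1,1                # by count descending, then by letter
}

# Try it on the sample text from the question
printf '%s\n' 'this is the sample of this script.' \
              'This script counts similar letters.' | letter_freq
```

With a file it would be called as letter_freq letters.txt. Saved as a script, the function body can be used directly with "$@" standing for the script's arguments.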
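Would the downvoter care to leave a comment? – Kent
I didn't downvote you, but this isn't PowerShell. – aphoria
@aphoria I read it as 'shell script / powershell'. Save the line in a file and it is a shell script. The question is also tagged 'shell'. – Kent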