UNIX - count occurrences of a character per line/field

Given a data file like this (i.e. the stores.dat file):

sid|storeNo|latitude|longitude 
2tt|1|-28.0372000t0|153.42921670 
9|2t|-33tt.85t09t0000|15t1.03274200

What is the command that would return the number of occurrences of the 't' character per line?

E.g. it would return:

count lineNum 
    4  1 
    3  2 
    6  3

Also, what is the command to do this per field, returning results like the following?

E.g. for input of column 2 and character 't':

count lineNum 
    1  1 
    0  2 
    1  3

E.g. for input of column 3 and character 't':

count lineNum 
    2  1 
    1  2 
    4  3

Source

2011-12-25 toop

Take a look at http://www.gnu.org/software/gawk/manual/gawk.html, it is a very powerful Unix tool – Chris

See http://unix.stackexchange.com/questions/18736/how-to (number of a specific character in each line) –

To count the occurrences of a character per line, you can do:

awk -F'|' 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"") "\t" NR}' file 
count lineNum 
4  1 
3  2 
6  3

To count the occurrences of a character per field/column, you can do:

Column 2:

awk -F'|' -v fld=2 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file 
count lineNum 
1  1 
0  2 
1  3

Column 3:

awk -F'|' -v fld=3 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file 
count lineNum 
2  1 
1  2 
4  3

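The return value of the gsub() function is the number of substitutions made, so we use it to print the count.
NR holds the line number, so we use it to print the line number.
To print the occurrences in a particular field, we create a variable fld and set it to the number of the field we want the counts from.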
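
The explanation above can be wrapped into a small reusable shell function. This is only a sketch: the name count_char and its argument order are invented for illustration, and it assumes '|'-delimited input as in the question. Because gsub() takes a regular expression, a character such as '.' would need escaping before being passed in.

```shell
# count_char CHAR FIELD [FILE...]   (hypothetical helper, not part of the answer)
# Prints "count<TAB>lineNum" for each input line.
# FIELD=0 counts in the whole line; FIELD=n counts only in the n-th '|' field.
count_char() {
    ch=$1; fld=$2; shift 2
    awk -F'|' -v ch="$ch" -v fld="$fld" '
        {
            # Work on a copy so the record itself is left untouched.
            s = (fld == 0) ? $0 : $fld
            # gsub() returns the number of substitutions it made.
            print gsub(ch, "", s) "\t" NR
        }' "$@"
}
```

For example, `count_char t 3 stores.dat` should reproduce the column 3 rows shown above, without the header line.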

Source

2011-12-25 11:43:02

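Awesome! Thanks for sticking with it - this works. – toop

It also prints "0" (occurrences), which may be unwanted output –

@TarunSapra That is actually shown as the expected result in the question. –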

One possible solution using perl:

Contents of script.pl:

use warnings; 
use strict; 

## Check arguments: 
## 1.- Input file 
## 2.- Char to search. 
## 3.- (Optional) field to search. If blank, zero or bigger than number 
##  of columns, default to search char in all the line. 
(@ARGV == 2 || @ARGV == 3) or die qq(Usage: perl $0 input-file char [column]\n); 

my ($char,$column); 

## Get values or arguments. 
if (@ARGV == 3) { 
     ($char, $column) = splice @ARGV, -2; 
} else { 
     $char = pop @ARGV; 
     $column = 0; 
} 

## Check that $char must be a non-white space character and $column 
## only accept numbers. 
die qq[Bad input\n] if $char !~ m/^\S$/ or $column !~ m/^\d+$/; 

print qq[count\tlineNum\n]; 

while (<>) { 
     ## Remove last '\n' 
     chomp; 

     ## Get fields. 
     my @f = split /\|/; 

     ## If column is a valid one, select it to the search. 
     if ($column > 0 and $column <= scalar @f) { 
       $_ = $f[ $column - 1]; 
     } 

     ## Count. 
     my $count = eval qq[tr/$char/$char/]; 

     ## Print result. 
     printf qq[%d\t%d\n], $count, $.; 
}

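The script accepts three arguments:

Input file
Char to search for
Column to search: if the column is a bad number, it searches the whole line.

Running the script with no arguments: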

perl script.pl 
Usage: perl script.pl input-file char [column]

With arguments, and the output:

Here 0 is a bad column, so it searches the whole line.

perl script.pl stores.dat 't' 0 
count lineNum 
4  1 
3  2 
6  3

Here it searches in column 1:

perl script.pl stores.dat 't' 1 
count lineNum 
0  1 
2  2 
0  3

Here it searches in column 3:

perl script.pl stores.dat 't' 3 
count lineNum 
2  1 
1  2 
4  3

th is not a single character.

perl script.pl stores.dat 'th' 3 
Bad input

Source

2011-12-25 12:38:52 Birei

Wow, I must learn perl – toop

I like this a lot, but accepted the other answer as it is easier to integrate with bash – toop

cat stores.dat | awk 'BEGIN {FS = "|"}; {print $1}' | awk 'BEGIN {FS = "t"}; {print NF - 1}'

Where $1 is the number of the column to count in.

Source

2012-02-21 19:31:54 Jelena

No need for awk or perl, just bash and standard Unix tools:

cat file | tr -c -d "t\n" | cat -n | 
    { echo "count lineNum" 
    while read num data; do 
     test ${#data} -gt 0 && printf "%4d %5d\n" ${#data} $num 
    done; }

And for a specific column:

cut -d "|" -f 2 file | tr -c -d "t\n" | cat -n | 
    { echo -e "count lineNum" 
    while read num data; do 
     test ${#data} -gt 0 && printf "%4d %5d\n" ${#data} $num 
    done; }

We can even avoid tr and the cats:

echo "count lineNum" 
num=1 
while read data; do 
    new_data=${data//t/} 
    count=$((${#data}-${#new_data})) 
    test $count -gt 0 && printf "%4d %5d\n" $count $num 
    num=$(($num+1)) 
done < file

And even cut:

echo "count lineNum"
num=1; OLD_IFS=$IFS; IFS="|"
while read -a array_data; do
    data=${array_data[1]}
    new_data=${data//t/}
    count=$((${#data}-${#new_data}))
    test $count -gt 0 && printf "%4d %5d\n" $count $num
    num=$(($num+1))
done < file
IFS=$OLD_IFS

Source

2012-02-22 13:12:47 jfg956

grep -n -o "t" stores.dat | sort -n | uniq -c | cut -d : -f 1

Gives almost exactly the output you want:

    4 1
    3 2
    6 3

Thanks to @raghav-bhushan for the grep -o hint, what a useful flag. The -n flag includes the line number as well.
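
As a rough sketch of the same pipeline packaged as a function: the name count_grep is made up, and it assumes a grep that supports -o (GNU and BSD grep do) and a single input file or stdin. Since grep emits matches in input order, this version leaves out the sort step. It adds -F so the character is matched literally, and a final awk to strip the padding uniq -c adds. Lines with no match produce no row at all.

```shell
# count_grep CHAR [FILE]   (hypothetical helper, not part of the answer)
# Prints "count lineNum" for every line containing CHAR at least once.
count_grep() {
    ch=$1; shift
    grep -n -o -F -e "$ch" "$@" |   # one "lineNum:CHAR" line per match
        cut -d : -f 1 |             # keep only the line number
        uniq -c |                   # collapse repeats into "count lineNum"
        awk '{ print $1, $2 }'      # drop the leading padding from uniq -c
}
```

Called as `count_grep t stores.dat`, it should print the same three rows as above, just without the leading spaces.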

Source

2013-03-12 16:00:32

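This is a more elegant and general solution. –

+1 for not making me type all that awk – slf

I think the 'sort -n' can be dispensed with - isn't the output already in line-number order? –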

$ cat -n test.txt 
1 test 1 
2 you want 
3 void 
4 you don't want 
5 ttttttttttt 
6 t t t t t t 

$ awk '{n=split($0,c,"t")-1;if (n!=0) print n,NR}' test.txt 
2 1 
1 2 
2 4 
11 5 
6 6

Source

2013-11-13 14:11:16

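You can also split the line or field on "t" and check the length of the resulting array minus 1. Set the col variable to 0 for the whole line, or to a column number (e.g. 1 to 3) for a single column: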

awk -F'|' -v col=0 -v OFS=$'\t' 'BEGIN { 
    print "count", "lineNum" 
}{ 
    split($col, a, "t"); print length(a) - 1, NR 
} 
' stores.dat

Source

2013-11-13 14:44:22

awk '{gsub("[^t]",""); print length($0),NR;}' stores.dat

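The call to gsub() deletes everything in the line that is not a t, then we just print the length of what remains and the current line number.

Want to do it for column 2 only?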

awk 'BEGIN{FS="|"} {gsub("[^t]","",$2); print NR,length($2);}' stores.dat

Source

2014-02-16 13:54:03 vulcan

To count the occurrences of a character per line:

$ awk -v FS='t' '{print NF-1, NR}' input.txt
4 1 
3 2 
6 3

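This sets the field separator to the character to be counted, then uses the fact that the number of fields is one greater than the number of separators.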
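
A minimal sketch of the field-splitting trick as a function follows; the name count_by_split is hypothetical. It handles two caveats. The gawk manual documents that -F with the single letter t is taken to mean a tab, so the sketch assigns FS with -v instead. An empty line has NF equal to 0, so NF-1 would be -1 without a guard.

```shell
# count_by_split CHAR [FILE...]   (hypothetical helper, not part of the answer)
# Prints "count lineNum" for each input line by splitting the line on CHAR.
count_by_split() {
    ch=$1; shift
    awk -v FS="$ch" '
        {
            # Fields = separators + 1, except on an empty line where NF is 0.
            n = (NF > 0) ? NF - 1 : 0
            print n, NR
        }' "$@"
}
```

To restrict the count to one column, the output of cut can be piped in, e.g. `cut -d '|' -f 2 input.txt | count_by_split t`.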

To count occurrences in a specific column, first cut out that column:

$ cut -d '|' -f 2 input.txt | awk -v FS='t' '{print NF-1, NR}'
1 1 
0 2 
1 3 

$ cut -d '|' -f 3 input.txt | awk -v FS='t' '{print NF-1, NR}'
2 1 
1 2 
4 3

Source

2014-12-18 14:32:35 artm

perl -e 'while(<>) { $count = tr/t//; print "$count ".++$x."\n"; }' stores.dat

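Another perl answer, yay! The tr/t// operator returns the number of translations that occurred on the line, in other words the number of times tr found the character 't'. ++$x keeps the line number count.

Source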

2016-05-17 15:04:45
