程式扎記: [Linux FAQ] Getting the count of unique values in a column in bash

Friday, October 23, 2020


Source From Here

Question
I have tab-delimited files with several columns. I want to count how often each distinct value occurs in one column across all the files in a folder, and sort the results in decreasing order of count (highest count first). How would I accomplish this in a Linux command-line environment?

The solution can use any common command-line language such as awk, perl, or python.

HowTo
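To see a frequency count for column two (for example), extract that column with awk, sort it so that identical values sit next to each other, count each run with uniq -c, and then sort the counts numerically in descending order: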

# awk -F '\t' '{print $2}' * | sort | uniq -c | sort -nr

fileA.txt

z    z    a  
a    b    c  
w    d    e  

fileB.txt

t    r    e  
z    d    a  
a    g    c  

fileC.txt

z    r    a  
v    d    c  
a    m    c  

Result:

3 d  
2 r  
1 z  
1 m  
1 g  
1 b  
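
To try the example end to end, the following is a minimal sketch that rebuilds the three sample files in a throwaway directory and runs the same pipeline on them. The scratch directory and the variable names (workdir, counts, counts_awk) are assumptions made for illustration and are not part of the original answer. The second command is an alternative, not the original method: it does the counting inside awk with an associative array, so only the final sort is needed.

```shell
#!/usr/bin/env bash
# Sketch: recreate the sample files, then count the values in column 2.
set -euo pipefail

workdir=$(mktemp -d)              # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT     # clean up when the script exits
cd "$workdir"

# printf turns \t into real tab characters, so the files are truly tab-delimited.
printf 'z\tz\ta\na\tb\tc\nw\td\te\n' > fileA.txt
printf 't\tr\te\nz\td\ta\na\tg\tc\n' > fileB.txt
printf 'z\tr\ta\nv\td\tc\na\tm\tc\n' > fileC.txt

# Pipeline from the answer:
#   awk      prints column 2 of every line in every file
#   sort     puts identical values next to each other
#   uniq -c  collapses each run into "count value"
#   sort -nr orders the lines by count, highest first
counts=$(awk -F '\t' '{print $2}' ./*.txt | sort | uniq -c | sort -nr)
printf '%s\n' "$counts"

# Alternative: let awk keep the tally in an associative array keyed by the value.
counts_awk=$(awk -F '\t' '{count[$2]++} END {for (v in count) print count[v], v}' ./*.txt | sort -nr)
printf '%s\n' "$counts_awk"
```

Both commands should put d first and r second. The order of the values that occur only once depends on how sort breaks ties, and uniq -c pads its counts with leading spaces, so the exact whitespace may differ from the listing above.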
