Friday, October 23, 2020

[Linux FAQ] Getting the count of unique values in a column in bash

 Source From Here

Question
I have tab-delimited files with several columns. I want to count how often each distinct value occurs in a given column across all the files in a folder, and sort the values in decreasing order of count (highest count first). How would I accomplish this in a Linux command-line environment?

The solution can use any common command-line language such as awk, Perl, or Python.

HowTo
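To see a frequency count for column two (for example), extract the column with awk, sort it so that identical values end up next to each other, count each run with uniq -c, then sort the counts numerically in descending order: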
# awk -F '\t' '{print $2}' * | sort | uniq -c | sort -nr

fileA.txt
z    z    a
a    b    c
w    d    e

fileB.txt
t    r    e
z    d    a
a    g    c

fileC.txt
z    r    a
v    d    c
a    m    c
Result:
3 d
2 r
1 z
1 m
1 g
1 b
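
As a variation on the pipeline above, the counting can also be done inside awk itself with an associative array, so that only the already-counted values need sorting. This is a minimal sketch, not part of the original answer: the function name count_column and its arguments are hypothetical. It assumes tab-delimited input, skips lines that have fewer fields than the requested column, and breaks ties alphabetically by value, so values with equal counts may appear in a different order than in the result above.

```shell
# Hypothetical helper: count_column COLUMN FILE...
# Prints "count<TAB>value" for every distinct value in the given column,
# highest count first, ties broken alphabetically by value.
count_column() {
    col="$1"
    shift
    awk -F '\t' -v col="$col" '
        NF >= col { count[$col]++ }    # tally each value seen in the column
        END {
            for (v in count)           # iteration order is unspecified, so sort afterwards
                printf "%d\t%s\n", count[v], v
        }
    ' "$@" | sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

With the three example files, calling count_column 2 fileA.txt fileB.txt fileC.txt would list d with a count of 3, then r with 2, then b, g, m and z with 1 each.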

