Thursday, March 12, 2020

[ FAQ ] Pandas: select rows if an ID appears several times

Source From Here
Question
I have a table like this:
  import pandas as pd

  datas = {'CustID': ['A', 'B', 'C', 'A'], 'Purchase': ['Item1', 'Item2', 'Item1', 'Item2']}
  df = pd.DataFrame.from_dict(datas)


I would like to select the rows whose CustID appears more than once in the table.

How-To
This will work:
  # Count how often each CustID occurs
  counts = df['CustID'].value_counts()
  # Keep the rows whose CustID occurs more than once
  df[df['CustID'].isin(counts.index[counts > 1])]
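To see why this works, it may help to look at the intermediate values. The following is a minimal sketch built on the sample table from the question; the names `repeated_ids` and `result` are introduced here only for illustration, and the filter is written as `counts[counts > 1].index`, which should select the same IDs as the expression above.

```python
import pandas as pd

df = pd.DataFrame({
    'CustID': ['A', 'B', 'C', 'A'],
    'Purchase': ['Item1', 'Item2', 'Item1', 'Item2'],
})

# Count how many times each CustID occurs (A: 2, B: 1, C: 1).
counts = df['CustID'].value_counts()

# Keep only the IDs that occur more than once (illustrative name).
repeated_ids = counts[counts > 1].index

# Select the rows whose CustID is one of those IDs.
result = df[df['CustID'].isin(repeated_ids)]
print(result)
```

For the sample data, only the two rows with CustID 'A' (index 0 and 3) should remain.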


Alternatively, the code below works too:
  # display() is available in Jupyter/IPython; use print() elsewhere
  display(df['CustID'].duplicated(keep=False))
  df[df['CustID'].duplicated(keep=False)]


This selects the rows of the data frame whose CustID value occurs more than once in the CustID column. Passing keep=False tells duplicated to mark every occurrence of a repeated value as True, whereas keep='first' (the default) or keep='last' leaves the first or last occurrence unmarked.
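The effect of the keep parameter can be checked directly. The sketch below uses a standalone Series with the same CustID values as the question; the variable names `ids`, `first`, `last`, and `every` are assumptions made for this example.

```python
import pandas as pd

ids = pd.Series(['A', 'B', 'C', 'A'], name='CustID')

# keep='first' (the default): the first occurrence is not flagged.
first = ids.duplicated(keep='first').tolist()  # [False, False, False, True]

# keep='last': the last occurrence is not flagged.
last = ids.duplicated(keep='last').tolist()    # [True, False, False, False]

# keep=False: every occurrence of a repeated value is flagged.
every = ids.duplicated(keep=False).tolist()    # [True, False, False, True]

print(first, last, every)
```

Only the keep=False mask flags both rows for customer 'A', which is why it is the one to use when all rows of a repeated ID should be selected.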

