程式扎記: [ FAQ ] Pandas: select rows if an ID appears several times

Thursday, March 12, 2020

[ FAQ ] Pandas: select rows if an ID appears several times

Source From Here
Question
I have a table like this:

import pandas as pd
datas = {'CustID': ['A', 'B', 'C', 'A'], 'Purchase': ['Item1', 'Item2', 'Item1', 'Item2']}
df = pd.DataFrame.from_dict(datas)

I would like to select the rows whose CustID appears more than once in the table.

How-To
This will work:

counts = df['CustID'].value_counts()  # number of rows per CustID
df[df['CustID'].isin(counts.index[counts > 1])]  # keep rows whose CustID occurs more than once
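The value_counts approach can be broken into steps to show what each piece contributes. The following is a minimal sketch, assuming the sample data from the question; the helper name `rows_with_repeated_ids` and its `min_count` parameter are hypothetical and introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'CustID': ['A', 'B', 'C', 'A'],
    'Purchase': ['Item1', 'Item2', 'Item1', 'Item2'],
})

def rows_with_repeated_ids(frame, column, min_count=2):
    """Hypothetical helper: keep rows whose value in `column` occurs at least `min_count` times."""
    counts = frame[column].value_counts()         # occurrences per distinct value
    repeated = counts.index[counts >= min_count]  # values that occur often enough
    return frame[frame[column].isin(repeated)]    # boolean mask keeps the matching rows

result = rows_with_repeated_ids(df, 'CustID')
print(result)  # the two rows with CustID 'A' (index 0 and 3)
```

Passing a larger `min_count` would select only IDs that appear at least that many times, which is one reason to prefer counting over a plain duplicate check when the threshold may change.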

The code below works too:

print(df['CustID'].duplicated(keep=False))  # show the boolean mask
df[df['CustID'].duplicated(keep=False)]  # keep rows where the mask is True

This finds the rows in the data frame whose CustID value occurs more than once in the CustID column. Passing keep=False tells the duplicated function to mark every occurrence of a repeated value as True. With keep='first' (the default) or keep='last', the first or last occurrence would be left as False instead.
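To make the effect of keep concrete, the sketch below compares the three values that duplicated accepts, again assuming the sample data from the question. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'CustID': ['A', 'B', 'C', 'A'],
    'Purchase': ['Item1', 'Item2', 'Item1', 'Item2'],
})

ids = df['CustID']
mask_first = ids.duplicated()             # keep='first' is the default: the first 'A' stays False
mask_last = ids.duplicated(keep='last')   # the last 'A' stays False
mask_all = ids.duplicated(keep=False)     # both 'A' rows are True

print(mask_first.tolist())  # [False, False, False, True]
print(mask_last.tolist())   # [True, False, False, False]
print(mask_all.tolist())    # [True, False, False, True]
```

Only the keep=False mask selects every row of a repeated customer, which is what the question asks for.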
