程式扎記: [ Python FAQ ] Pandas - Groupby: Count and mean combined

Wednesday, March 10, 2021

Source From Here

Question
I am working with pandas and trying to summarise a DataFrame as a count of certain categories, together with the mean sentiment score for each category. The table is full of strings that have different sentiment scores, and I want to group the rows by text source, reporting how many posts each source has as well as the average sentiment of those posts.

My (simplified) dataframe looks like this:

import pandas as pd  
import numpy as np  
  
df = pd.DataFrame(data=[  
    ['bar', 'some string', 0.13],  
    ['foo', 'alt string',  -0.8],  
    ['bar', 'another str',  0.7],  
    ['foo', 'some text',   -0.2],  
    ['foo', 'more text',   -0.5]],  
    columns=['source', 'text', 'sent']  
)  

My expected output will look like this:

source    count     mean_sent  
-----------------------------  
foo       3         -0.5  
bar       2         0.415  

HowTo
You can use groupby with agg (aggregate), applying 'size' to count the rows in each group and 'mean' to average the sentiment score:

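# Count the rows and average the sentiment score for each source
(df.groupby('source')
   .agg({'text': 'size', 'sent': 'mean'})                   # rows per group, mean of 'sent'
   .rename(columns={'text': 'count', 'sent': 'mean_sent'})  # match the expected column names
   .reset_index())                                          # turn 'source' back into a column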
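
As an alternative sketch, assuming pandas 0.25 or newer, named aggregation can produce the renamed columns in one step, without a separate rename. The `summary` variable and the final sort are additions for illustration, not part of the original answer. The sort only puts the source with the most posts first, as in the expected output above.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame(
    data=[
        ['bar', 'some string', 0.13],
        ['foo', 'alt string', -0.8],
        ['bar', 'another str', 0.7],
        ['foo', 'some text', -0.2],
        ['foo', 'more text', -0.5],
    ],
    columns=['source', 'text', 'sent'],
)

# Named aggregation: output_column=(input_column, aggregation)
summary = (
    df.groupby('source')
      .agg(count=('text', 'size'), mean_sent=('sent', 'mean'))
      .sort_values('count', ascending=False)  # assumption: largest group first
      .reset_index()                          # 'source' becomes a regular column
)

print(summary)
```

Note that 'size' counts every row in a group, while 'count' would skip rows where the chosen column is NaN. With this sample data the two give the same result.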
