Question
I want to merge several strings in a dataframe based on a groupby in Pandas.
This is my code so far:
- from io import StringIO
- import pandas as pd
- data = StringIO("""
- "name1","hej","2014-11-01"
- "name1","du","2014-11-02"
- "name1","aj","2014-12-01"
- "name1","oj","2014-12-02"
- "name2","fin","2014-11-01"
- "name2","katt","2014-11-02"
- "name2","mycket","2014-12-01"
- "name2","lite","2014-12-01"
- """)
- # load string as stream into dataframe; header=None because the data has no header row
- # (header=0 combined with names would discard the first data row)
- df = pd.read_csv(data, header=None, names=["name","text","date"], parse_dates=[2])
- # add column with month
- df["month"] = df["date"].dt.month
I can't work out how to use groupby to concatenate the strings in the "text" column within each group. Any help appreciated!
How-To
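You can group by the 'name' and 'month' columns and call transform with a lambda that joins the text entries. transform returns data aligned to the original df, so every row in a group receives the same joined string, and drop_duplicates then leaves one row per group: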
- # write the joined string of each (name, month) group back onto every row of that group
- df['text'] = df.groupby(['name','month'])['text'].transform(lambda x: ','.join(x))
- # keep one row per group
- df = df[['name','text','month']].drop_duplicates()
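To make "aligned to the original df" concrete, here is a minimal sketch on a made-up toy frame (the names toy and joined and the sample values are assumptions for illustration, not the question's data). It shows that transform hands back one value per original row, which is why the drop_duplicates step is needed afterwards.

```python
import pandas as pd

# Hypothetical toy data, smaller than the frame in the question
toy = pd.DataFrame({
    "name": ["a", "a", "b"],
    "month": [11, 11, 12],
    "text": ["x", "y", "z"],
})

# transform broadcasts each group's joined string back to every row of that group
joined = toy.groupby(["name", "month"])["text"].transform(lambda x: ",".join(x))

print(joined.index.equals(toy.index))  # True: same index as the input frame
print(joined.tolist())                 # ['x,y', 'x,y', 'z']
```

Both rows of group ("a", 11) carry the same string, so the result still has three rows until duplicates are dropped.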
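A simpler option is to call apply on the grouped 'text' column, which yields one joined string per group, and then reset_index to turn the group keys back into ordinary columns: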
- df.groupby(['name','month'])['text'].apply(lambda x: ','.join(x)).reset_index()
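For reference, the whole flow can be sketched end to end as a self-contained script. This is one possible variant rather than the answer's exact code: it swaps apply for agg with ",".join, and it assumes the CSV text has no header row (hence header=None). The variable name result is introduced here for illustration.

```python
from io import StringIO

import pandas as pd

# Same sample rows as in the question; there is no header line
data = StringIO('''"name1","hej","2014-11-01"
"name1","du","2014-11-02"
"name1","aj","2014-12-01"
"name1","oj","2014-12-02"
"name2","fin","2014-11-01"
"name2","katt","2014-11-02"
"name2","mycket","2014-12-01"
"name2","lite","2014-12-01"
''')

# header=None so the first data row is not consumed as a header
df = pd.read_csv(data, header=None, names=["name", "text", "date"],
                 parse_dates=["date"])
df["month"] = df["date"].dt.month

# One joined string per (name, month) group; reset_index restores the keys as columns
result = df.groupby(["name", "month"])["text"].agg(",".join).reset_index()
print(result)
```

With this data the result has four rows, one per (name, month) pair: "hej,du", "aj,oj", "fin,katt" and "mycket,lite". If the text column could contain missing or non-string values, ",".join would raise a TypeError, so such values would need converting or dropping first.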
This message was edited 7 times. Last update was at 11/03/2020 19:28:44