Not able to fetch all the columns of the DataFrame after applying the pandas groupby method
I have a sample DataFrame as below.
col1 col2 day col4
0 a1 b1 monday c1
1 a2 b2 tuesday c2
2 a3 b3 wednesday c3
3 a1 b1 monday c5
Here the combination 'a1 b1 monday' appears twice, so after the groupby the output should be:
col1 col2 day col4 count
a1 b1 monday c1 2
a2 b2 tuesday c2 1
a3 b3 wednesday c3 1
I tried using df.groupby(['col1','day'],sort=False).size().reset_index(name='Count')
and
df.groupby(['col1','day']).transform('count')
and the output is always
col1 day count
a1 monday 2
a2 tuesday 1
a3 wednesday 1
whereas my original data has 14 columns, and it does not make sense to list every column name in the groupby statement. Is there a better, more Pythonic way to achieve this?
First use groupby with transform to create your count column. Then use drop_duplicates to remove the duplicate rows:
df['count'] = df.groupby(['col1','day'],sort=False)['col1'].transform('size')
df.drop_duplicates(['col1', 'day'], inplace=True)
print(df)
col1 col2 day col4 count
0 a1 b1 monday c1 2
1 a2 b2 tuesday c2 1
2 a3 b3 wednesday c3 1
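For reference, here is the same idea as a self-contained sketch that builds the sample frame from the question and returns a new frame instead of modifying df in place. The names keys, counts and result are only illustrative, and resetting the index at the end is optional.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "col1": ["a1", "a2", "a3", "a1"],
    "col2": ["b1", "b2", "b3", "b1"],
    "day": ["monday", "tuesday", "wednesday", "monday"],
    "col4": ["c1", "c2", "c3", "c5"],
})

# Only the grouping keys are listed; the other columns come along untouched
keys = ["col1", "day"]

# transform('size') returns one value per original row (the size of that
# row's group), so it lines up with df and can be added as a new column
counts = df.groupby(keys, sort=False)["col1"].transform("size")

result = (
    df.assign(count=counts)
      .drop_duplicates(subset=keys)   # keeps the first row of each group
      .reset_index(drop=True)         # optional: renumber the rows
)
print(result)
```

Because drop_duplicates keeps the first occurrence by default, col4 shows c1 for the a1/monday group and c5 is discarded. If a different row per group should survive, the keep parameter of drop_duplicates can be changed.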