
Group By operations on Pandas DataFrame

In this post and the ones that follow, I am going to explain group by operations on a pandas DataFrame.

What is a Group By operation?

A group by operation splits the rows of a DataFrame into groups, where all rows in a group share the same value in one or more columns.

 

Let’s use the data set below to demonstrate the examples.

      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
1    Chamu   25    Chennai  Female
2     Joel   29  Hyderabad    Male
3     Gopi   41  Hyderabad    Male
4   Sravya   52  Bangalore  Female
5      Raj   23    Chennai    Male

 

Group by city

group_by_city = df.groupby('City')

 

group_by_city is an object of type ‘DataFrameGroupBy’, in which the rows are grouped by ‘City’. You can visualize the data as shown below.

Group Name: Bangalore
      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
4   Sravya   52  Bangalore  Female

Group Name: Chennai
    Name  Age     City  Gender
1  Chamu   25  Chennai  Female
5    Raj   23  Chennai    Male

Group Name: Hyderabad
   Name  Age       City Gender
2  Joel   29  Hyderabad   Male
3  Gopi   41  Hyderabad   Male
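Besides looping over the groups, a DataFrameGroupBy object can be inspected directly. The following is a minimal sketch, assuming the same sample data as above; the variable names city_to_rows, chennai and group_sizes are only illustrative and are not part of the original application.

```python
import pandas as pd

# Same sample data as used throughout this post
df = pd.DataFrame({
    'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', 'Raj'],
    'Age': [34, 25, 29, 41, 52, 23],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
    'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male'],
})

group_by_city = df.groupby('City')

# .groups maps each group name to the row labels that belong to it
city_to_rows = {name: labels.tolist() for name, labels in group_by_city.groups.items()}
print(city_to_rows)

# get_group returns the rows of a single group as a regular DataFrame
chennai = group_by_city.get_group('Chennai')
print(chennai)

# size() gives the number of rows in each group
group_sizes = group_by_city.size()
print(group_sizes)
```

Here get_group is handy when only one group is of interest, whereas the loop used later in this post is better suited to printing every group.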

 

Group By Gender and City

If you want to group by more than one column, pass a list of column names to the groupby method.

group_by_gender_city = df.groupby(['Gender', 'City'])

 

You can visualize the data of ‘group_by_gender_city’ as shown below. Each group name is a tuple with one value per grouping column.

Group Name: ('Female', 'Bangalore')
     Name  Age       City  Gender
4  Sravya   52  Bangalore  Female

Group Name: ('Female', 'Chennai')
    Name  Age     City  Gender
1  Chamu   25  Chennai  Female

Group Name: ('Male', 'Bangalore')
      Name  Age       City Gender
0  Krishna   34  Bangalore   Male

Group Name: ('Male', 'Chennai')
  Name  Age     City Gender
5  Raj   23  Chennai   Male

Group Name: ('Male', 'Hyderabad')
   Name  Age       City Gender
2  Joel   29  Hyderabad   Male
3  Gopi   41  Hyderabad   Male
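When grouping by several columns, a single group is addressed by a tuple of values. The sketch below assumes the same sample data; the names n_groups, male_hyderabad and has_female_hyderabad are hypothetical and used only for illustration.

```python
import pandas as pd

# Same sample data as used throughout this post
df = pd.DataFrame({
    'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', 'Raj'],
    'Age': [34, 25, 29, 41, 52, 23],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
    'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male'],
})

group_by_gender_city = df.groupby(['Gender', 'City'])

# Number of distinct (Gender, City) combinations present in the data
n_groups = group_by_gender_city.ngroups
print(n_groups)

# With several grouping columns, the group key is a tuple
male_hyderabad = group_by_gender_city.get_group(('Male', 'Hyderabad'))
print(male_hyderabad)

# Combinations that never occur in the data have no group at all
has_female_hyderabad = ('Female', 'Hyderabad') in group_by_gender_city.groups
print(has_female_hyderabad)
```

Note that only combinations that actually occur in the rows become groups, which is why five groups are listed above rather than six.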

 

The complete working application is given below.

 

hello_world.py

 

import pandas as pd


# Print the content of DataFrameGroupBy object
def print_group_by_result(group_by_object, label):
    print('*'*50)
    print(label,'\n')
    for group_name, group_data in group_by_object:
        print("Group Name:", group_name)
        print(group_data)
        print()
    print('*' * 50)


# Create a sample DataFrame
data = {'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'Age': [34, 25, 29, 41, 52, 23],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
print(df)

group_by_city = df.groupby('City')
print('\nGroup by city is')
print('type of group_by_city is : ', type(group_by_city))
print_group_by_result(group_by_city, 'Group by city details')

group_by_gender_city = df.groupby(['Gender', 'City'])
print('\nGroup by Gender and City is')
print('type of group_by_gender_city is : ', type(group_by_gender_city))
print_group_by_result(group_by_gender_city, 'Group by Gender and City details')

 

Output

      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
1    Chamu   25    Chennai  Female
2     Joel   29  Hyderabad    Male
3     Gopi   41  Hyderabad    Male
4   Sravya   52  Bangalore  Female
5      Raj   23    Chennai    Male

Group by city is
type of group_by_city is :  <class 'pandas.core.groupby.generic.DataFrameGroupBy'>
**************************************************
Group by city details 

Group Name: Bangalore
      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
4   Sravya   52  Bangalore  Female

Group Name: Chennai
    Name  Age     City  Gender
1  Chamu   25  Chennai  Female
5    Raj   23  Chennai    Male

Group Name: Hyderabad
   Name  Age       City Gender
2  Joel   29  Hyderabad   Male
3  Gopi   41  Hyderabad   Male

**************************************************

Group by Gender and City is
type of group_by_gender_city is :  <class 'pandas.core.groupby.generic.DataFrameGroupBy'>
**************************************************
Group by Gender and City details 

Group Name: ('Female', 'Bangalore')
     Name  Age       City  Gender
4  Sravya   52  Bangalore  Female

Group Name: ('Female', 'Chennai')
    Name  Age     City  Gender
1  Chamu   25  Chennai  Female

Group Name: ('Male', 'Bangalore')
      Name  Age       City Gender
0  Krishna   34  Bangalore   Male

Group Name: ('Male', 'Chennai')
  Name  Age     City Gender
5  Raj   23  Chennai   Male

Group Name: ('Male', 'Hyderabad')
   Name  Age       City Gender
2  Joel   29  Hyderabad   Male
3  Gopi   41  Hyderabad   Male

**************************************************

 

 

This post first appeared on Java Tutorial : Blog To Learn Java Programming.
