Different Types of Joins in Pandas



Pandas is one of the popular libraries used to perform data analysis and data manipulation. There are many advanced features to work with the tabular data such as join multiple data frames into one depending upon the common columns or indices of columns.

In python, there are different types of joins available which can be performed by using the merge() function along with the how parameter of the pandas library. Following are the different joins.

  • Inner Join

  • Outer Join

  • Left Join

  • Right Join

  • Cross Join

Inner Join

An Inner Join in the pandas library will return the rows which have the same values in both defined data frames and in the output. We can observe the columns of the dataframe that have the same columns. The following is the syntax for applying the inner join on the data frames.

 import pandas as pd pd.merge(dataframe1,dataframe2,how='inner' 

Where,

  • pandas is the name of the library.

  • pd is the alias name.

  • merge is the function to perform the join operation.

  • how specifies the type of join operation.

Example

In this example, we will pass two data frames to the merge() function along with the ?how' parameter by assigning inner join. Following is the code.

 import pandas as pd dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"], "Age": [20,40,78,90,38], "Language" : ["Python","Java","C","Python","Salesforce"]} dic2 = {"Name": ["John","Mark","Nancy","Sita","Ram"], "Age": [20,40,78,90,38], "class" : [10,9,12,9,5]} data1 = pd.DataFrame(dic1) print(data1) data2 = pd.DataFrame(dic2) print(data2) inner_join = pd.merge(data1,data2,how = "inner") inner_join.head() 

Output

 Name Age Language 0 John 20 Python 1 Mark 40 Java 2 Nancy 78 C 3 Sita 90 Python 4 Ram 38 Salesforce Name Age class 0 John 20 10 1 Mark 40 9 2 Nancy 78 12 3 Sita 90 9 4 Ram 38 5 

Outer Join

The outer join will return all the rows of both the data frames and if any missing data in the rows then it fills the null values. Following syntax is used for applying the outer join on data frames.

 import pandas as pd pd.merge(dataframe1,dataframe2,how='outer' 

Example

 import pandas as pd dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"], "Age": [20,40,78,90,38], "Language" : ["Python","Java","C","Python","Salesforce"]} dic2 = {"Name": ["John","Nancy","Sita","Ram"], "Age": [20,40,70,38], "class" : [10,12,9,5]} data1 = pd.DataFrame(dic1) print(data1) data2 = pd.DataFrame(dic2) print(data2) inner_join = pd.merge(data1,data2,how = "outer") inner_join.head() 

Output

 Name Age Language 0 John 20 Python 1 Mark 40 Java 2 Nancy 78 C 3 Sita 90 Python 4 Ram 38 Salesforce Name Age class 0 John 20 10 1 Nancy 40 12 2 Sita 70 9 3 Ram 38 5 

Left Join

The left join returns the entire rows of the left dataframe and all the matching rows of the right dataframe. If in case, there are no matching rows in the right data frame then the output data frame will be filled with the null values in the corresponding columns. The following is the syntax for applying the left join.

 import pandas as pd pd.merge(dataframe1,dataframe2,how='left' 

Example

Here in this example, we will apply the left join on the two input data frames then the output of the left join will be returned.

 import pandas as pd dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"], "Age": [2,40,78,90,38], "Language" : ["Python","Java","C","Python","Salesforce"]} dic2 = {"Name": ["John","Nancy","Sita","Ram"], "Age": [20,40,70,38], "class" : [10,12,9,5]} data1 = pd.DataFrame(dic1) print(data1) data2 = pd.DataFrame(dic2) print(data2) inner_join = pd.merge(data1,data2,how = "left") inner_join.head() 

Output

 Name Age Language 0 John 2 Python 1 Mark 40 Java 2 Nancy 78 C 3 Sita 90 Python 4 Ram 38 Salesforce Name Age class 0 John 20 10 1 Nancy 40 12 2 Sita 70 9 3 Ram 38 5 

Right Join

The right join returns the entire rows of the right dataframe and all the matching rows of the left dataframe. If in case there are no matching rows in the left data frame then the output data frame will be filled with the null values in the corresponding columns. The following is the syntax for applying the right join.

 import pandas as pd pd.merge(dataframe1,dataframe2,how='right' 

Example

In this example, we will apply the right join on the two input data frames then the output of the right join will be returned.

 import pandas as pd dic1 = {"Name": ["John","Mark","Nancy","Sita","Ram"], "Age": [2,40,78,90,38], "Language" : ["Python","Java","C","Python","Salesforce"]} dic2 = {"Name": ["John","Nancy","Sita","Ram"], "Age": [20,40,70,38], "class" : [10,12,9,5]} data1 = pd.DataFrame(dic1) print(data1) data2 = pd.DataFrame(dic2) print(data2) inner_join = pd.merge(data1,data2,how = "right") inner_join.head() 

Output

 Name Age Language 0 John 2 Python 1 Mark 40 Java 2 Nancy 78 C 3 Sita 90 Python 4 Ram 38 Salesforce Name Age class 0 John 20 10 1 Nancy 40 12 2 Sita 70 9 3 Ram 38 5 

Cross Join

The cross join returns the Cartesian product of the two data frames; meaning it returns all the possible combinations of rows from both data frames.

Updated on: 2023-10-20T12:43:50+05:30

516 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements