How are dataframes in Pandas merged?



Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its data structures. A DataFrame in Pandas is a two-dimensional data structure, i.e., data is arranged in tabular form in rows and columns.

In this article, we will see how to merge dataframes in Python. We will use the merge() method. Following is the syntax:

dataframe.merge(right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate) 

Here,

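| Parameter | Value | Description |
|---|---|---|
| right | DataFrame or Series | The object to merge with |
| how | 'left', 'right', 'outer', 'inner' (default), 'cross' | The type of merge to perform |
| on | String or list | Column or index level name(s) to join on; they must exist in both DataFrames |
| left_on | String or list | Column or index level name(s) to join on in the left DataFrame |
| right_on | String or list | Column or index level name(s) to join on in the right DataFrame |
| left_index | True or False | Whether to use the index of the left DataFrame as the join key |
| right_index | True or False | Whether to use the index of the right DataFrame as the join key |
| sort | True or False | Whether to sort the result by the join keys |
| suffixes | List | Strings to append to overlapping column names |
| copy | True or False | Whether to copy the data |
| indicator | True, False, or String | Whether to add a column (named _merge by default) showing which DataFrame each row came from |
| validate | String | Checks that the merge is of the given type, for example 'one_to_one' |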
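The key-selection parameters are easier to follow with a small example. The sketch below is illustrative only: the `teams` frame and its `Name` and `Team` columns are hypothetical and do not appear elsewhere in this article. It joins once on differently named columns with left_on and right_on, and once on the index with left_index and right_index.

```python
import pandas as pd

players = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
# Hypothetical second frame whose key column has a different name
teams = pd.DataFrame({'Name': ['Steve', 'Kane'], 'Team': ['A', 'B']})

# left_on / right_on: join on columns that have different names
by_column = players.merge(teams, how='inner', left_on='Player', right_on='Name')

# left_index / right_index: join on the index instead of a column
by_index = players.set_index('Player').merge(
    teams.set_index('Name'), how='inner', left_index=True, right_index=True
)

print(by_column)
print(by_index)
```

When the key columns share a name in both frames, passing that name through on is usually the shorter option.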

Merge Dataframes using the merge() method with keys from right dataframe

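To merge dataframes, we will use the merge() method. Setting the how parameter to right uses only the keys from the right frame, similar to a SQL right outer join. Because no on argument is passed, merge() joins on every column the two frames have in common, here both Player and Age.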

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to right
res = df1.merge(df2, how='right')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
  Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
  Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
  Player  Age
0  Steve   31
1   Kane   27
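A right join can also be read as a left join with the two frames swapped. The sketch below reuses the same data to illustrate this, and additionally calls the top-level pd.merge() function, which performs the same operation as the method form. The rows match here because both frames have identical columns; in general, swapping the frames also changes the column order.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Method form of the right join
right_join = df1.merge(df2, how='right')

# Function form of the same merge
right_join_func = pd.merge(df1, df2, how='right')

# Swapping the frames turns the right join into a left join
mirrored = df2.merge(df1, how='left')

print(right_join)
print(right_join_func)
print(mirrored)
```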

Merge Dataframes using the merge() method with keys from left dataframe

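To merge dataframes, we will use the merge() method. Setting the how parameter to left uses only the keys from the left frame, similar to a SQL left outer join. With no on argument, the key is the combination of the shared columns Player and Age, so the result contains exactly the rows of the left frame.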

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to left
res = df1.merge(df2, how='left')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
  Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
  Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
  Player  Age
0  Steve   29
1  David   25
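The effect of a left join is more visible when only one column is used as the key. As a minimal sketch with the same data, passing on='Player' (an addition not used in the example above) keeps every left row and fills the right-hand Age with NaN where no match exists.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Join on Player only; the two Age columns get the default _x / _y suffixes
res = df1.merge(df2, how='left', on='Player')

# David has no match in df2, so his Age_y value is missing (NaN)
print(res)
print(res['Age_y'].isna().tolist())
```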

Merge Dataframes with union of keys from both dataframes

To merge dataframes, we will use the merge() method. Setting the how parameter to outer uses the union of keys from both frames, similar to a SQL full outer join. The order of the rows in the result may vary between pandas versions.

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to outer, i.e. the union of keys
res = df1.merge(df2, how='outer')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
  Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
  Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
  Player  Age
0  Steve   29
1  David   25
2  Steve   31
3   Kane   27
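To see which frame each row of an outer join came from, the indicator parameter from the syntax above can be switched on. The following sketch uses the same data; the added _merge column is the default name pandas gives the indicator.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# indicator=True adds a _merge column with left_only, right_only, or both
res = df1.merge(df2, how='outer', indicator=True)

# Count rows by origin; no (Player, Age) pair occurs in both frames
left_only = (res['_merge'] == 'left_only').sum()
right_only = (res['_merge'] == 'right_only').sum()
both = (res['_merge'] == 'both').sum()

print(res)
print(left_only, right_only, both)
```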

Merge Dataframes with intersection of keys from both dataframes

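To merge dataframes, we will use the merge() method. Setting the how parameter to inner uses the intersection of keys from both frames, similar to a SQL inner join. Since the key is the combination of Player and Age, and no row has the same Player and Age in both frames, the result in this example is empty.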

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to inner
res = df1.merge(df2, how='inner')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
  Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
  Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
Empty DataFrame
Columns: [Player, Age]
Index: []
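The empty result follows from joining on both Player and Age. As a hedged variation on the example, restricting the key to Player with the on parameter yields a match for Steve; the suffix strings `_df1` and `_df2` are arbitrary names chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Default key: all shared columns (Player and Age), so nothing matches
default_key = df1.merge(df2, how='inner')

# Key restricted to Player; suffixes label the two overlapping Age columns
player_key = df1.merge(df2, how='inner', on='Player', suffixes=('_df1', '_df2'))

print(len(default_key))
print(player_key)
```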

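Merge Dataframes with Cartesian product of both dataframes

To merge dataframes, we will use the merge() method. Setting the how parameter to cross creates the Cartesian product of the two frames, pairing every row of the left frame with every row of the right frame. Overlapping column names receive the default suffixes _x and _y: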

Example

import pandas as pd

# Create dictionaries
dct1 = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
dct2 = {'Player': ['Steve', 'Kane'], 'Age': [31, 27]}

# Create DataFrames from the dictionaries using pd.DataFrame()
df1 = pd.DataFrame(dct1)
df2 = pd.DataFrame(dct2)

print("DataFrame1 = \n", df1)
print("\nDataFrame2 = \n", df2)

# Combine the DataFrames using the merge() method
# The how parameter is set to cross, i.e. the Cartesian product
res = df1.merge(df2, how='cross')
print("\nCombined DataFrames = \n", res)

Output

DataFrame1 = 
  Player  Age
0  Steve   29
1  David   25

DataFrame2 = 
  Player  Age
0  Steve   31
1   Kane   27

Combined DataFrames = 
  Player_x  Age_x Player_y  Age_y
0    Steve     29    Steve     31
1    Steve     29     Kane     27
2    David     25    Steve     31
3    David     25     Kane     27
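A cross merge always produces as many rows as the product of the two row counts. The short sketch below checks this for the same data and replaces the default suffixes with `_1` and `_2`, which are illustrative names only.

```python
import pandas as pd

df1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
df2 = pd.DataFrame({'Player': ['Steve', 'Kane'], 'Age': [31, 27]})

# Every row of df1 is paired with every row of df2
res = df1.merge(df2, how='cross', suffixes=('_1', '_2'))

# Row count equals len(df1) * len(df2)
expected_rows = len(df1) * len(df2)
print(len(res) == expected_rows)
print(res.columns.tolist())
```

Because the row count grows multiplicatively, a cross merge on large frames can become very large very quickly.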
Updated on: 2022-09-15T13:43:21+05:30
