After playing some aggregation and grouping in the last part, now we will play harder with table joins.
The Playground Database
We will be using the same SQLite database, but now we are going to use some tables. So get all the required csv files here
Preparing the DataFrame
import pandas as pd albums_df = pd.read_csv("albums.csv") artists_df = pd.read_csv("artists.csv")
Join Queries into Pandas DataFrame
INNER JOIN:
SQL
:
SELECT * FROM albums JOIN artists ON albums.ArtistId = artists.ArtistId
or
SELECT * FROM albums INNER JOIN artists ON albums.ArtistId = artists.ArtistId
Pandas
:
# For the exact same column name on both table albums_df.merge(artists_df, on='ArtistId') # Defining the join column of each tables albums_df.merge(artists_df, left_on='ArtistId', right_on='ArtistId') # To make sure we use the INNER one albums_df.merge(artists_df, left_on='ArtistId', right_on='ArtistId', how='inner')
LEFT JOIN
SQL
:
SELECT * FROM albums LEFT JOIN artists ON albums.ArtistId = artists.ArtistId
Pandas
:
albums_df.merge(artists_df, on='ArtistId', how='left')
RIGHT JOIN
SQL
:
SELECT * FROM albums RIGHT JOIN artists ON albums.ArtistId = artists.ArtistId
Pandas
:
albums_df.merge(artists_df, on='ArtistId', how='right')
Top comments (0)