Python & SQL Study Notes – Pandas | Visualization
| Database Queries
1■■ Data Handling Using Pandas – I
Summary:
Pandas is a Python library for data analysis and manipulation. It provides two main structures:
Series (1D) and DataFrame (2D). Pandas helps read, write, merge, and analyze tabular data
easily.
Key Points:
• Series – One-dimensional labeled array (like Excel column). Created using pd.Series().
• DataFrame – Two-dimensional data structure with rows & columns.
• Access data using loc[], iloc[], slicing, and indexing.
• Add/Delete columns using assignment, del, pop(), or drop().
• Boolean indexing filters data using conditions (e.g., df[df['marks']>50]).
• File handling: pd.read_csv() and df.to_csv() for reading/writing CSV files.
Important Questions:
– Define Pandas and its two data structures.
– Differentiate between Series and DataFrame.
– Explain loc() and iloc() with examples.
– How do you add and delete columns in a DataFrame?
– What is Boolean Indexing?
– Explain different join types in Pandas.
– How to import/export CSV files using Pandas?
Python Programs:
Create Series from list:
import pandas as pd
s = pd.Series([10,20,30,40])
print(s)
Create DataFrame from dictionary:
import pandas as pd
data = {'Name':['Ravi','Asha'],'Marks':[85,90]}
df = pd.DataFrame(data)
print(df)
Select rows using loc:
print(df.loc[0])
Add & Delete column:
df['Grade']=['A','A+']
del df['Grade']
Read and Write CSV:
df.to_csv('students.csv')
new_df = pd.read_csv('students.csv')
2■■ Data Visualization
Summary:
Data Visualization helps in understanding data using graphs and charts. Python uses Matplotlib's
pyplot module to create 2D visuals like Line, Bar, Pie, Histogram, Box, and Scatter charts.
Key Points:
• Line Graph – plt.plot(x, y, color, linewidth, linestyle).
• Bar Graph – plt.bar(x, y) or plt.barh(x, y) for horizontal.
• Pie Chart – plt.pie(values, labels, explode, autopct, shadow).
• Histogram – plt.hist(data, bins, rwidth, edgecolor) shows frequency.
• Box Plot – plt.boxplot(data) shows distribution with quartiles.
• Scatter Plot – plt.scatter(x, y, color, marker).
• Save Plot – plt.savefig('file.png').
Important Questions:
– What is data visualization and why is it important?
– Explain Matplotlib and Pyplot module.
– Differentiate Bar chart and Histogram.
– What is the use of explode, autopct, and shadow in pie charts?
– Define IQR in Box plot and its formula.
– Explain Scatter plot markers with examples.
Python Programs:
Line Graph:
import matplotlib.pyplot as plt
x=[1,2,3,4]
y=[2,4,6,8]
plt.plot(x,y,color='red',linewidth=2)
plt.show()
Bar Graph:
plt.bar(['A','B','C'],[10,20,15],color='blue')
plt.show()
Pie Chart:
plt.pie([20,30,50],labels=['Math','Sci','Eng'],autopct='%.1f%%',explode=[0,0.1,0],shadow=True)
plt.show()
Histogram:
data=[10,20,30,20,10,40]
plt.hist(data,bins=4,rwidth=0.6,edgecolor='black')
plt.show()
Box Plot:
plt.boxplot([10,20,30,40,50],patch_artist=True,notch=True)
plt.show()
Scatter Plot:
x=[1,2,3,4]
y=[5,4,6,7]
plt.scatter(x,y,marker='o',color='green')
plt.show()
3■■ Database Query Using SQL
Summary:
SQL (Structured Query Language) manages data in relational databases. It allows sorting, filtering,
grouping, and performing calculations on data using aggregate functions.
Key Points:
• ORDER BY – Sorts data ascending or descending.
• Aggregate Functions – SUM(), AVG(), COUNT(), MAX(), MIN().
• GROUP BY – Divides data into logical groups for calculations.
• HAVING – Filters grouped data (used after GROUP BY).
• String Functions – LOWER(), UPPER(), TRIM(), CONCAT(), LENGTH().
• Math Functions – ROUND(), POWER(), SQRT(), MOD().
• Date & Time Functions – CURDATE(), NOW(), DAYNAME(), MONTH(), YEAR().
Important Questions:
– What is ORDER BY clause? Give example.
– Explain five aggregate functions in SQL.
– Difference between COUNT(*) and COUNT(column).
– What is GROUP BY? Write its syntax.
– Explain HAVING clause with example.
– List 5 string, 5 mathematical, and 5 date/time functions.
SQL Query Examples:
Sorting Records:
SELECT * FROM emp ORDER BY salary DESC;
Aggregate Functions:
SELECT SUM(salary), AVG(salary), MAX(salary), MIN(salary) FROM emp;
GROUP BY with HAVING:
SELECT dept, COUNT(*) FROM emp GROUP BY dept HAVING COUNT(*)>2;
String Function:
SELECT CONCAT(name, ' works in ', dept) FROM emp;
Date Function:
SELECT name, CURDATE(), DAYNAME(CURDATE()) FROM emp;