DEV Community

Abhishek Patange

NumPy Basics for Data Handling in Python

When starting your journey in Data Science or Machine Learning, one of the first libraries you’ll encounter is NumPy.

Why? Because pandas and scikit-learn are built on top of NumPy, while TensorFlow and PyTorch use tensor APIs that closely mirror NumPy arrays and convert to and from them. If you understand NumPy arrays, you’ll have an easier time working with any data library in Python.

In this post, we’ll cover the essentials: creating arrays, indexing, reshaping, performing operations, and using useful functions.


Creating NumPy Arrays

A NumPy array, formally known as an ndarray, is the fundamental data structure provided by the NumPy (Numerical Python) library. It is an N-dimensional array whose elements all share a single data type, which is what makes it efficient for numerical and scientific computing in Python.

    import numpy as np

    # From a Python list
    arr = np.array([1, 2, 3, 4, 5])
    print(arr)

    # 2D array
    mat = np.array([[1, 2, 3], [4, 5, 6]])
    print(mat)

    # Special arrays
    zeros = np.zeros((3, 3))      # 3x3 matrix of zeros
    ones = np.ones((2, 4))        # 2x4 matrix of ones
    rand = np.random.rand(2, 3)   # 2x3 matrix of random numbers in [0, 1)

    print(zeros)
    print(ones)
    print(rand)
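
Once an array exists, it helps to check what you actually created. The sketch below reads a few standard ndarray attributes (shape, ndim, size, dtype). The variable name ints is only for illustration, and the integer dtype of mat depends on your platform, so no exact value is claimed for it.

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6]])

print(mat.shape)  # (rows, columns) -> (2, 3)
print(mat.ndim)   # number of dimensions -> 2
print(mat.size)   # total number of elements -> 6
print(mat.dtype)  # a platform-dependent integer type

# np.zeros and np.ones produce floating-point arrays by default
zeros = np.zeros((3, 3))
print(zeros.dtype)  # float64

# The element type can be requested explicitly at creation time
ints = np.zeros((3, 3), dtype=np.int32)
print(ints.dtype)  # int32
```

Checking shape and dtype early tends to catch most "why did this operation fail" surprises later on.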

Indexing and Slicing

NumPy provides powerful mechanisms for accessing and manipulating elements within arrays through indexing and slicing.

Matrix representation, e.g. mat = [[1, 2, 3], [4, 5, 6]]:

              col 0  col 1  col 2
    row 0 ->    1      2      3
    row 1 ->    4      5      6

    arr = np.array([10, 20, 30, 40, 50])
    print(arr[0])     # First element
    print(arr[-1])    # Last element
    print(arr[1:4])   # Slice [20, 30, 40]

    # 2D indexing
    mat = np.array([[1, 2, 3], [4, 5, 6]])
    print(mat[0, 1])  # Row 0, Col 1 → 2
    print(mat[:, 2])  # All rows, Col 2 → [3, 6]
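
A few more indexing patterns come up often in data handling. The sketch below shows row and column slices, a step slice, a boolean mask, and the fact that a basic slice is a view onto the original data rather than a copy. The names mask, selected, view, and independent are illustrative only.

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6]])
print(mat[1])        # whole row 1 -> [4 5 6]
print(mat[:, 0:2])   # all rows, columns 0 and 1 -> [[1 2] [4 5]]

arr = np.array([10, 20, 30, 40, 50])
print(arr[::2])      # every second element -> [10 30 50]

# Boolean mask: keep only the elements that satisfy a condition
mask = arr > 25
selected = arr[mask]
print(selected)      # [30 40 50]

# A basic slice is a view: writing through it changes the original array
view = arr[1:4]
view[0] = 99
print(arr)           # [10 99 30 40 50]

# Call .copy() when you need independent data
independent = arr[1:4].copy()
independent[0] = -1
print(arr)           # still [10 99 30 40 50]
```

If a slice is going to be modified and the original must stay intact, copying first is the safer default.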

Reshaping & Flattening

In NumPy, reshaping and flattening are fundamental operations used to manipulate the structure of arrays.
Reshaping
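Reshaping changes an array's dimensions while keeping the total number of elements the same, so the sizes of the new shape must multiply to the original element count. Use the reshape() method for this.
Flattening
Flattening collapses an array of any shape into one dimension. The flatten() method returns a new 1D copy of the data.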

    arr = np.arange(1, 13)         # Numbers 1–12
    print(arr)                     # -> [1,2,3,4,5,6,7,8,9,10,11,12]

    reshaped = arr.reshape(3, 4)   # 3 rows, 4 cols
    print(reshaped)                # -> [[1,2,3,4],[5,6,7,8],[9,10,11,12]]

    flat = reshaped.flatten()      # Back to 1D
    print(flat)                    # -> [1,2,3,4,5,6,7,8,9,10,11,12]
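
Two details of reshaping are worth seeing in code: one dimension can be given as -1 so NumPy infers it, and a shape that does not match the element count raises a ValueError. This is a small sketch with illustrative names (auto, cube); it also shows that flatten() returns a copy.

```python
import numpy as np

arr = np.arange(1, 13)        # 12 elements

# -1 asks NumPy to infer that dimension: 12 / 3 = 4 columns
auto = arr.reshape(3, -1)
print(auto.shape)             # (3, 4)

# More than two dimensions works the same way: 2 * 3 * 2 = 12
cube = arr.reshape(2, 3, 2)
print(cube.shape)             # (2, 3, 2)

# 5 * 3 = 15 does not match 12 elements, so reshape raises ValueError
try:
    arr.reshape(5, 3)
except ValueError as err:
    print("reshape failed:", err)

# flatten() returns a copy, so editing it leaves the source untouched
flat = auto.flatten()
flat[0] = 100
print(auto[0, 0])             # 1
```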

Mathematical Operations

NumPy applies arithmetic element-wise across whole arrays (vectorization), so you don't need to write explicit Python loops.

    a = np.array([1, 2, 3, 4])
    b = np.array([10, 20, 30, 40])

    print(a + b)      # Element-wise addition
    print(a * b)      # Element-wise multiplication
    print(a ** 2)     # Square each element

    # Statistics
    print(a.mean())   # Average
    print(b.max())    # Maximum
    print(b.min())    # Minimum
    print(np.std(b))  # Standard deviation
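
To make "no loops" concrete, the sketch below compares a hand-written loop with the vectorized expression, then computes statistics along an axis of a 2D array. The scores array is made-up data used only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Loop version and vectorized version give the same values
loop_sum = [int(x + y) for x, y in zip(a, b)]
vector_sum = a + b
print(loop_sum == vector_sum.tolist())   # True

# A scalar is applied to every element (broadcasting)
scaled = a * 10
print(np.array_equal(scaled, b))         # True

# Hypothetical scores: 2 students (rows) x 3 exams (columns)
scores = np.array([[80, 90, 100],
                   [60, 70, 80]])
overall = scores.mean()            # mean of all six values -> 80.0
per_exam = scores.mean(axis=0)     # down the rows -> [70. 80. 90.]
per_student = scores.mean(axis=1)  # across the columns -> [90. 70.]
print(overall, per_exam, per_student)

# np.std divides by N by default; ddof=1 gives the sample estimate (N - 1)
population_std = np.std(b)
sample_std = np.std(b, ddof=1)
print(population_std < sample_std)       # True
```

The axis argument is the part that usually needs a second look: axis=0 collapses the rows, axis=1 collapses the columns.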

Useful Functions

    # Range of values
    arr = np.arange(0, 10, 2)
    print(arr)   # [0 2 4 6 8]

    # Linearly spaced values
    lin = np.linspace(0, 1, 5)
    print(lin)   # [0. 0.25 0.5 0.75 1. ]

    # Identity matrix
    I = np.eye(3)
    print(I)

    # Random values
    rand = np.random.randn(3, 3)   # Standard normal distribution
    print(rand)
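
Two practical notes on these helpers, sketched below: arange excludes its stop value while linspace includes it by default, and random output can be made reproducible with a seeded generator. The seed value 42 and the names rng, sample, and repeat are arbitrary choices for this example.

```python
import numpy as np

# arange stops before the end value; linspace includes it by default
stepped = np.arange(0, 1, 0.25)   # 0, 0.25, 0.5, 0.75
spaced = np.linspace(0, 1, 5)     # 0, 0.25, 0.5, 0.75, 1
print(len(stepped), len(spaced))  # 4 5

# Two generators created with the same seed produce the same numbers
rng = np.random.default_rng(seed=42)
sample = rng.standard_normal((3, 3))
repeat = np.random.default_rng(seed=42).standard_normal((3, 3))
print(np.array_equal(sample, repeat))   # True

# Multiplying by the identity matrix leaves a matrix unchanged
I = np.eye(3)
m = np.arange(9).reshape(3, 3)
print(np.array_equal(I @ m, m))         # True
```

Seeding is mostly useful for tests and tutorials, where you want the same "random" numbers on every run.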

🧠 Why NumPy Matters in ML/AI

  1. Data Wrangling: Fast math operations on huge datasets.
  2. Matrix Algebra: Used in linear regression, neural networks, and deep learning.
  3. Interoperability: Works seamlessly with pandas, scikit-learn, TensorFlow, and PyTorch.
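
As a small illustration of the matrix algebra point, here is a sketch of fitting a straight line with least squares. The data is a made-up toy set generated from y = 2x + 1, and np.linalg.lstsq is just one of several ways to do this in NumPy.

```python
import numpy as np

# Hypothetical toy data that follows y = 2x + 1 exactly
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix: one column for x, one column of ones for the intercept
X = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem X @ coef ≈ y
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
slope, intercept = coef
print(slope, intercept)   # approximately 2.0 and 1.0

# Predictions are a single matrix-vector product
predictions = X @ coef
print(np.allclose(predictions, y))   # True
```

Libraries such as scikit-learn wrap this kind of computation behind a friendlier interface, but the underlying operations are array and matrix math like the above.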
