This repository contains a Jupyter Notebook for a computer vision project, part of the "Introduction to Machine Learning" course. The project explores two key concepts:
- CNN from Scratch: Implementing a basic convolutional layer (forward and backward pass) using NumPy.
- Neural Style Transfer: Using a pre-trained VGG-19 model with PyTorch to transfer the style of one image onto another.
- Course: Introduction to Machine Learning
- Instructor: Dr. Fatemeh Mirsalehi
- TA: Amirhossein Razlighi
## Setup

This project requires Python and several data science libraries. A virtual environment is recommended.
1. Clone the repository:

   ```bash
   git clone https://github.com/YOUR_USERNAME/YOUR_REPOSITORY.git
   cd YOUR_REPOSITORY
   ```
2. Create and activate a virtual environment (optional but recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate   # On Windows: venv\Scripts\activate
   ```
3. Install the required packages. The notebook uses the following, all installable via pip:

   ```bash
   pip install torch torchvision numpy matplotlib Pillow requests
   ```
   Note: The notebook also uses `torchsummary` in one cell for model visualization. You may need to install it separately if you wish to run that specific cell:

   ```bash
   pip install torchsummary
   ```
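If you do install it, `torchsummary` is typically invoked as below. This is a minimal sketch; the input size and the choice of the VGG-19 feature extractor are assumptions, not necessarily the notebook's exact cell:

```python
import torchvision.models as models
from torchsummary import summary

# Load the pre-trained VGG-19 feature extractor (also used for style transfer).
vgg = models.vgg19(pretrained=True).features

# Layer-by-layer summary; (3, 224, 224) is an assumed input size.
# device="cpu" keeps the sketch runnable without a GPU.
summary(vgg, input_size=(3, 224, 224), device="cpu")
```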
## Usage

The entire project is contained in the `mian.ipynb` Jupyter Notebook.
- Ensure your environment is activated and packages are installed.
- Start Jupyter Lab or Jupyter Notebook:

  ```bash
  jupyter lab
  ```
- Open `mian.ipynb` and run the cells sequentially to see the implementation and results.
## Notebook Overview

The notebook is divided into two main sections.
### Part 1: CNN from Scratch

This section focuses on understanding the fundamental mechanics of a convolutional layer.
- `MyConv`: A custom class built with NumPy that implements the `forward` and `backward` propagation for a 2D convolution (a sketch of the forward pass appears after this list).
- Verification: The implementation is tested against correct outputs and verified using numerical gradient checking (a `rel_error` function).
- Visualization: The `MyConv` class is used to apply standard image-processing filters (grayscale, Sobel X/Y edge detection, Gaussian blur) to sample images, demonstrating the effect of different convolution kernels.
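For orientation, here is a minimal sketch of what such a NumPy forward pass and `rel_error` helper typically look like. The function names, the layout of shapes, and the omission of a bias term are assumptions; the notebook's `MyConv` may differ:

```python
import numpy as np

def conv2d_forward(x, w, stride=1, pad=0):
    """Naive forward pass of a 2D convolution.

    x: inputs of shape (N, C, H, W); w: filters of shape (F, C, HH, WW).
    Returns outputs of shape (N, F, H_out, W_out).
    """
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    x_pad = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    H_out = (H + 2 * pad - HH) // stride + 1
    W_out = (W + 2 * pad - WW) // stride + 1
    out = np.zeros((N, F, H_out, W_out))
    for n in range(N):                      # each image
        for f in range(F):                  # each filter
            for i in range(H_out):          # each output row
                for j in range(W_out):      # each output column
                    hs, ws = i * stride, j * stride
                    window = x_pad[n, :, hs:hs + HH, ws:ws + WW]
                    out[n, f, i, j] = np.sum(window * w[f])
    return out

def rel_error(x, y):
    """Maximum relative error, as used in numerical gradient checking."""
    return np.max(np.abs(x - y) / np.maximum(1e-8, np.abs(x) + np.abs(y)))
```

The gradient check then compares the analytic gradients from `backward` against finite-difference approximations of these outputs; a relative error on the order of 1e-7 or smaller is usually treated as a pass.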
### Part 2: Neural Style Transfer

This section implements the Neural Style Transfer technique, as described by Gatys et al.
- Model: It uses the feature-extraction layers of a pre-trained VGG-19 model (from `torchvision`) to capture image content and style.
- Content Loss: A `ContentLoss` module is defined, which computes the Mean Squared Error (MSE) between the feature maps of the content image and the generated image.
- Style Loss: A `StyleLoss` module is defined to compute the MSE between the Gram matrices of the style image's feature maps and the generated image's feature maps. This captures the correlations between features, representing texture and style (a sketch of both loss modules appears after this list).
- Optimization: The `LBFGS` optimizer is used to iteratively update a target image (initialized as a clone of the content image). The optimizer minimizes a weighted sum of the total content loss and total style loss, gradually blending the content of one image with the style of another.
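Below is a minimal sketch of how such loss modules are commonly written in PyTorch, following the standard style-transfer pattern; the class internals and the Gram-matrix normalization here are assumptions, and the notebook's definitions may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: (N, C, H, W) feature maps from a VGG layer.
    n, c, h, w = feat.size()
    flat = feat.view(n * c, h * w)       # flatten each channel
    G = flat @ flat.t()                  # channel-to-channel correlations
    return G / (n * c * h * w)           # normalize by number of elements

class ContentLoss(nn.Module):
    """Stores target content features; records MSE to them on each forward."""
    def __init__(self, target):
        super().__init__()
        self.target = target.detach()    # fixed target, no gradients
        self.loss = torch.tensor(0.0)

    def forward(self, x):
        self.loss = F.mse_loss(x, self.target)
        return x                         # pass input through unchanged

class StyleLoss(nn.Module):
    """Stores the target Gram matrix; records MSE to it on each forward."""
    def __init__(self, target_feat):
        super().__init__()
        self.target = gram_matrix(target_feat).detach()
        self.loss = torch.tensor(0.0)

    def forward(self, x):
        self.loss = F.mse_loss(gram_matrix(x), self.target)
        return x
```

Writing the losses as pass-through modules lets them be spliced directly into the VGG feature stack, so a single forward pass records every loss.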
The primary output of the project is the generated image from the Neural Style Transfer process. The notebook also plots the loss curves over the optimization iterations.
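The loop that produces this output typically drives `LBFGS` through a closure that re-runs the network, sums the recorded losses, and logs them for the loss curves. Here is a sketch under assumed names: `model`, `content_img`, `style_losses`, and `content_losses` would come from a network-assembly step not shown here, and the weights and iteration count are illustrative:

```python
import torch

# Assumed from earlier steps: `model` is VGG-19 with ContentLoss/StyleLoss
# modules inserted; `style_losses` / `content_losses` list those modules;
# `content_img` is the preprocessed content image tensor.
target = content_img.clone().requires_grad_(True)   # image being optimized
optimizer = torch.optim.LBFGS([target])

style_weight, content_weight = 1e6, 1.0             # illustrative weights
style_history, content_history = [], []

for step in range(20):   # illustrative; LBFGS runs several evaluations per step
    def closure():
        optimizer.zero_grad()
        model(target)                                # forward pass records losses
        s_loss = style_weight * sum(sl.loss for sl in style_losses)
        c_loss = content_weight * sum(cl.loss for cl in content_losses)
        total = s_loss + c_loss
        total.backward()
        style_history.append(s_loss.item())          # data for the loss curves
        content_history.append(c_loss.item())
        return total
    optimizer.step(closure)

# Clamp to a valid pixel range before converting back to an image.
with torch.no_grad():
    target.clamp_(0, 1)
```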
| Content Image | Style Image |
|---|---|
| ![]() | ![]() |
The final image is generated in Cell 73 after the optimization loop.

