A collection of ML notebooks covering regression, classification, and deep learning, including House Prices, Digit Recognizer, Titanic Survival, and TPlayground. Clean code, feature engineering, and advanced modeling techniques included.

s1mer-ddc/machine-learning

📚 Machine Learning Notebooks Collection

Python
License: MIT

A curated collection of machine learning notebooks exploring different datasets and techniques including regression, classification, deep learning, and exploratory analysis.


📝 Notebooks Included

1️⃣ House Prices: Advanced Regression

  • Full regression pipeline for the Kaggle House Prices dataset
  • Feature engineering: numerical combinations, binary indicators, ratios, age/time features
  • Categorical encoding and stacked base models (XGBoost, LightGBM, CatBoost, ExtraTrees)
  • Meta-modeling with Ridge regression
  • Generates the submission file submission_stacked_optimized.csv
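
The feature types and the stacking step listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the column names are borrowed from the Kaggle dataset purely as an assumption, and scikit-learn's GradientBoostingRegressor and ExtraTreesRegressor stand in for the XGBoost, LightGBM, and CatBoost base models so that the example runs with scikit-learn alone.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data; values are made up for illustration only.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "GrLivArea": rng.integers(600, 3000, n),
    "TotalBsmtSF": rng.integers(0, 1500, n),
    "GarageArea": rng.integers(0, 800, n),
    "LotArea": rng.integers(2000, 15000, n),
    "YearBuilt": rng.integers(1900, 2006, n),
    "YrSold": rng.integers(2006, 2011, n),
})
y = 50 * df["GrLivArea"] + 30 * df["TotalBsmtSF"] + rng.normal(0, 5000, n)


def engineer_features(frame):
    """Add one example of each feature type (hypothetical feature names)."""
    out = frame.copy()
    out["TotalSF"] = out["GrLivArea"] + out["TotalBsmtSF"]       # numerical combination
    out["HasGarage"] = (out["GarageArea"] > 0).astype(int)       # binary indicator
    out["LivingToLotRatio"] = out["GrLivArea"] / out["LotArea"]  # ratio
    out["HouseAge"] = out["YrSold"] - out["YearBuilt"]           # age feature
    return out


X = engineer_features(df)

# Base models feed out-of-fold predictions to a Ridge meta-model.
stack = StackingRegressor(
    estimators=[
        ("extra_trees", ExtraTreesRegressor(n_estimators=50, random_state=0)),
        ("grad_boost", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=3,
)

# scikit-learn reports negated RMSE so that higher is better; flip the sign back.
scores = -cross_val_score(stack, X, y, cv=3, scoring="neg_root_mean_squared_error")
print("RMSE per fold:", np.round(scores, 1))
```

Swapping in the gradient-boosting libraries named above should only require replacing the entries in `estimators`, since they expose scikit-learn-compatible regressors.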

2️⃣ Digit Recognizer

  • Handwritten digit classification using the MNIST dataset
  • Data preprocessing and normalization
  • Model training with CNNs or classical ML algorithms
  • Model evaluation with accuracy metrics and confusion matrices
  • Predicts handwritten digits and generates a Kaggle submission file
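
As a rough sketch of the normalize, train, and evaluate steps above, the example below uses scikit-learn's small bundled 8x8 digits dataset as a stand-in for MNIST (an assumption made so it runs offline) and a logistic regression as the classical-ML option. The notebook itself may use a CNN and the full 28x28 Kaggle data, where pixel values would be divided by 255 rather than 16.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Stand-in dataset: 8x8 digit images flattened to 64 features, pixel range 0..16.
digits = load_digits()
X = digits.data / 16.0  # scale pixel intensities to [0, 1]
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Classical baseline; max_iter is raised so the solver converges.
clf = LogisticRegression(max_iter=2000)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true digit, columns: predicted digit

print(f"Accuracy: {accuracy:.3f}")
print(cm)
```

Off-diagonal entries in the confusion matrix show which digit pairs the model mixes up, which is often more informative than the single accuracy figure.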

3️⃣ TPlayground

  • Interactive experimentation notebook for testing ML/DL techniques
  • Supports testing different models, feature engineering ideas, and visualizations
  • Useful for rapid prototyping and exploring model behaviors
  • Modular code to allow adding custom datasets

4️⃣ Titanic: Survival Prediction

  • Binary classification for Titanic passenger survival
  • Handles missing data, categorical encoding, and feature scaling
  • Implements multiple models: Logistic Regression, Random Forest, XGBoost
  • Model evaluation using accuracy, precision, recall, and ROC-AUC
  • Generates a Kaggle-ready submission file
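
The preprocessing and evaluation steps above could look roughly like the following. It is a sketch under stated assumptions: the data frame is randomly generated (only the column names echo the Titanic dataset), the survival rule is invented, and only logistic regression is shown, though a Random Forest or XGBoost classifier could be dropped into the same pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic passengers; values and the survival rule are made up.
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "Age": rng.normal(30, 12, n).clip(1, 80),
    "Fare": rng.exponential(30, n),
    "Sex": rng.choice(["male", "female"], n),
    "Embarked": rng.choice(["S", "C", "Q"], n),
})
logit = 1.5 * (df["Sex"] == "female") + 0.01 * df["Fare"] - 0.8
df["Survived"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Knock out some values to mimic missing data.
df.loc[rng.choice(n, 60, replace=False), "Age"] = np.nan
df.loc[rng.choice(n, 5, replace=False), "Embarked"] = np.nan

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", categorical, ["Sex", "Embarked"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = df.drop(columns="Survived")
y = df["Survived"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print({name: round(value, 3) for name, value in metrics.items()})
```

Keeping imputation, encoding, and scaling inside the pipeline means they are fitted on the training split only, which avoids leaking test-set statistics into the model.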

🚀 Features

  • Clean, modular, and well-commented notebooks
  • Combines feature engineering, data preprocessing, and model evaluation
  • Includes stacking and ensemble techniques
  • Easy to extend for other datasets or ML challenges
  • Useful both for learning and for Kaggle competitions

⚙️ Requirements

  • Python ≥ 3.8
  • Jupyter Notebook / Jupyter Lab
  • Key libraries:
    • numpy, pandas, matplotlib, seaborn
    • scikit-learn, xgboost, lightgbm, catboost
    • Optional: tensorflow, keras, torch (for deep learning models)
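
One way to check an environment against this list is a short standard-library script like the one below. It is only a convenience sketch: the `missing_modules` helper is hypothetical, and note that scikit-learn is imported under the module name `sklearn`.

```python
import sys
from importlib.util import find_spec

# Import names for the libraries listed above (scikit-learn imports as "sklearn").
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn",
            "sklearn", "xgboost", "lightgbm", "catboost"]
OPTIONAL = ["tensorflow", "keras", "torch"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


python_ok = sys.version_info >= (3, 8)
missing_required = missing_modules(REQUIRED)
missing_optional = missing_modules(OPTIONAL)

print("Python >= 3.8:", python_ok)
print("Missing required libraries:", missing_required or "none")
print("Missing optional libraries:", missing_optional or "none")
```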
