A curated collection of machine learning notebooks exploring different datasets and techniques, including regression, classification, deep learning, and exploratory analysis.
- Full regression pipeline for the Kaggle House Prices dataset
- Feature engineering: numerical combinations, binary indicators, ratios, age/time features
- Categorical encoding and a stacked ensemble of base models (XGBoost, LightGBM, CatBoost, ExtraTrees)
- Meta-modeling with Ridge regression (see the sketch after this list)
- Generates an optimized submission file, `submission_stacked_optimized.csv`
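A minimal sketch of the stacking idea, assuming Kaggle's `train.csv`/`test.csv` and only the numeric columns; the hyperparameters and the naive `fillna(0)` are placeholders rather than the notebook's tuned pipeline, and CatBoost is omitted for brevity:

```python
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor

train = pd.read_csv("train.csv")
X = train.select_dtypes("number").drop(columns=["SalePrice"]).fillna(0)
y = train["SalePrice"]

# Base learners are fit on cross-validation folds; Ridge then blends
# their out-of-fold predictions as the meta-model.
stack = StackingRegressor(
    estimators=[
        ("xgb", XGBRegressor(n_estimators=300, learning_rate=0.05)),
        ("lgbm", LGBMRegressor(n_estimators=300)),
        ("et", ExtraTreesRegressor(n_estimators=300)),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
stack.fit(X, y)

test = pd.read_csv("test.csv")
preds = stack.predict(test[X.columns].fillna(0))
pd.DataFrame({"Id": test["Id"], "SalePrice": preds}).to_csv(
    "submission_stacked_optimized.csv", index=False)
```

Because `cv=5` is passed, the Ridge meta-model only ever sees out-of-fold base predictions, which is what keeps stacking from simply overfitting to the base learners.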
- Handwritten digit classification on the MNIST dataset
- Data preprocessing and normalization
- Model training with CNNs or classical ML algorithms
- Model evaluation with accuracy metrics and confusion matrices
- Predicts handwritten digits and generates a Kaggle submission file (a baseline sketch follows this list)
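A compact Keras CNN baseline, using the copy of MNIST bundled with `tensorflow.keras.datasets` (the notebook reads Kaggle's CSVs instead); the architecture and epoch count are illustrative:

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0  # scale to [0, 1], add channel axis
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, validation_split=0.1)
print(model.evaluate(x_test, y_test))  # [loss, accuracy]
```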
- Interactive experimentation notebook for testing ML/DL techniques
- Supports testing different models, feature engineering ideas, and visualizations
- Useful for rapid prototyping and exploring model behaviors
- Modular code that allows plugging in custom datasets (one possible pattern is sketched below)
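One hypothetical way to get that modularity is a small loader registry, so a new dataset plugs in without touching the experiment code; `register` and `run_experiment` are illustrative names, not the notebook's actual API:

```python
import pandas as pd

LOADERS = {}

def register(name):
    """Decorator that files a loader function under a dataset name."""
    def wrap(fn):
        LOADERS[name] = fn
        return fn
    return wrap

@register("titanic")
def load_titanic():
    df = pd.read_csv("titanic/train.csv")  # placeholder path
    X = df.select_dtypes("number").drop(columns=["Survived"]).fillna(0)
    return X, df["Survived"]

def run_experiment(dataset_name, model):
    """Fit any scikit-learn-style model on a registered dataset."""
    X, y = LOADERS[dataset_name]()
    return model.fit(X, y)
```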
- Binary classification for Titanic passenger survival
- Handles missing data, categorical encoding, and feature scaling
- Implements multiple models: Logistic Regression, Random Forest, XGBoost
- Model evaluation using accuracy, precision, recall, and ROC-AUC
- Generates a Kaggle-ready submission file (a pipeline sketch follows this list)
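A condensed sketch of that workflow with scikit-learn pipelines; the feature subset and imputation strategies are illustrative choices, not necessarily the notebook's exact ones:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

train = pd.read_csv("train.csv")
X = train[["Pclass", "Sex", "Age", "Fare", "Embarked"]]
y = train["Survived"]

# Impute + scale numerics; impute + one-hot encode categoricals.
prep = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), ["Age", "Fare"]),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]),
     ["Pclass", "Sex", "Embarked"]),
])

for name, clf in [("logreg", LogisticRegression(max_iter=1000)),
                  ("rf", RandomForestClassifier(n_estimators=200))]:
    pipe = Pipeline([("prep", prep), ("model", clf)])
    auc = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: ROC-AUC = {auc:.3f}")
```

Keeping imputation and encoding inside the `Pipeline` means they are re-fit on each fold, so the cross-validated ROC-AUC isn't inflated by preprocessing leakage.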
- Clean, modular, and well-commented notebooks
- Combines feature engineering, data preprocessing, and model evaluation
- Includes stacking and ensemble techniques
- Easy to extend for other datasets or ML challenges
- Educational and practical for Kaggle competitions
- Python ≥ 3.8
- Jupyter Notebook / Jupyter Lab
- Key libraries: `numpy`, `pandas`, `matplotlib`, `seaborn`, `scikit-learn`, `xgboost`, `lightgbm`, `catboost`
- Optional: `tensorflow`, `keras`, `torch` (for deep learning models)
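Everything above installs with pip; no versions are pinned here, so recent releases of each should work:

```bash
pip install numpy pandas matplotlib seaborn scikit-learn xgboost lightgbm catboost
pip install tensorflow torch  # optional, for the deep learning notebooks (keras ships with tensorflow)
```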