callmeoutsider/amazon-sentiment-classifier

title: Amazon Sentiment Classifier
emoji: 🧠
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: false

🧠 Amazon Sentiment Classifier

A sentiment classification project trained on 3.6 million Amazon product reviews.
It combines classic text preprocessing, TF-IDF features, and a Logistic Regression classifier to show that a simple NLP pipeline scales to real-world data.


📦 Dataset

  • Source: Amazon Reviews from Kaggle (train.ft.txt)
  • Size: 3,600,000 reviews
  • Labels:
    • __label__1 = Negative
    • __label__2 = Positive
  • Format: FastText-style plain text

📥 Download dataset from Kaggle

⚠️ Due to file size limits, amazon_train.txt and amazon_test.txt are not included in this repository.
To re-train the model, download them separately and place them in the project root directory.
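
As a rough illustration of the FastText-style format described above, the sketch below splits each line into its label and review text. The function name `parse_fasttext_lines` and the sample reviews are hypothetical and not taken from the project notebook; only the `__label__1` / `__label__2` mapping comes from the dataset description.

```python
from typing import Iterable, Iterator, Tuple

# Label mapping taken from the dataset description above.
LABELS = {"__label__1": "negative", "__label__2": "positive"}


def parse_fasttext_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sentiment, text) pairs from FastText-style lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The label is the first space-delimited token; the rest is the review.
        label, _, text = line.partition(" ")
        if label not in LABELS:
            continue  # skip malformed lines rather than guessing a label
        yield LABELS[label], text


# Made-up sample lines in the same shape as the dataset.
sample = [
    "__label__2 Great product: works exactly as described.",
    "__label__1 Broke after two days: very disappointed.",
]
rows = list(parse_fasttext_lines(sample))
```

For the real data, the same function can be fed an open file handle (for example one opened on amazon_train.txt with UTF-8 encoding), and the resulting pairs can be loaded into a Pandas DataFrame if that suits the rest of the workflow.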


🧰 Tools & Stack

  • Python 3.12
  • Jupyter Notebook (VS Code)
  • Pandas, Scikit-learn, Matplotlib, Seaborn

🧪 Workflow Overview

| Step | Description |
|------|-------------|
| 1️⃣ | Load FastText-style dataset (.txt) |
| 2️⃣ | Clean and normalize text |
| 3️⃣ | Vectorize text using TF-IDF |
| 4️⃣ | Train a Logistic Regression classifier |
| 5️⃣ | Evaluate using classification report and confusion matrix |
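
The cleaning, vectorizing, and training steps above can be sketched with Scikit-learn roughly as follows. This is a minimal sketch on toy data, not the notebook's actual code: the `clean_text` rules, the toy reviews, and `max_iter=1000` are assumptions, while `max_features=5000` mirrors the "Top 5000 TF-IDF terms" listed under Results.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def clean_text(text: str) -> str:
    """Hypothetical cleaning step: lowercase, drop non-letters, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Toy reviews standing in for the real dataset (1 = negative, 2 = positive).
texts = [
    "Great product, works perfectly!",
    "Terrible quality, broke in 2 days.",
    "Excellent value and great quality",
    "Terrible, awful, do not buy",
    "I love it, works great",
    "Awful product, waste of money",
]
labels = [2, 1, 2, 1, 2, 1]

# TF-IDF features (capped at 5000 terms) feeding a Logistic Regression classifier.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_text, max_features=5000),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

prediction = model.predict(["Works great, excellent quality"])
```

Wrapping both stages in one pipeline keeps the vectorizer and classifier fitted on the same training split, which avoids leaking test vocabulary into the features. The project itself saves the two pieces as separate files, so the notebook may well fit them separately.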

📈 Results

  • Accuracy: ~90%
  • F1-Score: 0.90
  • Model: Logistic Regression
  • Features: Top 5000 TF-IDF terms
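
To make the relationship between these metrics and a confusion matrix concrete, here is a small pure-Python sketch that derives accuracy and F1 from the four confusion-matrix counts. The labels below are made up for illustration and are not the project's predictions; treating label 2 (positive) as the positive class is also an assumption, since the F1 averaging used for the reported 0.90 is not stated.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=2):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    counts = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = counts[(True, True)]    # positive reviews predicted positive
    fp = counts[(False, True)]   # negative reviews predicted positive
    fn = counts[(True, False)]   # positive reviews predicted negative
    tn = counts[(False, False)]  # negative reviews predicted negative
    return tp, fp, fn, tn


def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and the positive-class F1 score from the counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Made-up labels for illustration only (1 = negative, 2 = positive).
y_true = [2, 2, 2, 2, 1, 1, 1, 1, 2, 1]
y_pred = [2, 2, 2, 1, 1, 1, 1, 2, 2, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy, f1 = accuracy_and_f1(tp, fp, fn, tn)
```

In practice Scikit-learn's `classification_report` and `confusion_matrix` produce the same quantities directly, along with per-class and averaged scores.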

📊 Confusion Matrix


🚀 How to Run

  1. Clone this repository:

    git clone https://github.com/outeast98/amazon-sentiment-classifier.git
    cd amazon-sentiment-classifier
  2. Install dependencies:

    pip install -r requirements.txt
  3. Open the Jupyter notebook:

    sentiment_analysis.ipynb

    and run each cell step by step.


🌐 Live Interface (Gradio)

This project includes a simple web interface built with Gradio, allowing users to test sentiment classification in real time.

🖥 How to launch the app locally:

  1. Make sure you have the model and vectorizer saved as:

    • logistic_model.pkl
    • tfidf_vectorizer.pkl
  2. Install Gradio:

    pip install gradio
  3. Run the app:

    python app.py

The app will launch in your browser, where you can enter any product review and get an instant sentiment prediction:

  • 😊 Positive
  • 😠 Negative
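
For readers who want a sense of what such an app might look like, below is a minimal sketch and not the repository's actual `app.py`. It assumes the two `.pkl` files were written with Python's `pickle` module and that the model predicts the labels 1 and 2 from the dataset; the helper names `make_predictor`, `load_pickle`, and `main` are hypothetical.

```python
import pickle

# Assumed label encoding, following the dataset description (1 = negative, 2 = positive).
LABEL_TEXT = {1: "😠 Negative", 2: "😊 Positive"}


def make_predictor(model, vectorizer):
    """Build a review -> sentiment function from a fitted model and vectorizer."""

    def predict(review: str) -> str:
        if not review or not review.strip():
            return "Please enter a review."
        features = vectorizer.transform([review])
        label = model.predict(features)[0]
        # Fall back to the raw label if the model uses a different encoding.
        return LABEL_TEXT.get(label, str(label))

    return predict


def load_pickle(path: str):
    """Load a pickled object; assumes the file was saved with `pickle.dump`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def main() -> None:
    import gradio as gr  # imported here so the helpers above work without Gradio

    predict = make_predictor(
        load_pickle("logistic_model.pkl"),
        load_pickle("tfidf_vectorizer.pkl"),
    )
    demo = gr.Interface(
        fn=predict,
        inputs=gr.Textbox(lines=4, label="Product review"),
        outputs=gr.Textbox(label="Sentiment"),
        title="Amazon Sentiment Classifier",
    )
    demo.launch()
```

Calling `main()` at the end of the script starts the interface. Keeping the prediction logic in `make_predictor` means it can be exercised with any objects that expose `transform` and `predict`, without loading the saved files or starting Gradio.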


👨‍💻 Author

Yevhenii Aloshyn
Machine Learning & Cybersecurity Enthusiast
📍 Toronto, Canada
GitHub
