Multi-Class Sentiment Classification using Machine Learning and Deep Learning Techniques
Saurav Singla, Vikash Kumar
Introduction
• Twitter is a real-time feedback platform where users post opinions on products, issues, and other topics through tweets and comments.
• These tweets consist of text mixed with various emoticons and meaningless words.
• Extensive preprocessing and a vectorization method are therefore applied to them.
• Multi-class sentiment classification is then performed to identify a range of sentiments and give a broader view of user opinion.
Problem
• Determining the sentiment of text is itself a challenging task: the volume of feedback is large, and its sentiment cannot be found manually.
• Positive, Neutral, and Negative labels do not provide enough information about a subject (products, text, data, etc.).
• In this case, the tweet data carries a larger set of sentiments (Sadness, Boredom, Neutral, Worry, Surprise, Love, Fun, Hate, Happiness, Anger, Relief), which makes the task more challenging.
• Solving this multi-class problem reveals the actual sentiment expressed in each tweet, and therefore users' genuine opinion of the product.
Business Needs
• As companies bring more and more new products to market, they need real-time feedback on those products.
• This feedback amounts to a large volume of data, and finding its sentiment manually is a tough task.
• With Natural Language Processing (NLP), we process the opinions and find the sentiment of the data.
• Based on these sentiments, various business-centric decisions can be made, such as changing the advertising program, geo-based marketing, and scaling up the supply chain.
Approach
• We apply Random Forest along with several deep learning models, namely LSTM, Bi-LSTM, GRU, and BERT, to classify the text data into the various sentiments.
• After some essential text preprocessing steps, the text is converted to vector form using vectorization techniques such as TF-IDF, word2vec, and GloVe, and these vectors are fed into the machine learning and deep learning models.
• Text preprocessing steps include:
  • Lowercasing
  • Removing punctuation, URLs, and handles
  • Removing stop words
  • Stemming
  • Tokenizing sentences
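The preprocessing steps listed above can be sketched in a few lines of standard-library Python. This is only an illustration, not the authors' implementation: the stop-word list is a tiny hypothetical sample, `crude_stem` is a simple suffix stripper standing in for a real stemmer such as Porter's, and the order of the steps (tokenizing before stop-word removal and stemming) is an assumption.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; a real run would use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "am", "to", "of", "and",
              "in", "on", "for", "it", "this", "that", "i", "my", "so"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")        # web links
HANDLE_RE = re.compile(r"@\w+")                      # Twitter handles such as @user
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Return the cleaned, stemmed tokens of one tweet."""
    text = tweet.lower()                      # lowercase
    text = URL_RE.sub(" ", text)              # remove URLs
    text = HANDLE_RE.sub(" ", text)           # remove handles
    text = text.translate(PUNCT_TABLE)        # remove punctuation
    tokens = text.split()                     # tokenize on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # remove stop words
    return [crude_stem(t) for t in tokens]    # stem


print(preprocess("@user I am LOVING this phone!!! http://t.co/abc"))
```

URLs and handles are removed before punctuation so that the `@` and `://` markers are still present when the patterns run.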
Approach Cont.
Fig. 1 System Diagram of Proposed Methodology
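Of the vectorization techniques named in the approach, TF-IDF is the simplest to illustrate. The sketch below assumes the tweets are already tokenized and uses the plain textbook weighting, term frequency times log(N / document frequency); library implementations typically use smoothed variants, and nothing here is taken from the authors' code. The function name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into TF-IDF vectors over a shared vocabulary.

    docs: list of token lists. Returns (vocab, vectors), where vectors[i][j]
    is the weight of vocab[j] in docs[i].
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Unsmoothed inverse document frequency (textbook form).
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


vocab, vectors = tfidf_vectors([["love", "phone"], ["hate", "phone"]])
print(vocab)
print(vectors)
```

A token that occurs in every document, such as "phone" in this example, gets weight zero, so the vectors emphasize the words that distinguish one tweet from another.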
Solution
• We adopt the last model, M5 (the BERT model), which gives the best sentiment predictions among all the approaches listed in the Approach section.
• This approach classifies the test data with higher accuracy than the other models, reaching 40% on the multi-class sentiment data.
Benefits
• The final outcomes of this work are as follows:
• We no longer have to rely on polarity alone, i.e. Positive, Neutral, and Negative sentiment.
• It provides information about the various types of sentiments/opinions related to the product.
• Decision making becomes easier because we have wider knowledge of users' opinions.
• Analysis time decreases.
Results
• An overall accuracy of 40% was recorded when testing our model, M5. The model gives good insight into users' opinions, which helps in making various types of decisions.
• These decisions help grow the business smoothly and in a user-centric way.
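For reference, overall accuracy is the fraction of test tweets whose predicted label matches the true label. The sketch below computes it together with a majority-class baseline, which is a common point of comparison for a multi-class score. The labels are made-up examples, not the authors' data, and the function names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Made-up labels for illustration only.
y_true = ["worry", "neutral", "love", "worry", "sadness"]
y_pred = ["worry", "worry", "love", "neutral", "sadness"]
print(accuracy(y_true, y_pred))
print(majority_baseline(y_true))
```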
