CHATBOT using NLP & Machine Learning
MINOR PROJECT REPORT
By
RAHUL GUPTA (RA2111047010065)
AMAN RAJ (RA2111047010105)
VATSAL TYAGI (RA2111047010118)
AKSHAR KANKAR (RA2111047010125)
Under the guidance of
Dr. Antony Sophia N
In partial fulfilment for the Course
of
18AIE339T - Matrix Theory of AI
in CINTEL
FACULTY OF ENGINEERING AND TECHNOLOGY
SCHOOL OF COMPUTING
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
KATTANKULATHUR
NOVEMBER 2023
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
(Under Section 3 of UGC Act, 1956)
BONAFIDE CERTIFICATE
Certified that this minor project report for the course 18AIE339T – Matrix Theory of AI
entitled "Chatbot using NLP & Machine Learning" is the bona fide work of Rahul Gupta
(RA2111047010065), Aman Raj (RA2111047010105), Vatsal Tyagi (RA2111047010118)
and Akshar Kankar (RA2111047010125), who carried out the work under my supervision.
SIGNATURE
Dr. Antony Sophia N
Assistant Professor
CINTEL
SRM Institute of Science and Technology, Kattankulathur
SIGNATURE
Dr. Annie Uthra
Professor and Head
CINTEL
SRM Institute of Science and Technology, Kattankulathur
ABSTRACT
Chatbots are computer programs that simulate human conversations,
often used in customer service, education, and entertainment. Natural
language processing (NLP) is a subfield of artificial intelligence that
focuses on developing algorithms and techniques to understand and
process human language. NLP is used in various applications,
including chatbots, machine translation, and speech recognition.
Chatbots powered by NLP can provide a personalized and engaging
experience, answering customer questions, providing product
recommendations, and teaching people new things. As NLP
technology continues to evolve, chatbots will become even more
powerful and versatile tools, able to understand and respond to human
language in a more natural way. In the future, chatbots powered by
NLP will play an increasingly important role in our lives, providing a
wider range of services, from customer service to education
and entertainment.
ACKNOWLEDGEMENT
We express our heartfelt thanks to our honorable Vice Chancellor Dr. C.
MUTHAMIZHCHELVAN, for being the beacon in all our endeavors. We would
like to express our warm gratitude to our Registrar Dr. S. Ponnusamy, for
his encouragement. We express our profound gratitude to our Dean (College of
Engineering and Technology) Dr. T.V. Gopal, for bringing out novelty in all
executions. We would like to express our heartfelt thanks to the Chairperson, School
of Computing, Dr. Revathi Venkataraman, for imparting the confidence to
complete our course project. We wish to express our sincere thanks to the Course
Audit Professor Dr. Lakshmi C, Professor, Department of Computational
Intelligence, and the Course Coordinators for their constant encouragement and support.
We are highly thankful to our course project faculty Dr. Antony Sophia
N, Assistant Professor, CINTEL, for the assistance, timely suggestions,
and guidance throughout the duration of this course project. We extend our
gratitude to our HOD Dr. Annie Uthra, Head and Professor, CINTEL, and our
departmental colleagues for their support.
Finally, we thank our parents, friends, and near and dear ones who directly and
indirectly contributed to the successful completion of our project. Above all, we
thank the Almighty for showering his blessings on us to complete our course
project.
TABLE OF CONTENTS
1) Abstract
2) Chapter 1 - Introduction
1.1 Introduction
1.2 Problem Statement
1.3 Methodology
1.3.1 Chatbot
1.3.2 Rasa
1.3.3 Flask
1.3.4 Chat Widget
1.4 Organization
3) Chapter 2 - Literature Survey
2.1 Research Papers and Articles
2.2 Python
2.3 Work Flow
2.4 Libraries
4) Chapter 3 - System Development
3.1 Proposed Model
3.2 Real Time Project
3.3 Hardware and Software Requirements
3.4 Concepts Requirements
5) Chapter 4 - Performance Analysis
4.1 Steps of the Proposed Model
6) Chapter 5 - Conclusions
5.1 Conclusion
5.2 Future Work
7) References
LIST OF FIGURES
Figure Number/Name
Figure 1 – Chatbot
Figure 2 – Spiral Model
Figure 3 – Zone of state
Figure 4 – Cases on date
Figure 5 – Growth rate
Figure 6 – Chat server
Figure 7 – Chat server process
Figure 8 – Queries in Chat server
Figure 9 – Different queries in Chat server
Figure 10 – AI Server
Figure 11 – Queries in AI Server
Figure 12 – Different queries in AI Server
Figure 13 – All models
Figure 14 – Emotion model
Figure 15 – Sentiment model
Figure 16 – Intention model
Figure 17 – Predicted emotions
Figure 18 – Predicted sentiments
Figure 19 – Predicted intentions
CHAPTER - 1
INTRODUCTION
1.1 INTRODUCTION
Figure 1- Chatbot
A chatbot is a computer program that simulates human conversation through AI. Artificial
Intelligence (AI) is increasingly integrated into our daily lives through the creation and analysis of
intelligent software and hardware, called intelligent agents. Intelligent agents can do a variety of
tasks, ranging from simple to sophisticated operations. A chatbot is a typical example of an AI system
and one of the most elementary and widespread examples of intelligent Human-Computer Interaction
(HCI). It is a computer program that responds like a smart entity when conversed with through text or
voice and understands one or more human languages by Natural Language Processing (NLP). In the
lexicon, a chatbot is defined as “A computer program designed to simulate conversation with human
users, especially over the Internet”. Chatbots are also known as smart bots, interactive agents, digital
assistants, or artificial conversation entities. Chatbots can mimic human conversation and entertain
users, but they are not built only for this. They are useful in applications such as education,
information retrieval, business, and e-commerce. They became popular because they offer many
advantages to both users and developers. Most implementations are platform-independent and instantly
available to users without requiring installation. Contact with the chatbot is spread through a user’s
social graph without leaving the messaging app the chatbot lives in, which provides and guarantees the
user’s identity. Moreover, payment services are integrated into the messaging system and can be used
safely and reliably, and a notification system re-engages inactive users. Chatbots can be added to group
conversations or shared just like any other contact, while multiple conversations can be carried
forward in parallel. Knowledge gained from using one chatbot transfers easily to other chatbots, and
data requirements are limited. For developers, the advantages include communication reliability, fast
and uncomplicated development iterations, lack of version fragmentation, and limited design effort for
the interface.
How to set up the environment
On Ubuntu 18.04, follow these steps to set up the project environment:
Step 1: Create a virtual environment
Step 2: Activate the virtual environment
Step 3: Deactivate the conda environment if it exists
Step 4: Install the requirements
Training the NLU model
Since the release of Rasa 1.0, training NLU models has become much easier with the new CLI. Train
the model by running:
rasa train nlu
Once the model is trained, test the model:
rasa shell nlu
Training the dialogue model
The biggest change in how the Rasa Core model works is that custom actions now need to run on a
separate server. That server has to be configured in an 'endpoints.yml' file. This is how to train and run
the dialogue management model:
1. Start the custom action server by running:
rasa run actions
2. Open a new terminal and train the Rasa Core model by running:
rasa train
3. Talk to the chatbot once it is loaded after running:
rasa shell
Starting the interactive training session:
To run your assistant in an interactive learning session:
1. Make sure the custom actions server is running:
rasa run actions
2. Start the interactive training session by running:
rasa interactive
Steps to run the complete project
Run the below steps in different terminals
Step 1: Run the Flask server that serves the COVID data
python app.py
Step 2: Run the Actions Server
rasa run actions
Step 3: Run the Shell to interact with bot
rasa shell
Integration with Frontend Website
Step 1: Run the Actions Server
rasa run actions
Step 2: Run the Rasa model
python -m rasa run -m ./models --endpoints endpoints.yml --port 5005 --cors "*" -vv --enable-api
Step 3: Run the Flask server for backend data
python app.py
Step 4: Run the HTTP server that serves the chatbot website.
The chatbot UI is provided in index.html, forked from another repository; it is also included for anyone
who wants to study the chatbot frontend on their own.
Here 8008 is the port number; you can change it if needed.
The chatbot is ready at http://localhost:8008
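Once the servers above are running, the bot can also be reached from a script instead of the web page. The sketch below is a minimal illustration, not part of the project code. It assumes that Rasa's REST channel is enabled in credentials.yml and that the server listens on port 5005 as in Step 2; the helper names build_payload, extract_texts, and send_message are hypothetical.

```python
import json
from urllib import request

# Assumption: the Rasa server from Step 2 listens on port 5005 and the REST
# channel is enabled in credentials.yml.
RASA_WEBHOOK_URL = "http://localhost:5005/webhooks/rest/webhook"


def build_payload(sender_id, message):
    """Encode one user message in the JSON shape the REST channel expects."""
    return json.dumps({"sender": sender_id, "message": message}).encode("utf-8")


def extract_texts(bot_responses):
    """Collect the text of every bot message, skipping entries without text."""
    return [item["text"] for item in bot_responses if "text" in item]


def send_message(sender_id, message, url=RASA_WEBHOOK_URL):
    """POST a message to the running bot and return its text replies."""
    req = request.Request(
        url,
        data=build_payload(sender_id, message),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=10) as resp:
        return extract_texts(json.loads(resp.read().decode("utf-8")))
```

Calling send_message("demo-user", "hello") with the servers running should return the bot's replies as a list of strings; the sender ID only needs to stay the same within one conversation.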
1.2 PROBLEM STATEMENT
The coronavirus outbreak has serious consequences for communities around the world. People are
concerned and have many questions. The World Health Organization provides answers to the most
frequently asked questions about coronavirus on its website. However, it can take a while to find the
right answer to a specific question. It is important that people are well informed about current
developments, because this helps to effectively reduce the spread of the virus. A chatbot can help with this.
1.3 METHODOLOGY
Software development methodologies are methods for managing project development. Many
methodological models are available, and developers must know which to use and which to avoid in
each situation. The chosen methodology should keep the project efficient and help the team manage
problems and avoid them wherever possible, which in turn supports the project goal and scope. To create
a project, you need to understand the needs of your stakeholders. A methodology is a system that
includes the steps of transforming raw data into recognized data patterns to extract user knowledge.
Figure 2- Spiral Model
The four phases of the Spiral Model are:
1. Planning:
This is the stage where requirements are gathered and risks are assessed. During this phase, we
discussed the title of the project with the project manager. Requirements and risks were determined
after a literature review of the available studies.
2. Risk Analysis:
This is the phase in which risks and alternative solutions are identified. At the end of this phase, a
prototype is created. If a risk is found in this phase, an alternative solution is suggested.
3. Engineering:
In this stage, the model was implemented.
4. Evaluation:
In this phase, we evaluate the software after the system is demonstrated and test whether it meets
the expectations and requirements. If an error occurs, we can report the problem through the
system.
1.3.1 Chatbot
Chatbots hold great promise for providing users with fast and convenient support that responds
directly to their questions. The most common motivation for chatbot users is productivity,
while other motivations are entertainment, social features, and interaction with novel
features. To serve all of these motivations, a chatbot should be constructed in such a way
that it acts as a tool, a toy, and a friend at the same time. Reduced customer service costs
and the ability to handle multiple users at the same time are some of the reasons why chatbots
are so popular with businesses. Chatbots are no longer seen merely as a source of help; their
way of communicating draws them closer to users as buddies.
Pattern matching is based on stimulus-response blocks. A sentence is entered, and the response
is generated according to the user input. The main drawback of this method is that the responses
are predictable, repetitive, and lack a human touch. Also, there is usually no storage of past
responses, which can lead to looping conversations.
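To make the pattern-matching approach concrete, the sketch below maps input patterns to canned replies. It is only an illustration: the rules, the replies, and the names RULES, FALLBACK, and reply are hypothetical and are not taken from the project.

```python
import re

# Hypothetical stimulus-response rules: (compiled pattern, canned reply).
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bsymptoms?\b", re.IGNORECASE), "I can share information about symptoms."),
    (re.compile(r"\b(bye|goodbye)\b", re.IGNORECASE), "Goodbye, stay safe!"),
]
FALLBACK = "Sorry, I did not understand that."


def reply(user_input):
    """Return the reply of the first rule whose pattern matches the input."""
    for pattern, response in RULES:
        if pattern.search(user_input):
            return response
    return FALLBACK


print(reply("Hello there"))
print(reply("What are the symptoms?"))
print(reply("Tell me a joke"))
```

The same input always produces the same reply and nothing from earlier turns is stored, which is exactly the predictability and lack of memory described above.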
Natural Language Processing (NLP), an area of artificial intelligence, explores the
manipulation of natural language text or speech by computers. Knowledge of the understanding
and use of human language is gathered to develop techniques that will make computers
understand and manipulate natural expressions to perform desired tasks. Most NLP techniques
are based on machine learning.
Natural Language Understanding (NLU) is at the core of any NLP task. It is a technique to
implement natural user interfaces such as a chatbot. NLU aims to extract context and meaning
from natural language user inputs, which may be unstructured, and to respond appropriately
according to user intention. It identifies user intent and extracts domain-specific entities. More
specifically, an intent represents a mapping between what a user says and what action should be
taken by the chatbot. Actions correspond to the steps the chatbot will take when specific intents
are triggered by user inputs and may have parameters for specifying detailed information about them
[28]. Intent detection is typically formulated as sentence classification in which single or multiple
intent labels are predicted for each sentence.
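Intent detection as sentence classification can be sketched with a very small bag-of-words classifier. This is not how Rasa classifies intents internally; it only illustrates the idea of scoring a sentence against each intent and reporting a confidence. The training sentences and intent names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training examples; the intent names are illustrative only.
TRAINING = {
    "greet": ["hello", "hi there", "good morning"],
    "ask_symptoms": ["what are the symptoms", "symptoms of coronavirus"],
    "goodbye": ["bye", "see you later", "goodbye"],
}


def tokenize(text):
    """Lowercase the text, drop question marks, and split on whitespace."""
    return text.lower().replace("?", " ").split()


def bag(texts):
    """Count word occurrences over a list of sentences."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# One word-count profile per intent, built from its example sentences.
PROFILES = {intent: bag(examples) for intent, examples in TRAINING.items()}


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(sentence):
    """Return (best intent, confidence) for one sentence."""
    query = bag([sentence])
    scores = {intent: cosine(query, profile) for intent, profile in PROFILES.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For example, classify("what are the symptoms?") selects ask_symptoms, while a sentence with no known words gets a confidence of 0.0, which a real system would treat as a reason to fall back.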
1.3.2 RASA
Rasa is an open-source machine learning framework for automated text and voice-based
conversations. It understands messages, holds conversations, and connects to messaging
channels and APIs.
It is a tool for building custom AI chatbots using Python and natural language
understanding (NLU). It also allows the user to train the model and add custom actions.
Its main characteristics are:
- It is a conversational AI framework that serves as a complete chatbot solution.
- The Rasa stack is open source.
- It is transparent, which means we know exactly what is happening under the hood
and can customize things as much as we want.
- It is one of the most effective and time-efficient tools for building complex chatbots
quickly.
1.3.3 Flask
Flask is a micro web framework written in Python. It is classified as a microframework because
it does not require particular tools or libraries. It has no database abstraction layer, form
validation, or any other components where pre-existing third-party libraries provide common
functions. However, Flask supports extensions that can add application features as if
they were implemented in Flask itself. Extensions exist for object-relational mappers,
form validation, upload handling, various open authentication technologies, and several
common framework-related tools.
Applications that use the Flask framework include Pinterest and LinkedIn.
It is a small and lightweight Python web framework that provides useful tools and
features that make creating web applications in Python easier. It gives developers
flexibility and is a more accessible framework for new developers, since a web
application can be built quickly using only a single Python file.
1.3.4 Chat Widget
A chat widget is a window on your website that allows visitors to have a conversation
with a sales or service representative in real time. It usually pops up in the bottom right
corner of a web page and prompts visitors to chat with a representative of the business. While
most chat widgets today are staffed by live chat agents, chatbots are quickly becoming the
preferred choice for companies that need to scale their chat capabilities or offer 24/7
support.
Although it is most commonly found on websites, a live chat widget can also be
integrated into social media pages and mobile apps.
More than half of all customers said they preferred to chat online with a company instead of
using other options such as email, phone support, or social media. This is because, unlike other
channels, chat widgets allow visitors to reach out to a company at the time they have a
question and get an immediate response.
1.4 ORGANIZATION
CHAPTER 2
LITERATURE SURVEY
2.1 RESEARCH PAPERS AND ARTICLES:
1. A research paper, "An Overview of Chatbot Technology", was published at the IFIP International
Conference on Artificial Intelligence Applications and Innovations in May 2020. The authors of the paper
were Eleni Adamopoulou and Lefteris Moussiades.
2. A research paper, "An Analytical Study and Review of Open Source Chatbot Framework, RASA", was
published in IJCRT in June 2020. The authors of the article were Manoj Joshi and Rakesh Kumar Sharma.
3. An article named "A Chatbot System" was published in Hindawi. The author of the article is Filippo Gandino.
2.2 PYTHON
2.2.1 Introduction
2.2.2 History of Python
Python was created by Guido van Rossum at Centrum Wiskunde & Informatica (CWI) in the Netherlands.
Python 3.9.2 and 3.8.8 were expedited because all versions of Python had security issues that could
lead to remote code execution and web cache poisoning.
2.3 WORKFLOW
2.4 Neo4j
2.4.1 Neo4j is an open-source graph database developed using Java technology. It
is highly scalable and schema-free (NoSQL). Unlike traditional databases, which store
data in rows, columns, and tables, it has a completely flexible structure and stores the
relationships that connect data.
The architecture is designed for optimal management, storage, and traversal of nodes and
relationships. Each node contains direct pointers to all the nodes that are connected to it by
relationships.
2.4.2 Graph databases - A graph database is a database used to model the data in the form
of a graph. Here, the nodes of a graph depict the entities while the relationships depict
the association of these nodes.
2.4.3 Why graph database? - Nowadays, most data exists in the form of relationships
between different objects, and often the relationship between the data is more valuable
than the data itself. Other databases, even the more recent NoSQL types, do not store
relationships directly; they usually search through a separate data structure called an
"index", which is expensive and makes the database slower, whereas graph databases store
relationships and connections as first-class entities. A graph database can also retrieve
(traverse) connected data faster, by using an index only to find the starting point and then
just following the memory pointers.
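The idea of following direct pointers instead of consulting an index can be sketched in plain Python. This is a toy in-memory model, not Neo4j's storage engine; the Node class, the reachable function, and the sample data are hypothetical.

```python
from collections import deque


class Node:
    """A graph entity that stores its relationships as direct object references."""

    def __init__(self, name):
        self.name = name
        self.relationships = []  # list of (relationship type, target Node)

    def relate(self, rel_type, target):
        """Add a directed relationship from this node to the target node."""
        self.relationships.append((rel_type, target))


def reachable(start, rel_type):
    """Return names of nodes reachable from start via rel_type, breadth-first."""
    seen = {id(start)}
    queue = deque([start])
    found = []
    while queue:
        current = queue.popleft()
        # Each hop only follows references held by the current node.
        for kind, target in current.relationships:
            if kind == rel_type and id(target) not in seen:
                seen.add(id(target))
                found.append(target.name)
                queue.append(target)
    return found


# Hypothetical data for illustration only.
alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.relate("KNOWS", bob)
bob.relate("KNOWS", carol)
carol.relate("KNOWS", alice)  # a cycle, handled by the seen set
print(reachable(alice, "KNOWS"))  # prints ['Bob', 'Carol']
```

After the starting node is located, the cost of each hop depends only on the number of relationships on the current node, not on the total size of the data set.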
2.4.4 The query language used - Cypher
Cypher is a powerful declarative query language. It uses ASCII-art syntax.
It is easy to learn and can be used to create and retrieve relations between data without using
complex queries like joins.
2.4.5 Use cases - Detecting fraud, enhancing AI, managing supply, unifying silos, and
many more.
2.5 LIBRARIES
2.5.1 NumPy
NumPy stands for Numerical Python, a Python package for computing with and processing
one-dimensional and multidimensional arrays. Travis Oliphant created the NumPy package in 2005 by
incorporating the features of the competing Numarray module into the earlier Numeric module.
It is a Python extension module written primarily in C. It has various functions that enable high-speed
numerical calculations. NumPy provides a variety of high-performance data structures that implement
multidimensional arrays and matrices. These data structures are used for optimized mathematical
calculations.
2.5.2 Pandas
Pandas is an open-source Python library that provides powerful data manipulation capabilities. The name
of the library comes from the term panel data, an econometrics term for multidimensional
data. It is used for data analysis in Python and was developed by Wes McKinney in 2008.
Data analysis requires a lot of processing, and several tools are available for it. We prefer pandas
because it is faster, easier, and more expressive than other tools.
Pandas is built on top of the NumPy package, which means NumPy is required for pandas to work.
Before pandas was introduced, Python was well suited to data preparation but had limited support for
data analysis. That is where pandas came in, increasing the potential for data analysis.
Regardless of the source of the data, you can perform the five key steps needed to process and analyze
it: loading, manipulation, preparation, modeling, and analysis.
CHAPTER 3
SYSTEM DEVELOPMENT
3.1 Project Development
3.1.1 Zone of the state
Figure 3- Zone of a state
3.1.2 Cases on date
Figure 4- cases on date
3.1.3 Growth rate of states
Figure 5- growth rate
3.2 REAL-TIME PROJECT
3.2.1 ADAMBOT - Adam is an AI chat server that handles user queries and answers them accordingly.
The Adam chat server handles user queries using Rasa, a deep learning model, and a transformer
model. To process a request, two services are running: the Chat Server and the AI Server.
The Chat Server is exposed to the outside world and accepts user requests. Out of the box, it
processes each request using the Rasa model.
A user request first goes through the Rasa model; if Rasa does not predict the intent with good
confidence, the same request is processed by the AI Server, which is trained on another 120
intents.
If the deep learning model fails, the response is generated by the transformer model.
The project is built using a FastAPI server. APIs are exposed to communicate with the server and to
predict the intent.
3.2.1.1 Rasa Model
We use Rasa as a Python library, with a model that predicts the intents.
A user request is first served by the Rasa model; if Rasa predicts the intent with a confidence
above the configured threshold, its response is returned.
Because confidence is calculated for every query, a query that answers a previously opened slot
(for example, the user's email address) can have low confidence.
In such cases we avoid processing the request with the AI Server and return only the Rasa
response.
Otherwise, if the confidence is low or the intent is out_of_scope, the request is processed by the
AI Server.
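The routing rule described in this subsection can be written as a small decision function. The sketch below is an assumption-laden illustration: the report does not state the threshold value, so 0.6 is a placeholder, and the RasaResult fields and the needs_ai_server function are hypothetical names, not the project's actual API.

```python
from dataclasses import dataclass

# Assumption: the report does not give the threshold; 0.6 is a placeholder.
CONFIDENCE_THRESHOLD = 0.6
OUT_OF_SCOPE = "out_of_scope"


@dataclass
class RasaResult:
    """Hypothetical summary of what the Rasa model predicted for one query."""
    intent: str
    confidence: float
    answers_open_slot: bool = False


def needs_ai_server(result, threshold=CONFIDENCE_THRESHOLD):
    """Decide whether a query should be forwarded from Rasa to the AI Server."""
    # Slot answers (such as an email address) often score low but are still valid,
    # so they always stay with Rasa.
    if result.answers_open_slot:
        return False
    # Forward when Rasa is unsure or explicitly reports an out-of-scope query.
    return result.intent == OUT_OF_SCOPE or result.confidence < threshold
```

Checking the open-slot case first is what keeps a low-confidence slot answer from being sent to the AI Server.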
3.2.1.2 AI-Server - ELMo with BiLSTM deep learning model
The AI Server handles the user query if the Rasa model fails to predict the intent (that is, if
*out_of_scope* is detected or the predicted confidence is lower than the set threshold value).
The AI Server processes the request using a deep learning model that is trained on 120 intents
(different user queries) with the help of ELMo and deep learning techniques.
If the deep learning model fails to predict the intent correctly, the request is handled by the
transformer model.
3.2.1.3 Transformer model
This model is trained on all the user data; based on what it has learned, it can generate an
answer automatically from the provided training data.
Before the request is passed to the transformer model, the toxicity of the question is checked
to avoid toxic answers. If the request is not toxic, the transformer model processes the
user question and generates an answer. The generated answer is also checked for
toxicity; if it is acceptable, it is returned.
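The two toxicity checks around the transformer model can be sketched as a wrapper around any generator function. This is an illustration under stated assumptions: the report does not say what is returned when a check fails, so the REFUSAL message is a placeholder, and guarded_generate, generate, and is_toxic are hypothetical names standing in for the real models.

```python
# Assumption: the report does not specify the reply used when a check fails.
REFUSAL = "Sorry, I cannot help with that."


def guarded_generate(question, generate, is_toxic):
    """Generate an answer only if both the question and the answer pass the check.

    generate: callable mapping a question string to an answer string.
    is_toxic: callable mapping a string to True when it should be blocked.
    """
    # Check the question first so toxic input never reaches the generator.
    if is_toxic(question):
        return REFUSAL
    answer = generate(question)
    # Check the generated answer before returning it to the user.
    if is_toxic(answer):
        return REFUSAL
    return answer


def demo_is_toxic(text):
    """Toy stand-in for a toxicity model: flags one placeholder word."""
    return "badword" in text.lower()


def demo_generate(question):
    """Toy stand-in for the transformer model: echoes the question."""
    return "You asked: " + question


print(guarded_generate("What is a chatbot?", demo_generate, demo_is_toxic))
```

Passing the two models in as functions keeps the gating logic independent of whichever toxicity classifier and generator are actually deployed.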
3.2.1.4 Project Development
Figure 6 – Chat Server
Figure 7 – Chat Server process
Figure 8 – queries in Chat Server
Figure 9 – Different queries in Chat Server
Figure 10 – AI Server
Figure 11 – Queries in AI Server
Figure 12- Different queries in AI server
3.2.2 ADAM-ABSA
This is an application that serves three deep learning models, which detect emotions, intentions,
and sentiment in user-written text.
It processes the provided text with all the models and returns a combined response.
3.2.2.1 Intention Model
Detects the user's intention from the text. It processes the given query against the intention
model and returns the predicted intentions.
The model request contains:
- query: the text from which intentions are to be predicted
- client ID
- sender ID
- message ID: requested from the user; if the request is sent from the extension, a
unique msgID is generated.
- source: indicates where the request was made from
Response: the predicted intention, confidence, start index, and end index.
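The request and response fields listed above can be modelled as simple data classes. The sketch below is hypothetical: the field names are normalized from the list, the message ID is generated whenever none is supplied (a simplification of the extension rule), and the end index is assumed to be exclusive; none of this is confirmed by the project code.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntentionRequest:
    """Hypothetical request body for the intention model."""
    query: str
    client_id: str
    sender_id: str
    source: str
    message_id: Optional[str] = None

    def __post_init__(self):
        # Assumption: generate a unique message ID whenever the caller sends none.
        if self.message_id is None:
            self.message_id = uuid.uuid4().hex


@dataclass
class IntentionPrediction:
    """Hypothetical response entry: one predicted intention and its text span."""
    intention: str
    confidence: float
    start_index: int
    end_index: int  # assumed to be exclusive


def span_text(request, prediction):
    """Return the part of the query that the prediction refers to."""
    return request.query[prediction.start_index:prediction.end_index]
```

With these classes, a prediction whose indices cover the first ten characters of the query "I want to buy a phone" yields the span "I want to ".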
3.2.2.2 Emotion Model
Processes the text against this model and detects the user's emotion.
The model request contains:
- query: the text from which the user's emotion is to be detected
- client ID
- sender ID
- message ID: requested from the user; if the request is sent from the extension,
a unique msgID is generated.
- source: indicates where the request was made from
Response: the predicted emotion name, confidence, start index, and end index.
3.2.2.3 Sentiment Model
- Statement Model: Checks whether the given text is a statement type and of what nature ( Positive,
Negative, Neutral)
- Question Model: Checks whether the given text is a question type and of what nature ( Positive,
Negative, Neutral)
3.2.2.4 PROJECT DEVELOPMENT
Figure 13- All models
Figure 14- Emotion model code
Figure 15- Sentiment model code
Figure 16- Intention model code
Figure 17 – Predicted Emotions
Figure 18 – Predicted sentiments
Figure 19 – Predicted Intentions
3.3 HARDWARE AND SOFTWARE REQUIREMENTS:
Hardware Requirements
3.3.2 x86-64 processor, Intel Core i3 or better
3.3.3 Minimum 4 GB RAM
3.3.4 Windows 7 and above
Software Requirements
3.3.5 Jupyter Notebook
3.3.6 Visual Studio Code
3.4 CONCEPTS REQUIREMENTS
Artificial Intelligence
Chatbot
Rasa
Flask
Chat Widget
NumPy
Pandas
CHAPTER-4
PERFORMANCE ANALYSIS
4.1 STEPS OF THE PROPOSED PROCESS
Figure 5- Process
CHAPTER-5
CONCLUSIONS
5.1 CONCLUSION
There is one solution primed to satisfy the modern customer, and that is a chatbot. With a
chatbot, your organization can easily offer high-quality support and conflict resolution at any time
of day, and for a large number of customers simultaneously.
According to Microsoft, 90% of consumers expect an online portal for customer service. As a significant
aspect of business evolution, the need for AI-powered chatbots will only continue to rise. Now is the
time to deploy a chatbot solution so that your company doesn’t get left behind.
5.2 FUTURE SCOPE
The bot still needs a lot of training data. We can integrate Rasa X for this purpose; it has not yet been
added to this project. Rasa X is a toolset that layers on top of Rasa Open Source, making it easier to review
conversations, identify next steps in development, and create new training data to improve far beyond the
first version of your assistant.
REFERENCES
https://www.researchgate.net/publication/340534832_An_Intelligent_Chatbot_System_Based_on_Entity_Extraction_Using_RASA_NLU_and_Neural_Network
https://ijsrcseit.com/paper/CSEIT218331.pdf
https://arxiv.org/abs/1712.05181
https://github.com/G-Slient/rasa-covid19-chatbot
https://www.youtube.com/channel/UCJ0V6493mLvqdiVwOKWBODQ
For Frontend-Ui code: https://github.com/JiteshGaikwad/Chatbot-Widget
https://stackoverflow.com