“Are magic and technology really so different?” This is a question I have been trying to answer for a long time. For as long as I can remember, new advancements in robotics and computers have fascinated me. I wanted to innovate in the fields of AI and robotics, just like Elon Musk and Boston Dynamics. I thought, and still think in wonder, “How do people even make machines like that?” So, I started learning about it, especially machine learning, one of the most useful branches of AI.
Machine learning is broadly divided into two fields: supervised and unsupervised learning. Supervised learning is where the answers (labels) are given with the data, so the computer can learn from the correct answers. For example, classifying dogs and cats based on labelled images is supervised. Unsupervised learning has no labels. An example of this is Google News, which collects different news articles and clusters them by topic.
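To make the distinction concrete, here is a minimal sketch using scikit-learn; the toy data and the two features are entirely made up for illustration:

```python
# Supervised vs unsupervised learning, side by side.
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Supervised: the features come with labels (0 = cat, 1 = dog).
X = [[4.0, 30], [4.5, 35], [9.0, 60], [10.0, 70]]
y = [0, 0, 1, 1]
clf = LogisticRegression().fit(X, y)
print(clf.predict([[5.0, 32]]))  # the model learnt from the given answers

# Unsupervised: the same features, but no labels. The algorithm
# finds groups on its own, like Google News clustering by topic.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)
```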
I decided to start with supervised learning. The basic machine learning model applies an activation function to (weights * input + biases), where the weights and biases are variables. The job of the model is to find the best possible values for the weights and biases. This may seem simple, but complex networks can have over two lakh (200,000) variables. To find out how well the model has performed, the accuracy and loss are calculated on the validation data. Loss is a measure of how far off the model's predictions are; the lower the loss, the better. An optimiser then tweaks the variables to try to increase accuracy and reduce loss.
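Here is a minimal sketch of that loop in TensorFlow; the toy data, the sigmoid activation, and the learning rate are just assumptions for illustration:

```python
import tensorflow as tf

# Toy data: 4 examples with 3 features each, and binary labels.
x = tf.random.normal((4, 3))
y = tf.constant([[0.0], [1.0], [1.0], [0.0]])

# The variables the model must tune: weights and biases.
w = tf.Variable(tf.random.normal((3, 1)))
b = tf.Variable(tf.zeros((1,)))

optimiser = tf.keras.optimizers.SGD(learning_rate=0.1)

for step in range(100):
    with tf.GradientTape() as tape:
        # activation(weights * input + biases)
        pred = tf.sigmoid(tf.matmul(x, w) + b)
        # Loss: how far the predictions are from the answers.
        loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y, pred))
    # The optimiser tweaks the variables to reduce the loss.
    grads = tape.gradient(loss, [w, b])
    optimiser.apply_gradients(zip(grads, [w, b]))
```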
The simplest model is linear regression. Try this out: I generated my own dataset and used TensorBoard to visualise the model. The equation outputs a line (weights * input + bias). The line shows the data trend, and can predict the output for any input. When you are starting with machine learning, Python 3 is the way to go. Compared to Java and C-based languages, the learning curve is much gentler. It also has rich support for machine learning libraries like TensorFlow, Keras, scikit-learn, etc. I use TensorFlow for all my projects.
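A minimal Keras version can be this short; the data here is generated around an assumed line y = 3x + 2, not my original dataset:

```python
import numpy as np
import tensorflow as tf

# Generate toy data around the (assumed) line y = 3x + 2, with noise.
x = np.random.rand(100, 1).astype(np.float32)
y = 3 * x + 2 + 0.1 * np.random.randn(100, 1).astype(np.float32)

# One weight and one bias: output = weight * input + bias.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
model.compile(optimizer="sgd", loss="mse")
model.fit(x, y, epochs=200, verbose=0)

w, b = model.layers[0].get_weights()
print(w, b)  # should land near 3 and 2
```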
Next comes logistic regression, where an activation function (either softmax or sigmoid) is applied. Sigmoid is used for binary classification, and softmax is used for multi-class classification. These functions create more complex decision boundaries than just a line, and help in accurate classification. I generated my own data for binary classification with logistic regression. For multi-class classification, I used the Iris dataset.
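For instance, a softmax logistic regression on Iris can be set up roughly like this in Keras; the optimiser and epoch count are my assumptions:

```python
from sklearn.datasets import load_iris
import tensorflow as tf

# Iris has 4 features and 3 classes, so softmax is the activation.
# For binary classification, this would instead be a single neuron
# with activation="sigmoid" and a binary_crossentropy loss.
X, y = load_iris(return_X_y=True)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="softmax", input_shape=(4,))
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=100, verbose=0)
```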
Now come neural nets. They consist of layers of logistic regressions, where the output of one layer is the input to the next. Each layer consists of a fixed number of neurons, each of which does a weighted sum of all the neurons in the layer before it. A new activation function called ReLU is used for these middle, or hidden, layers. A good problem to get hands-on experience with this is the Titanic problem: based on data about passengers such as fare, cabin, age, and family, you have to predict whether a passenger will survive or not.
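A Titanic-style net might be sketched like this; the seven input features and the layer sizes are assumptions, since the real counts depend on how you clean the data:

```python
import tensorflow as tf

# Two ReLU hidden layers feeding a sigmoid output.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(7,)),  # hidden layer
    tf.keras.layers.Dense(16, activation="relu"),                    # hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),  # survived or not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()  # shows how many variables the layers add up to
```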
For problems where the data is not in the form of images, simple neural nets work great. But networks called Convolutional Neural Networks (CNNs) work better for images. This is because simple neural nets require the data to be flattened into one dimension, and if you flatten images, you lose the curves and shapes. CNNs do not flatten the images; instead, every layer consists of several sub-layers (feature maps). The neurons in each sub-layer do a weighted sum of only a small patch of the points above them, reusing the same weights everywhere. So, CNNs have a large number of neurons with approximately the same number of variables as simple neural nets.
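You can see this weight sharing in the variable counts. In the sketch below, a dense layer with 32 neurons on a flattened 28x28 image needs about 25,000 variables, while a convolutional layer with 32 filters produces over 21,000 neurons yet needs only 320 variables, because the same 3x3 weights slide over the whole image:

```python
import tensorflow as tf

# A dense layer on a flattened 28x28 image, versus a convolutional
# layer with 32 filters applied to the image itself.
dense = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(32),
])
conv = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), input_shape=(28, 28, 1)),
])

print(dense.count_params())  # 784 * 32 + 32 = 25,120 variables
print(conv.count_params())   # 3 * 3 * 32 + 32 = 320 variables
```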
A big part of machine learning is data cleaning, or wrangling. This is a necessary step, because data may be in the form of words, images, or even sounds, but a computer only works with numbers. There may also be unnecessary fields which can throw off the model. So we must clean the data. In fact, the outcome of many problems depends on the quality of the data, and cleaning is an essential step in machine learning.
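Here is a toy example of that kind of cleaning with pandas; the columns are Titanic-flavoured but invented for illustration:

```python
import pandas as pd

# A made-up table mixing words, numbers, and a missing value.
df = pd.DataFrame({
    "name": ["A", "B", "C"],  # unlikely to help the model
    "sex": ["male", "female", "male"],
    "age": [22.0, None, 35.0],
    "fare": [7.25, 71.28, 8.05],
})

df = df.drop(columns=["name"])  # remove fields that throw off the model
df["age"] = df["age"].fillna(df["age"].median())  # fill the missing value
df["sex"] = df["sex"].map({"male": 0, "female": 1})  # words -> numbers
print(df)  # every column is now numeric
```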
After reading this, you may start to panic. All the terms like ‘sigmoid’ and ‘weighted sum’ can be very intimidating, and the formulae make them seem even more so. However, they have simple meanings. To tackle this, I focused on the concepts and the intuition behind the formulae and equations, rather than cramming the exact derivations.
My first AI project with CNNs was classifying hand-written digits using the MNIST dataset. This is a great project to start with, as it is fairly easy. It is also great for exploring different concepts of machine learning, because it doesn’t involve much data cleaning, so you can focus on the neural net. There is not much scope for overfitting either, which I will discuss later. To get more practice, I started finding problems on Kaggle and tried my hand at those. Each had its own complications. With every problem I solved, I took another step into the field of AI. With each, I discovered there is no end to AI: there are so many subfields and techniques, and more are being discovered every day.
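For reference, an end-to-end MNIST setup can be this short; the layer sizes here are my choice for illustration, not necessarily what I used back then:

```python
import tensorflow as tf

# MNIST comes ready-made, so there is almost no data cleaning.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0  # scale pixels to [0, 1], add a channel
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3,
          validation_data=(x_test, y_test))
```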
After solving these problems, I found that overfitting is a pressing issue. It means that the model is memorising specific data points rather than learning the overall trend. This can cause a huge loss in testing, and it is one of the most persistent problems in modern-day machine learning. To address it, techniques collectively called regularisation have to be applied. One of these is learning rate decay, which reduces the learning rate of the model over time, controlling how fast it learns. Dropout is another popular method: it randomly removes some neurons from the network during training, which curbs overfitting. Penalty terms called L1 and L2 regularisation can also be added; they add a penalty to the loss function based on the size of the weights, discouraging the model from leaning too heavily on any one of them. One of the best regularising methods is batch normalisation. It tweaks the data at every layer of the network, adding some noise and giving more control to individual layers rather than the network as a whole.
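All four techniques can be dropped into a Keras model in a few lines; the specific rates and sizes below are illustrative assumptions, not tuned values:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        64, activation="relu", input_shape=(7,),
        kernel_regularizer=tf.keras.regularizers.l2(0.01)),  # L2 penalty on weights
    tf.keras.layers.BatchNormalization(),  # batch normalisation
    tf.keras.layers.Dropout(0.5),          # randomly drops neurons in training
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Learning rate decay: the learning rate shrinks as training goes on.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.01, decay_steps=1000, decay_rate=0.9)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=schedule),
              loss="binary_crossentropy")
```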
I then decided that I wanted to do an internship in the field of AI to expand my horizons. I got one at boxx.ai, a small company that provides AI solutions to small online retail stores. My task was to output all the features of a piece of clothing given its picture. The problem required two machine learning models, each of a different kind. One was multi-class classification, to predict what type of clothing it is, e.g., T-shirt, jeans, or dress; here there is only one correct answer. This was like other problems I’d done before, but much harder. The other network was multi-label classification, meaning there can be more than one correct answer, e.g., pattern (floral, polka, etc.), colour, texture, and material. My mentor at boxx.ai never left my side, and introduced me to many new concepts. This is when I learnt about Google Colab and the use of pre-trained models.
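A multi-label head on a pre-trained base looks roughly like the sketch below; MobileNetV2 and the 20 attribute labels are placeholders for illustration, not the actual setup we used:

```python
import tensorflow as tf

# A frozen pre-trained base with a multi-label head on top.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, pooling="avg")
base.trainable = False  # reuse the pre-trained weights as-is

num_attributes = 20  # hypothetical count of pattern/colour/texture labels
model = tf.keras.Sequential([
    base,
    # Sigmoid, not softmax: each label is an independent yes/no
    # question, so several can be "correct" at once.
    tf.keras.layers.Dense(num_attributes, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```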
Finally, after fixing countless bugs and clearing all the hurdles, I was able to proudly report high accuracy in the program. It felt very rewarding to work on a real-world problem in an office environment. Internships are a great way to get introduced to office work culture and to get hands-on experience with actual assignments. They are where you get to apply all the skills you have learnt, and find out how much harder everything is in the real world than on paper.
An important thing I realised is how to go about tough problems. The first step is to break the problem down into small, achievable goals. This keeps the number of bugs from becoming overwhelming. If you are stuck on one task for a long time, try looking at it from a different angle: many a time, a line of machine learning code is fine by itself but causes issues for other functions. If that still doesn’t work, take a break. Do something relaxing, or come back to the problem tomorrow. I have found that answers often hit me in an epiphany when I return to the problem with a fresh mind.
So far, this journey has been staggering. Just six months ago, I would have thought problems like these impossible. AI has opened up my mind to new concepts, new ideas, and changed my lines of thought. Solving any new problem, conquering any new frontier makes us see the world through a different lens. I feel that this is a change for the better, and I will continue to expand my imagination by challenging it.