Lecture7_Neural Networks_and_analysis2024.pdf

2031ICT Data Analytics Methods Lecture 7: AI and Neural Networks

Outline § Development of Artificial Intelligence (AI) § Neural networks

Development of AI § Research heavily funded by the US Department of Defence in the 1960s § Progress slowed in the late 1970s: AI winter § 1980s: expert systems, fifth-generation computers, then another AI winter § Late 1990s to early 2000s: logistics, data mining, medical diagnosis, computer games § Since the late 2000s, access to large amounts of data and faster computers has enabled advances in machine learning and perception. § Another AI spring: investment in technology innovation.

http://www.dailymail.co.uk/sciencetech/article-4560824/AI-outperform-humans-tasks-just-45-years.html#v-2224705489147480236 Early surprises – Beating the Go and Chess Masters AlphaGo is a computer program that plays the board game Go. It was developed by DeepMind Technologies, which is a subsidiary of Google. It beat human champion Lee Sedol in 2016.

Early surprises – Beating the Go and Chess Masters In fact, it was not the first time that a computer beat a human champion at a board game. In 1997, the IBM supercomputer Deep Blue defeated the world chess champion Garry Kasparov by 3½–2½ in a 6-game match.

Search space - Chess: $35^{80}$, Go: $250^{150}$. High complexity
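To get a feel for these search-space figures, the arithmetic can be sketched in a few lines of Python. The reading used here is an assumption not stated on the slide: each figure is an approximate number of legal moves per turn (35 for chess, 250 for Go) raised to an approximate game length in moves (80 and 150).

```python
# Assumed reading (not stated on the slide): moves per turn ** game length.
chess_space = 35 ** 80
go_space = 250 ** 150

# Python integers have arbitrary precision, so digits can be counted exactly.
chess_digits = len(str(chess_space))
go_digits = len(str(go_space))

print(chess_digits)  # 124
print(go_digits)     # 360
```

Under this reading the Go figure is more than 230 orders of magnitude larger than the chess figure.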

Recent Breakthroughs – Beating the World Game Champions https://www.youtube.com/watch?v=tfb6aEUMC04

When Will AI Exceed Human Performance? Prediction from AI Experts A survey was run at the International Conference on Machine Learning in July 2015 and the Neural Information Processing Systems conference in December 2015. It received 352 responses. Black dots show the mean guess; black lines show the range of answers.

When Will AI Exceed Human Performance? Prediction from AI Experts Happened several months after survey

Recent Breakthroughs – ChatGPT and more § A conversational AI model that § Comprehends user inputs (text data). § Generates coherent and contextually relevant responses. § GPT stands for “Generative Pre-trained Transformer” § “Generative”: able to generate new text based on what it has learned § “Pre-trained”: § Uses a large corpus: hundreds of terabytes of (pre-processed) text data. § Uses unsupervised learning to derive attributes or labels from the data without explicit human annotation, and then supervised learning on labelled data § Integrating the training process (e.g., from GPT-3 to GPT-3.5 to GPT-4) § “Transformer”: a type of neural network architecture that captures dependencies and relationships between different words in a sequence of text. § Next: OpenAI’s Sora, which generates videos from text inputs § Great impact on people’s daily lives

How the brain works § All the breakthroughs mentioned earlier relate to (artificial) neural networks, which imitate the human brain § A typical brain contains close to 100 billion minuscule cells called neurons. Each neuron is made up of a cell body with a number of connections coming off it: numerous dendrites (the cell’s inputs, carrying information toward the cell body) and a single axon (the cell’s output, carrying information away). § Dendrites extend from the neuron cell body and receive messages from other neurons. When neurons receive or send messages, they transmit electrical impulses along their axons that aid in carrying out functions such as storing memories, controlling muscles, and more. https://clevertap.com/blog/neural-networks/

Neural networks § Artificial neural networks (ANNs), usually simply called neural networks (NNs), are computing systems inspired by the biological neural networks that constitute animal brains. § An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. § Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives a signal then processes it and can signal neurons connected to it.

Neural networks § The "signal" at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. § The connections are called edges. § Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. § Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically, neurons are aggregated into layers. § Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.

Neural networks – input and initialisation [Figure: 28 × 28 pixel input image; output nodes numbered 1 to 10] • The numbers of nodes in the input and output layers are normally fixed. The sizes of the hidden layers can be changed; they are hyper-parameters of the neural network that should be tuned for the best recognition performance. • In the digit recognition example, the input image is 28 by 28 pixels, so the input layer has 784 nodes, each corresponding to one pixel. • The input layer then passes the input to the first hidden layer. • The output layer has 10 nodes, corresponding to the 10 digit classes from 0 to 9.
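As a minimal sketch of how a 28 by 28 image becomes 784 input values, the image can be flattened row by row. The image below is a made-up example, and scaling pixel values from 0–255 down to 0–1 is an assumption (a common convention, not something the slide specifies).

```python
def flatten_image(image):
    """Turn a 2D grid of pixel values into one flat list, row by row."""
    return [pixel for row in image for pixel in row]


# Hypothetical 28x28 grayscale image: all black except one white pixel.
image = [[0] * 28 for _ in range(28)]
image[3][5] = 255

# Assumed preprocessing: scale 0..255 pixel values to the range 0..1.
inputs = [pixel / 255 for pixel in flatten_image(image)]

print(len(inputs))          # 784
print(inputs[3 * 28 + 5])   # 1.0
```

Each of the 784 values would feed one node of the input layer.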

Neural networks – hidden layer § Think of each individual node in the hidden layer as its own linear regression model, composed of input data, weights, a bias (or threshold), and an output. For 3 input nodes, the formula would look something like this: $\sum_{i=1}^{3} w_i x_i + \text{bias} = w_1 x_1 + w_2 x_2 + w_3 x_3 + \text{bias}$ § The weights $w_i$ help to determine the importance of the corresponding input $x_i$, with larger weights contributing more significantly to the output than smaller ones. The weights are assigned randomly at the beginning, and they are updated during the learning process. § All inputs are multiplied by their respective weights and then summed.
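The weighted sum for a node with 3 inputs can be sketched as follows. The input values, weights, bias, random range and seed are all illustrative assumptions, not values from the lecture.

```python
import random


def weighted_sum(inputs, weights, bias):
    """Multiply each input by its weight, add the products, then add the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Weights start out random; the range and seed here are arbitrary choices.
rng = random.Random(42)
initial_weights = [rng.uniform(-1.0, 1.0) for _ in range(3)]

# Hand-picked (hypothetical) values so the arithmetic is easy to follow.
x = [1.0, 0.0, 1.0]
w = [0.5, -0.2, 0.25]
bias = -0.5
z = weighted_sum(x, w, bias)  # 0.5*1 + (-0.2)*0 + 0.25*1 - 0.5

print(z)  # 0.25
```

The second input is zero, so its weight has no effect on this particular sum.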

Neural networks - activation § Afterward, the weighted sum is passed through an activation function, which determines the node’s output. § If that output exceeds a given threshold, it “fires” (or activates) the node, passing data to the next layer in the network. This results in the output of one node becoming the input of the next node. § This process of passing data from one layer to the next defines this neural network as a feedforward network. Activation function for the linear input function:
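A threshold activation of the kind described above might be sketched like this. The function name `fires`, the threshold of 0 and all numeric values are assumptions made for illustration; the slide's own formula is not available here.

```python
def fires(inputs, weights, bias, threshold=0.0):
    """Return 1 if the weighted sum plus bias exceeds the threshold, else 0."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > threshold else 0


# First node: weighted sum is 0.25, which exceeds 0, so the node fires.
first = fires([1.0, 0.0, 1.0], [0.5, -0.2, 0.25], bias=-0.5)

# The output of the first node becomes the input of the next node.
second = fires([first], [1.0], bias=-0.5)

# With all-zero inputs only the negative bias remains, so nothing fires.
silent = fires([0.0, 0.0, 0.0], [0.5, -0.2, 0.25], bias=-0.5)

print(first, second, silent)  # 1 1 0
```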

Neural networks – model § There are different types of activation functions
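A few commonly used activation functions are sketched below. Which functions the slide's figure actually showed is not recoverable, so this selection (step, sigmoid, tanh, ReLU) is an assumption.

```python
import math


def step(z):
    """Binary threshold: 1 for positive input, otherwise 0."""
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```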

Neural networks - training § Information in the network is passed from layer to layer until it reaches the last layer. The node in the last layer with the highest value is taken as the classification outcome. § Errors happen initially, e.g. a 5 may be classified as a different digit. § The error in the output is back-propagated through the network and the weights are adjusted to minimise the error rate. § The error is calculated by a cost function. You keep adjusting the weights until the network fits the training examples you put in. § As training continues, the loss on the training set keeps dropping. § This may cause overfitting. § A validation set is used to determine when to stop training.
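The training loop described above can be sketched on a toy problem. Everything specific here is an assumption chosen for illustration: the task (label a point as 1 when its two coordinates sum to more than 1), a single hidden layer of 3 sigmoid nodes, mean squared error as the cost function, per-example weight updates, and a simple "patience" rule for deciding when the validation loss has stopped improving.

```python
import copy
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, net):
    """Feed x through one hidden layer; return (hidden outputs, final output)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["W1"], net["b1"])]
    out = sigmoid(sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"])
    return hidden, out


def cost(data, net):
    """Cost function: mean squared error over a data set."""
    return sum((forward(x, net)[1] - t) ** 2 for x, t in data) / len(data)


def train(train_set, val_set, n_hidden=3, lr=0.5, max_epochs=500, patience=20, seed=0):
    rng = random.Random(seed)
    n_in = len(train_set[0][0])
    net = {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }
    train_losses = [cost(train_set, net)]
    best_net, best_val, stale = copy.deepcopy(net), cost(val_set, net), 0
    for _ in range(max_epochs):
        for x, t in train_set:
            hidden, out = forward(x, net)
            # Error signal at the output node, then propagated back to
            # each hidden node through the connecting weight.
            d_out = 2.0 * (out - t) * out * (1.0 - out)
            d_hid = [d_out * net["W2"][j] * hidden[j] * (1.0 - hidden[j])
                     for j in range(n_hidden)]
            for j in range(n_hidden):
                net["W2"][j] -= lr * d_out * hidden[j]
                net["b1"][j] -= lr * d_hid[j]
                for i in range(n_in):
                    net["W1"][j][i] -= lr * d_hid[j] * x[i]
            net["b2"] -= lr * d_out
        train_losses.append(cost(train_set, net))
        val_loss = cost(val_set, net)
        if val_loss < best_val:
            best_net, best_val, stale = copy.deepcopy(net), val_loss, 0
        else:
            stale += 1
            if stale >= patience:  # validation loss stopped improving
                break
    return best_net, best_val, train_losses


# Toy data: points on a 6x6 grid, labelled 1 when the coordinates sum to more than 1.
points = [([i / 5, j / 5], 1.0 if i + j > 5 else 0.0) for i in range(6) for j in range(6)]
val_set = points[::4]
train_set = [p for k, p in enumerate(points) if k % 4 != 0]

best_net, best_val, train_losses = train(train_set, val_set)
print(f"training loss: {train_losses[0]:.3f} -> {train_losses[-1]:.3f}")
print(f"best validation loss: {best_val:.3f}")
```

The network that is returned is the one with the lowest validation loss seen during training, not necessarily the last one, which is one simple way of using a validation set to decide when to stop.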

Neural networks – training example Click here to try

Types of neural networks - perceptron § The perceptron, created by Frank Rosenblatt in 1958, is the oldest neural network. It has a single neuron.
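A single-neuron perceptron can be sketched with the classic perceptron learning rule. The task (the logical AND of two inputs), the learning rate of 1 and the zero starting weights are assumptions chosen to keep the example small.

```python
def predict(x, weights, bias):
    """Single neuron with a threshold at 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(data, lr=1.0, max_epochs=20):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in data:
            error = target - predict(x, weights, bias)
            if error != 0:
                # Nudge the weights toward the correct answer.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:  # every example classified correctly
            break
    return weights, bias


# Logical AND: the output is 1 only when both inputs are 1.
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_data)
predictions = [predict(x, weights, bias) for x, _ in and_data]

print(predictions)  # [0, 0, 0, 1]
```

AND is linearly separable, which is why a single neuron is enough here.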

Types of neural networks – feedforward networks § Feedforward neural networks, or multi-layer perceptrons (MLPs), comprise an input layer, one or more hidden layers, and an output layer. Data is usually fed into these models to train them, and they are the foundation for many other neural networks.
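The layer structure of an MLP can be sketched by building one from a list of layer sizes. The sizes 784 and 10 follow the digit example from earlier in the lecture; the hidden layer of 16 nodes, the sigmoid activation and the random weight range are assumptions. The network is untrained, so its prediction is meaningless; the point is only the shape of the computation.

```python
import math
import random


def build_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def count_parameters(layers):
    """Total number of adjustable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


def feed_forward(x, layers):
    """Pass x through every layer in turn, applying a sigmoid at each node."""
    for weights, biases in layers:
        x = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
             for row, b in zip(weights, biases)]
    return x


layers = build_mlp([784, 16, 10])      # input, one hidden layer, output
outputs = feed_forward([0.0] * 784, layers)
predicted = outputs.index(max(outputs))  # node with the highest value

print(count_parameters(layers))  # 12730
print(len(outputs))              # 10
```

The parameter count is 784 × 16 + 16 for the hidden layer plus 16 × 10 + 10 for the output layer.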

Types of neural networks – convolutional networks § Convolutional neural networks (CNNs) are widely used for pattern recognition and computer vision. They use convolutional layers to extract image features. They are also among the most commonly used network structures in deep learning. https://towardsdatascience.com/covolutional-neural-network-cb0883dd6529 [Figure: 2D convolution]
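A 2D convolution of the kind used in convolutional layers can be sketched as a small kernel sliding over an image. The image and the vertical-edge kernel below are made-up examples, and the sketch assumes no padding and a stride of 1. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, no kernel flip)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the kernel by the image patch under it and add up.
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(k_rows) for j in range(k_cols)))
        output.append(row)
    return output


# Hypothetical 4x5 image: dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1] for _ in range(4)]

# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The output is zero where the patch is uniformly dark and positive where the patch spans the edge.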

Types of neural networks – recurrent neural networks § A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes form a directed or undirected graph along a temporal sequence (a time-ordered sequence of observations). This allows it to exhibit temporal dynamic behaviour. § Recurrent neural networks (RNNs) are widely used on time-series data to make predictions about future outcomes, such as stock market predictions or sales forecasting. https://tex.stackexchange.com/questions/494139/how-do-i-draw-a-simple-recurrent-neural-network-with-goodfellows-style
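The recurrence can be sketched with a single hidden value that is carried from one time step to the next. The scalar state, the tanh activation and the weight values are assumptions made to keep the example minimal; practical RNNs use vectors and weight matrices.

```python
import math


def rnn_forward(sequence, w_in, w_rec, bias, h0=0.0):
    """Process a sequence one step at a time, carrying a hidden state forward."""
    h = h0
    states = []
    for x in sequence:
        # The new state depends on the current input AND the previous state.
        h = math.tanh(w_in * x + w_rec * h + bias)
        states.append(h)
    return states


# A single pulse followed by silence: the state remembers the pulse and fades.
states = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.5, bias=0.0)
print([round(h, 3) for h in states])
```

Even though the second and third inputs are zero, the hidden state stays positive because it still carries information from the first input.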

§ End of the Lecture 7 § Thank you!
