Neural networks are a fundamental part of modern artificial intelligence and machine learning. They are inspired by the structure of the human brain and are used to recognize patterns, make predictions, and process complex data. These networks consist of interconnected layers of artificial neurons that process information in a way similar to biological neurons. They excel at solving problems in diverse fields such as computer vision, natural language processing, financial forecasting, and robotics. With advancements in computational power and algorithms, neural networks have become increasingly powerful, leading to breakthroughs in self-driving cars, real-time translation, and personalized recommendations.
In this article, we will explore how neural networks work and implement a simple one in Ruby.
What is a Neural Network?
A neural network consists of layers of neurons (also called nodes) that are connected by weights. It typically has three types of layers:
- Input Layer: Receives the raw data.
- Hidden Layers: Perform calculations using weighted connections and activation functions.
- Output Layer: Produces the final result.
Each neuron processes the input, applies an activation function, and passes the result to the next layer.
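As a rough sketch (the numbers below are arbitrary, and sigmoid is just one possible activation function), a single neuron in Ruby might look like this:

```ruby
# A single neuron: weighted sum of inputs plus a bias, passed through an activation.
# The inputs, weights, and bias here are arbitrary illustrative values.
inputs  = [0.5, 0.8]
weights = [0.4, -0.6]
bias    = 0.1

weighted_sum = inputs.zip(weights).map { |input, weight| input * weight }.sum + bias
output = 1.0 / (1.0 + Math.exp(-weighted_sum)) # sigmoid activation
puts output
```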
What is an Activation Function?
An activation function is applied to a neuron's weighted sum of inputs to produce its output, determining whether and how strongly the neuron "activates". There are many different activation functions, each with its own advantages and disadvantages.
Characteristics of a Good Activation Function
A good activation function should be:
- Non-linear: This is important because it allows the network to learn complex patterns in the data.
- Differentiable: This is important for the training process, as it allows us to calculate the gradient of the error function (a short example follows this list).
- Computationally efficient: This is important because neural networks can be very large, and we need to be able to compute the activation function quickly.
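To illustrate the differentiability point, here is a small sketch of the sigmoid function (covered below) and its derivative, which gradient-based training relies on; the formula sigma'(x) = sigma(x) * (1 - sigma(x)) is the standard closed form:

```ruby
# Sketch: sigmoid and its derivative, sigma'(x) = sigma(x) * (1 - sigma(x)).
# Gradient-based training uses derivatives like this to compute weight updates.
def sigmoid(x)
  1.0 / (1.0 + Math.exp(-x))
end

def sigmoid_derivative(x)
  s = sigmoid(x)
  s * (1 - s)
end

puts sigmoid(0.0)            # => 0.5
puts sigmoid_derivative(0.0) # => 0.25
```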
Popular Activation Functions
Sigmoid
- Outputs a value between 0 and 1.
- Often used in the output layer of a network for binary classification problems.
```ruby
class Sigmoid
  def self.call(x)
    1.0 / (1.0 + Math.exp(-x))
  end
end
```
Tanh (Hyperbolic Tangent)
- Outputs a value between -1 and 1.
- Similar to Sigmoid but centered at 0, making it easier to train.
```ruby
class Tanh
  def self.call(x)
    Math.tanh(x)
  end
end
```
ReLU (Rectified Linear Unit)
- Outputs 0 if the input is negative, and the input itself if the input is positive.
- Often used in the hidden layers of a network to help prevent the vanishing gradient problem.
```ruby
class ReLU
  def self.call(x)
    [0, x].max
  end
end
```
Softmax (For Multi-Class Classification)
- Converts outputs into probabilities.
- Computationally expensive.
```ruby
# Softmax Activation Function
class Softmax
  def self.call(values)
    exp_values = values.map { |v| Math.exp(v) }
    sum_exp = exp_values.sum
    exp_values.map { |v| v / sum_exp }
  end
end

puts Softmax.call([2.0, 1.0, 0.1]).inspect
# Output: probabilities summing to 1
```
Activation functions are a key part of neural networks, enabling them to learn complex relationships in data efficiently.
Implementing a Simple Neural Network in Ruby
Let's build a minimal neural network, a single sigmoid neuron, that takes two inputs and predicts an output.
```ruby
class SimpleNeuralNetwork
  attr_accessor :weights, :bias

  def initialize
    @weights = [rand, rand] # Two random weights
    @bias = rand            # Random bias
  end

  def forward(inputs)
    sum = inputs[0] * @weights[0] + inputs[1] * @weights[1] + @bias
    Sigmoid.call(sum) # Using Sigmoid activation
  end

  def train(inputs, target, learning_rate = 0.1)
    prediction = forward(inputs)
    error = target - prediction

    # Adjust weights and bias
    @weights[0] += learning_rate * error * inputs[0]
    @weights[1] += learning_rate * error * inputs[1]
    @bias += learning_rate * error
  end
end

# Example Usage
nn = SimpleNeuralNetwork.new
puts nn.forward([1, 0]) # Predict output
nn.train([1, 0], 1)     # Train with expected output 1
puts nn.forward([1, 0]) # Check new prediction
```
Explanation:
- We initialize the network with random weights and bias.
- The forward method computes the weighted sum and applies the Sigmoid activation function.
- The train method adjusts the weights and bias in the direction that reduces the error, scaled by a learning rate (see the training loop below).
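A single update only nudges the prediction, so in practice we train repeatedly over the data. Here is a minimal sketch, assuming the Sigmoid and SimpleNeuralNetwork classes above are loaded, that trains the network on an OR-like truth table (the data and epoch count are illustrative):

```ruby
# Sketch: train the SimpleNeuralNetwork above on OR-like data for many epochs.
# Results vary with the random starting weights.
training_data = [
  [[0, 0], 0],
  [[0, 1], 1],
  [[1, 0], 1],
  [[1, 1], 1]
]

nn = SimpleNeuralNetwork.new

1000.times do
  training_data.each { |inputs, target| nn.train(inputs, target) }
end

training_data.each do |inputs, target|
  puts "#{inputs.inspect} -> #{nn.forward(inputs).round(3)} (target #{target})"
end
```

After enough passes the predictions should move toward the targets, although the exact values depend on the random initialization.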
Simple Sentiment Analysis Neural Network in Ruby
In this section, we'll build a neural network from scratch in Ruby that can analyze the sentiment of text inputs. The network will classify text as negative, neutral, or positive. While this is a simplified implementation, it provides a good foundation for understanding how sentiment analysis and neural networks work.
```ruby
require 'matrix'
require 'set'

class SentimentNeuralNetwork
  def initialize
    @vocabulary = Set.new
    @word_to_index = {}
    @input_size = 0
    @hidden_size = 64
    @output_size = 3 # negative, neutral, positive

    # Pre-trained weights will be initialized after vocabulary building
    @weights1 = nil
    @weights2 = nil
    @bias1 = nil
    @bias2 = nil
  end

  def preprocess_text(text)
    # Convert to lowercase and split into words
    words = text.downcase.gsub(/[^a-z\s]/, '').split

    # Remove common stop words
    stop_words = Set.new(['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by'])
    words.reject! { |word| stop_words.include?(word) }

    words
  end

  def build_vocabulary(training_texts)
    training_texts.each do |text|
      words = preprocess_text(text)
      @vocabulary.merge(words)
    end

    @vocabulary.each_with_index do |word, index|
      @word_to_index[word] = index
    end

    @input_size = @vocabulary.size
    initialize_weights
  end

  def initialize_weights
    # Initialize weights with Xavier/Glorot initialization
    xavier_init1 = Math.sqrt(6.0 / (@input_size + @hidden_size))
    xavier_init2 = Math.sqrt(6.0 / (@hidden_size + @output_size))

    @weights1 = Matrix.build(@input_size, @hidden_size) { rand(-xavier_init1..xavier_init1) }
    @weights2 = Matrix.build(@hidden_size, @output_size) { rand(-xavier_init2..xavier_init2) }
    @bias1 = Matrix.build(1, @hidden_size) { 0.0 }
    @bias2 = Matrix.build(1, @output_size) { 0.0 }
  end

  def text_to_vector(text)
    vector = Array.new(@input_size, 0)
    words = preprocess_text(text)

    words.each do |word|
      if @word_to_index.key?(word)
        vector[@word_to_index[word]] += 1
      end
    end

    # Normalize the vector
    sum = vector.sum.to_f
    sum = 1.0 if sum == 0
    vector.map! { |x| x / sum }

    vector
  end

  def sigmoid(x)
    1.0 / (1.0 + Math.exp(-x))
  end

  def softmax(x)
    exp_x = x.map { |val| Math.exp(val) }
    sum = exp_x.sum
    exp_x.map { |val| val / sum }
  end

  def forward(input_vector)
    input_matrix = Matrix[input_vector]

    # Hidden layer
    hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }

    # Output layer
    output = (hidden * @weights2 + @bias2).to_a[0]
    softmax(output)
  end

  def analyze_sentiment(text)
    # Convert text to vector
    input_vector = text_to_vector(text)

    # Forward pass
    output = forward(input_vector)

    # Get prediction
    sentiment_index = output.index(output.max)
    sentiment = ['negative', 'neutral', 'positive'][sentiment_index]
    confidence = output[sentiment_index]

    {
      sentiment: sentiment,
      confidence: confidence,
      probabilities: {
        negative: output[0],
        neutral: output[1],
        positive: output[2]
      }
    }
  end

  # Pre-train the network with some basic examples
  def pretrain
    training_data = [
      ["I love this, it's amazing!", "positive"],
      ["This is great!", "positive"],
      ["What a wonderful day", "positive"],
      ["I don't like this at all", "negative"],
      ["This is terrible", "negative"],
      ["I hate this", "negative"],
      ["It's okay", "neutral"],
      ["This is fine", "neutral"],
      ["Not bad, not great", "neutral"]
    ]

    build_vocabulary(training_data.map(&:first))

    # Simple training loop (in a real implementation, this would be more sophisticated)
    training_data.each do |text, label|
      input_vector = text_to_vector(text)
      target = case label
               when "negative" then [1, 0, 0]
               when "neutral" then [0, 1, 0]
               when "positive" then [0, 0, 1]
               end

      # Update weights (simplified training)
      output = forward(input_vector)
      error = target.zip(output).map { |t, o| t - o }
      # Backpropagation would go here in a full implementation
    end
  end
end

# Example usage
if __FILE__ == $0
  # Create and train the network
  network = SentimentNeuralNetwork.new
  network.pretrain

  # Test some examples
  test_texts = [
    "I really love this product!",
    "This is the worst experience ever",
    "It's an okay service, nothing special",
    "The quality is amazing",
    "I'm not sure how I feel about this"
  ]

  puts "\nSentiment Analysis Results:\n\n"

  test_texts.each do |text|
    result = network.analyze_sentiment(text)
    puts "Text: #{text}"
    puts "Sentiment: #{result[:sentiment]} (Confidence: #{(result[:confidence] * 100).round(2)}%)"
    puts "Probabilities:"
    result[:probabilities].each do |sentiment, prob|
      puts "  #{sentiment}: #{(prob * 100).round(2)}%"
    end
    puts "\n"
  end
end
```
Breaking Down the Implementation
1. Network Architecture
Our neural network has three layers:
```ruby
def initialize
  @vocabulary = Set.new
  @word_to_index = {}
  @input_size = 0    # Will be set based on vocabulary size
  @hidden_size = 64  # 64 neurons in hidden layer
  @output_size = 3   # 3 possible outputs (negative, neutral, positive)

  # Weights and biases
  @weights1 = nil
  @weights2 = nil
  @bias1 = nil
  @bias2 = nil
end
```
- Input Layer: Size depends on vocabulary size
- Hidden Layer: 64 neurons with sigmoid activation
- Output Layer: 3 neurons with softmax activation
2. Text Preprocessing
The text preprocessing step is crucial for converting raw text into a format our neural network can understand:
```ruby
def preprocess_text(text)
  # Convert to lowercase and split into words
  words = text.downcase.gsub(/[^a-z\s]/, '').split

  # Remove common stop words
  stop_words = Set.new(['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by'])
  words.reject! { |word| stop_words.include?(word) }

  words
end
```
This method:
- Converts text to lowercase
- Removes punctuation using regex
- Splits into words
- Removes common stop words (an example call is shown after this list)
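For example, calling this method on a sample sentence (assuming the class above is loaded) behaves roughly like this; the exact output depends on the stop-word list:

```ruby
network = SentimentNeuralNetwork.new
p network.preprocess_text("I love this product, but the delivery was slow!")
# => ["i", "love", "this", "product", "delivery", "was", "slow"]
```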
3. Vocabulary Building
The vocabulary system converts words into numerical vectors:
```ruby
def build_vocabulary(training_texts)
  training_texts.each do |text|
    words = preprocess_text(text)
    @vocabulary.merge(words)
  end

  @vocabulary.each_with_index do |word, index|
    @word_to_index[word] = index
  end

  @input_size = @vocabulary.size
  initialize_weights
end
```
This creates a mapping between words and indices, which is used to create input vectors.
4. Weight Initialization
We use Xavier/Glorot initialization for better training stability:
```ruby
def initialize_weights
  xavier_init1 = Math.sqrt(6.0 / (@input_size + @hidden_size))
  xavier_init2 = Math.sqrt(6.0 / (@hidden_size + @output_size))

  @weights1 = Matrix.build(@input_size, @hidden_size) { rand(-xavier_init1..xavier_init1) }
  @weights2 = Matrix.build(@hidden_size, @output_size) { rand(-xavier_init2..xavier_init2) }
  @bias1 = Matrix.build(1, @hidden_size) { 0.0 }
  @bias2 = Matrix.build(1, @output_size) { 0.0 }
end
```
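Concretely, each weight is drawn uniformly from the range [-sqrt(6 / (n_in + n_out)), +sqrt(6 / (n_in + n_out))], where n_in and n_out are the sizes of the two layers the weight matrix connects, and the biases start at zero. This keeps the scale of activations roughly constant across layers at the start of training.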
5. Text to Vector Conversion
Converting text input into numerical vectors:
```ruby
def text_to_vector(text)
  vector = Array.new(@input_size, 0)
  words = preprocess_text(text)

  words.each do |word|
    if @word_to_index.key?(word)
      vector[@word_to_index[word]] += 1
    end
  end

  # Normalize the vector
  sum = vector.sum.to_f
  sum = 1.0 if sum == 0
  vector.map! { |x| x / sum }

  vector
end
```
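As a standalone illustration of this bag-of-words encoding (the three-word vocabulary here is hypothetical, not the one the network builds):

```ruby
# Hypothetical vocabulary mapping words to indices
word_to_index = { "love" => 0, "product" => 1, "terrible" => 2 }

vector = Array.new(word_to_index.size, 0)
"love this product love".split.each do |word|
  vector[word_to_index[word]] += 1 if word_to_index.key?(word)
end

# Normalize so the counts sum to 1
sum = vector.sum.to_f
vector.map! { |count| count / sum }
p vector # => [0.666..., 0.333..., 0.0] (approximately)
```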
6. Forward Propagation
The forward pass through the network:
```ruby
def forward(input_vector)
  input_matrix = Matrix[input_vector]

  # Hidden layer
  hidden = (input_matrix * @weights1 + @bias1).map { |x| sigmoid(x) }

  # Output layer
  output = (hidden * @weights2 + @bias2).to_a[0]
  softmax(output)
end
```
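In matrix terms, this computes hidden = sigmoid(x * W1 + b1) and output = softmax(hidden * W2 + b2), where x is the 1 x vocabulary-size input row vector and W1, b1, W2, b2 are the weights and biases initialized earlier.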
7. Sentiment Analysis
The main method for analyzing text sentiment:
```ruby
def analyze_sentiment(text)
  # Convert text to vector
  input_vector = text_to_vector(text)

  # Forward pass
  output = forward(input_vector)

  # Get prediction
  sentiment_index = output.index(output.max)
  sentiment = ['negative', 'neutral', 'positive'][sentiment_index]
  confidence = output[sentiment_index]

  {
    sentiment: sentiment,
    confidence: confidence,
    probabilities: {
      negative: output[0],
      neutral: output[1],
      positive: output[2]
    }
  }
end
```
Using the Network
Here's how to use the sentiment analysis network:
```ruby
# Create and train the network
network = SentimentNeuralNetwork.new
network.pretrain

# Analyze some text
result = network.analyze_sentiment("I really love this product!")
puts result[:sentiment]     # => "positive"
puts result[:confidence]    # => confidence score
puts result[:probabilities] # => Hash of all probabilities
```
Limitations and Possible Improvements
This implementation has several limitations:
- Simple Architecture: The network uses a basic feed-forward architecture. More complex architectures like LSTMs or transformers would perform better.
- Limited Training: The pre-training is very basic. A production system would need:
  - Larger training dataset
  - Proper backpropagation (a minimal sketch follows this list)
  - Cross-validation
  - Learning rate optimization
- Basic Text Processing: The text preprocessing could be improved with:
  - Better tokenization
  - Lemmatization
  - N-gram support
  - Word embeddings
- No Context Understanding: The network doesn't understand context, sarcasm, or complex language patterns.
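To make the "proper backpropagation" point concrete, here is a minimal sketch of a single gradient-descent step for this two-layer architecture, assuming a softmax output with cross-entropy loss and a one-hot target. It reopens the SentimentNeuralNetwork class above and adds a hypothetical train_step method; a real implementation would also batch examples, shuffle the data, and track the loss:

```ruby
# Illustrative extension: one backpropagation / gradient-descent step for the
# two-layer network above (sigmoid hidden layer, softmax output, cross-entropy loss).
# Assumes the SentimentNeuralNetwork class and its helpers are already loaded.
class SentimentNeuralNetwork
  def train_step(input_vector, target, learning_rate = 0.1)
    x = Matrix[input_vector]                                  # 1 x vocabulary_size
    hidden = (x * @weights1 + @bias1).map { |v| sigmoid(v) }  # 1 x hidden_size
    output = softmax((hidden * @weights2 + @bias2).to_a[0])   # Array of 3 probabilities

    # For softmax + cross-entropy, the output-layer error is simply (output - target)
    delta_out = Matrix[output.zip(target).map { |o, t| o - t }]   # 1 x 3

    # Backpropagate through the sigmoid hidden layer: h * (1 - h) is its derivative
    propagated = (delta_out * @weights2.transpose).to_a[0]        # length hidden_size
    delta_hidden = Matrix[
      propagated.each_with_index.map { |d, i| d * hidden[0, i] * (1 - hidden[0, i]) }
    ]

    # Gradient-descent updates for both layers
    @weights2 -= hidden.transpose * delta_out * learning_rate
    @bias2    -= delta_out * learning_rate
    @weights1 -= x.transpose * delta_hidden * learning_rate
    @bias1    -= delta_hidden * learning_rate
  end
end
```

With something like this in place, pretrain could call train_step(input_vector, target) for each example over several epochs instead of only running a forward pass.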
This implementation provides a foundation for understanding how sentiment analysis works with neural networks. While it's not production-ready, it demonstrates the key concepts and can be extended for more sophisticated applications.
Remember that real-world sentiment analysis systems typically use more advanced techniques and pre-trained models, but building a simple version helps understand the fundamentals.
Conclusion
Neural networks are powerful tools for solving complex problems. In this article, we implemented simple neural networks in Ruby, explored different activation functions, and saw a simplified version of how learning happens. While Ruby is not the most common language for deep learning, these implementations provide a fundamental understanding of how neural networks operate under the hood.