Artificial Neural Networks

"Artificial neural networks (ANNs) are inspired by the structure and function of the human brain. They consist of interconnected layers of artificial neurons, which process information in a similar way to biological neurons. Unlike traditional programming, ANNs learn through training on large datasets. By adjusting the connections between these artificial neurons, they can identify complex patterns and relationships within the data. This makes them powerful tools for tasks like image recognition, speech translation, and even creative text generation."- Gemini 2024

With artificial neural networks (ANNs), we aim to mimic how we believe the human brain works: a network of nerve cells (neurons) connected by synapses that carry information in the form of electrical signals. Computers offer a loose analogy at the lowest level of hardware, where many simple units (think boolean logic gates) are wired together to compute more complex functions.
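The comparison between neurons and logic gates can be made concrete with a small sketch. An artificial neuron computes a weighted sum of its inputs plus a bias and passes the result through an activation function. With suitable weights, a single neuron with a threshold activation behaves like an AND or OR gate. The function names and the hand-picked weights below are illustrative assumptions; they are not learned and do not come from TensorFlow.

```python
def step(z):
    """Threshold activation: the neuron 'fires' (1) when its input is non-negative."""
    return 1 if z >= 0 else 0


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(z)


# Hand-picked (not learned) parameters, chosen for illustration only.
AND_WEIGHTS, AND_BIAS = (1.0, 1.0), -1.5
OR_WEIGHTS, OR_BIAS = (1.0, 1.0), -0.5

for a in (0, 1):
    for b in (0, 1):
        and_out = neuron((a, b), AND_WEIGHTS, AND_BIAS)
        or_out = neuron((a, b), OR_WEIGHTS, OR_BIAS)
        print(f"a={a} b={b}  AND={and_out}  OR={or_out}")
```

Training a network amounts to finding such weights and biases automatically from data instead of choosing them by hand.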

The diagram of this ANN is a screenshot from the TensorFlow Playground.

Example Implementation - Image Classification in TensorFlow
Concepts reviewed in implementation
import tensorflow as tf

# load data as train/test sets of sizes [60,000, 10,000]
# x values are input images, y values are output labels
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# normalize pixel values to [0, 1] range to improve learning
x_train, x_test = x_train/255.0, x_test/255.0

# Flatten images to 1D vectors for neural network input layer
x_train = x_train.reshape(len(x_train), 28 * 28)
x_test = x_test.reshape(len(x_test), 28 * 28)

# Convert target labels to one-hot encoded vectors
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Simple sequential model for multi-class classification
# - hidden layer of 128 neurons with ReLU activation
# - output layer of 10 neurons (10 digits) with softmax activation
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile model using CategoricalCrossentropy loss
# Use stochastic gradient descent (SGD) optimizer
model.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    metrics=['accuracy'],
)

# train the model (the test set doubles as validation data here)
model.fit(
    x_train, y_train,
    epochs=10, batch_size=32,
    validation_data=(x_test, y_test),
)

# evaluate the model on the held-out test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test loss:', test_loss)
print('Test accuracy:', test_acc)
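Three ideas from the listing above (one-hot labels, the softmax output layer, and the categorical cross-entropy loss) can be worked through by hand. The following is a minimal pure-Python sketch of the underlying arithmetic, not how Keras implements it. The helper names and the example scores are assumptions made for illustration.

```python
import math


def one_hot(label, num_classes):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(target, predicted, eps=1e-12):
    """Negative log-probability that the prediction assigns to the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


# Hypothetical raw scores for a 3-class problem (not taken from MNIST).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

print("probabilities:", [round(p, 3) for p in probs])
print("loss if class 0 is correct:",
      round(categorical_cross_entropy(one_hot(0, 3), probs), 3))
print("loss if class 2 is correct:",
      round(categorical_cross_entropy(one_hot(2, 3), probs), 3))
```

The loss is small when the network puts most of its probability on the correct class and large when it does not. The optimizer lowers this quantity step by step during training.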
    
Further improvements
  • experiment with different hyperparameters (learning rate, epochs, hidden layer sizes)
  • add regularization techniques like dropout to prevent overfitting
  • use more complex model architectures like convolutional neural networks
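To give a feel for the dropout suggestion, here is a minimal pure-Python sketch of what a dropout step does to a layer's outputs. During training, each value is zeroed with probability `rate` and the survivors are scaled up so the expected sum stays the same. At inference time the values pass through unchanged. The function and the sample activations are assumptions made for illustration. In the Keras model above, the corresponding building block would be a `tf.keras.layers.Dropout` layer placed between the two `Dense` layers.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`, scale the rest."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so runs are repeatable
hidden = [0.5, 1.2, 0.0, 2.0, 0.7, 1.1]  # made-up hidden-layer outputs

print("training :", dropout(hidden, rate=0.5, rng=rng))
print("inference:", dropout(hidden, rate=0.5, rng=rng, training=False))
```

Because a different random subset of neurons is silenced on every training step, the network cannot rely too heavily on any single neuron, which tends to reduce overfitting.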