SOM Algorithm and Its Variants

Self-Organizing Maps (SOMs) are a type of unsupervised learning algorithm used for dimensionality reduction and data visualization. SOMs transform complex, high-dimensional data into a simpler, typically two-dimensional grid while preserving the relationships between data points. SOMs are a type of artificial neural network developed by Teuvo Kohonen in the 1980s, which is why they are sometimes referred to as Kohonen maps.


How SOM Works

  1. Input Layer and Weight Initialization:

    • Each neuron in the SOM is associated with a weight vector of the same dimension as the input data.
    • These weights are initialized randomly or using heuristics.
  2. Finding the Best Matching Unit (BMU):

    • For each input vector, the SOM calculates the similarity (e.g., Euclidean distance) between the input vector and each neuron's weight vector.
    • The neuron with the closest weight vector is called the Best Matching Unit (BMU).
  3. Updating Weights:

    • The weights of the BMU and its neighboring neurons are updated to make them closer to the input vector.
    • This is done using the formula:
      W(t+1) = W(t) + α(t) × (X − W(t))
      Where:
    • W(t): Weight vector at iteration t
    • α(t): Learning rate at iteration t
    • X: Input vector
    • t: Iteration count
  4. Neighborhood Function:

    • Neurons closer to the BMU in the grid receive a greater update, while those farther away receive a smaller update.
    • The influence of the neighborhood decreases over time.
  5. Iterative Process:

    • The algorithm iteratively updates weights for all input vectors, leading to the organization of the map.
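
The loop below is a minimal NumPy sketch of this procedure. The specific choices here (random initialization, Euclidean BMU search, a Gaussian neighborhood, and exponentially decaying learning rate and radius) are illustrative assumptions, not the only options:

```python
import numpy as np

def train_som(data, grid_h=10, grid_w=10, n_iters=1000,
              lr0=0.5, sigma0=3.0, seed=0):
    """Minimal SOM training loop: BMU search + neighborhood update."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    # Step 1: random weight initialization, one vector per grid neuron.
    weights = rng.random((grid_h, grid_w, dim))
    # Grid coordinates, used to measure distances on the map itself.
    ys, xs = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    for t in range(n_iters):
        x = data[rng.integers(len(data))]  # pick a random input vector
        # Step 2: BMU = neuron whose weight vector is closest to x.
        dists = np.linalg.norm(weights - x, axis=2)
        by, bx = np.unravel_index(np.argmin(dists), dists.shape)
        # Step 4: learning rate and neighborhood radius decay over time.
        lr = lr0 * np.exp(-t / n_iters)
        sigma = sigma0 * np.exp(-t / n_iters)
        # Gaussian neighborhood: nearby grid neurons get larger updates.
        h = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
        # Step 3: pull the BMU and its neighbors toward the input.
        weights += lr * h[:, :, None] * (x - weights)
    return weights
```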

Example: Clustering of Colors

Imagine we want to group a set of RGB colors using a 2D SOM.

  1. Input Data: RGB color vectors (e.g., [255, 0, 0] for red, [0, 255, 0] for green, [0, 0, 255] for blue).

  2. SOM Grid: A 10x10 grid of neurons is initialized. Each neuron has a weight vector corresponding to RGB values.

  3. Training:

    • For each color in the dataset, the SOM finds the BMU based on the closest RGB match.
    • The BMU and its neighbors in the grid are updated to resemble the input color.
  4. Result:

    • After several iterations, neurons in one part of the grid represent red shades, another part represents green shades, and yet another represents blue shades. Intermediate shades (like yellow or purple) form regions between these.

Applications of SOMs

  1. Data Visualization: Reduce high-dimensional data to a 2D or 3D representation.
  2. Clustering: Group similar data points without prior labeling.
  3. Anomaly Detection: Identify unusual data patterns in network security or fraud detection.
  4. Feature Extraction: Find patterns or structures in image and audio data.

Key Advantages

  • Intuitive representation of data relationships.
  • Useful for non-linear dimensionality reduction.
  • Helps in exploratory data analysis.

Visual Representation

For the above color clustering example, the final grid would look like a smooth gradient of colors, with similar colors grouped closer together. This visually demonstrates the SOM's ability to organize and map input data onto a lower-dimensional space.
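
Using the train_som sketch above, the following snippet trains a 10x10 grid on a handful of RGB colors (the eight-color dataset and matplotlib display are illustrative assumptions) and renders the learned weights directly as an image, since each neuron's weight vector is itself an RGB value:

```python
import numpy as np
import matplotlib.pyplot as plt

# Eight basic colors as RGB vectors scaled to [0, 1]
# (e.g., [1, 0, 0] is red, [0, 1, 0] is green, [0, 0, 1] is blue).
colors = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1],   # red, green, blue
    [1, 1, 0], [1, 0, 1], [0, 1, 1],   # yellow, magenta, cyan
    [0, 0, 0], [1, 1, 1],              # black, white
], dtype=float)

weights = train_som(colors, grid_h=10, grid_w=10, n_iters=2000)

# Each weight vector is an RGB triple in [0, 1], so the trained grid
# can be displayed directly; similar colors end up in nearby cells.
plt.imshow(weights)
plt.axis("off")
plt.title("10x10 SOM trained on RGB colors")
plt.show()
```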


Example: A Kohonen self-organizing map is used to cluster four vectors. Let the vectors to be clustered be (1, 1, 0, 0), (0, 0, 0, 1), (1, 0, 0, 0), and (0, 0, 1, 1). The maximum number of clusters to be formed is m = 2. Suppose the learning rate is α = 0.5, and the neighborhood of node J is set so that only one cluster updates its weights at each step (R = 0). The initial weight matrix is [0.2 0.4 0.6 0.8] for the first cluster and [0.9 0.7 0.5 0.3] for the second.

Solving the Problem Using Kohonen Self-Organizing Map (SOM)

Given:

  • Input vectors: (1, 1, 0, 0), (0, 0, 0, 1), (1, 0, 0, 0), (0, 0, 1, 1)
  • Clusters (m): 2
  • Learning rate (α): 0.5
  • Neighborhood radius (R): 0 (Only the Best Matching Unit (BMU) updates).
  • Initial weight matrix:
    • Cluster 1: [0.2, 0.4, 0.6, 0.8]
    • Cluster 2: [0.9, 0.7, 0.5, 0.3]

Steps for Each Iteration:

  1. Compute the Euclidean distance between each input vector and the weight vectors of the clusters to find the Best Matching Unit (BMU).

  2. Update the BMU's weights using the formula:
    New weight = Old weight + Learning rate × (Input vector − Old weight)

  3. Repeat for all input vectors, updating weights at each step.


Iteration-by-Iteration Solution

Initial Weights:

Cluster 1: [0.2, 0.4, 0.6, 0.8]
Cluster 2: [0.9, 0.7, 0.5, 0.3]


Step 1: Input Vector (1, 1, 0, 0)

  1. Compute Euclidean distances:
    For Cluster 1:
    Distance = √((1 − 0.2)² + (1 − 0.4)² + (0 − 0.6)² + (0 − 0.8)²) = √(0.64 + 0.36 + 0.36 + 0.64) = √2.0

    For Cluster 2:
    Distance = √((1 − 0.9)² + (1 − 0.7)² + (0 − 0.5)² + (0 − 0.3)²) = √(0.01 + 0.09 + 0.25 + 0.09) = √0.44

    BMU: Cluster 2 (as its distance is smaller).

  2. Update weights for Cluster 2:
    New weight = Old weight + 0.5 × (Input vector − Old weight)
    For Cluster 2: [0.9, 0.7, 0.5, 0.3] + 0.5 × ([1, 1, 0, 0] − [0.9, 0.7, 0.5, 0.3])
    = [0.9, 0.7, 0.5, 0.3] + [0.05, 0.15, −0.25, −0.15]
    = [0.95, 0.85, 0.25, 0.15]

    Updated weights:
    Cluster 1: [0.2, 0.4, 0.6, 0.8]
    Cluster 2: [0.95, 0.85, 0.25, 0.15]


Step 2: Input Vector (0, 0, 0, 1)

  1. Compute Euclidean distances:
    For Cluster 1:
    Distance = √((0 − 0.2)² + (0 − 0.4)² + (0 − 0.6)² + (1 − 0.8)²) = √(0.04 + 0.16 + 0.36 + 0.04) = √0.6

    For Cluster 2:
    Distance = √((0 − 0.95)² + (0 − 0.85)² + (0 − 0.25)² + (1 − 0.15)²) = √(0.9025 + 0.7225 + 0.0625 + 0.7225) = √2.41

    BMU: Cluster 1 (as its distance is smaller).

  2. Update weights for Cluster 1:
    New weight = Old weight + 0.5 × (Input vector − Old weight)
    For Cluster 1: [0.2, 0.4, 0.6, 0.8] + 0.5 × ([0, 0, 0, 1] − [0.2, 0.4, 0.6, 0.8])
    = [0.2, 0.4, 0.6, 0.8] + [−0.1, −0.2, −0.3, 0.1]
    = [0.1, 0.2, 0.3, 0.9]

    Updated weights:
    Cluster 1: [0.1, 0.2, 0.3, 0.9]
    Cluster 2: [0.95, 0.85, 0.25, 0.15]


Step 3: Input Vector (1, 0, 0, 0)

  1. Compute Euclidean distances:
    For Cluster 1:
    Distance = √((1 − 0.1)² + (0 − 0.2)² + (0 − 0.3)² + (0 − 0.9)²) = √(0.81 + 0.04 + 0.09 + 0.81) = √1.75

    For Cluster 2:
    Distance = √((1 − 0.95)² + (0 − 0.85)² + (0 − 0.25)² + (0 − 0.15)²) = √(0.0025 + 0.7225 + 0.0625 + 0.0225) = √0.81

    BMU: Cluster 2 (as its distance is smaller).

  2. Update weights for Cluster 2:
    New weight = Old weight + 0.5 × (Input vector − Old weight)
    For Cluster 2: [0.95, 0.85, 0.25, 0.15] + 0.5 × ([1, 0, 0, 0] − [0.95, 0.85, 0.25, 0.15])
    = [0.95, 0.85, 0.25, 0.15] + [0.025, −0.425, −0.125, −0.075]
    = [0.975, 0.425, 0.125, 0.075]

    Updated weights:
    Cluster 1: [0.1, 0.2, 0.3, 0.9]
    Cluster 2: [0.975, 0.425, 0.125, 0.075]


Step 4: Input Vector (0, 0, 1, 1)

  1. Compute Euclidean distances:
    For Cluster 1:
    Distance = √((0 − 0.1)² + (0 − 0.2)² + (1 − 0.3)² + (1 − 0.9)²) = √(0.01 + 0.04 + 0.49 + 0.01) = √0.55

    For Cluster 2:
    Distance = √((0 − 0.975)² + (0 − 0.425)² + (1 − 0.125)² + (1 − 0.075)²) = √(0.950625 + 0.180625 + 0.765625 + 0.855625) = √2.7525

    BMU: Cluster 1 (as its distance is smaller).

  2. Update weights for Cluster 1:
    New weight = Old weight + 0.5 × (Input vector − Old weight)
    For Cluster 1: [0.1, 0.2, 0.3, 0.9] + 0.5 × ([0, 0, 1, 1] − [0.1, 0.2, 0.3, 0.9])
    = [0.1, 0.2, 0.3, 0.9] + [−0.05, −0.1, 0.35, 0.05]
    = [0.05, 0.1, 0.65, 0.95]

    Final weights:
    Cluster 1: [0.05, 0.1, 0.65, 0.95]
    Cluster 2: [0.975, 0.425, 0.125, 0.075]


Result

After training, the final cluster weights are:

  • Cluster 1: [0.05, 0.1, 0.65, 0.95]
  • Cluster 2: [0.975, 0.425, 0.125, 0.075]

These weight vectors represent the centroids of the two clusters: Cluster 1 has moved toward the inputs (0, 0, 0, 1) and (0, 0, 1, 1), while Cluster 2 has moved toward (1, 1, 0, 0) and (1, 0, 0, 0).
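
These four updates can be reproduced with a few lines of NumPy (a sketch of the same update rule with R = 0, so only the BMU's weights change at each step):

```python
import numpy as np

# Input vectors and initial weights from the problem statement.
inputs = np.array([[1, 1, 0, 0],
                   [0, 0, 0, 1],
                   [1, 0, 0, 0],
                   [0, 0, 1, 1]], dtype=float)
weights = np.array([[0.2, 0.4, 0.6, 0.8],    # Cluster 1
                    [0.9, 0.7, 0.5, 0.3]])   # Cluster 2
alpha = 0.5  # learning rate

for x in inputs:
    # BMU: the cluster whose weight vector is closest to x.
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
    # R = 0: only the BMU's weights move toward the input.
    weights[bmu] += alpha * (x - weights[bmu])
    print(f"input {x} -> BMU: Cluster {bmu + 1}")

print(weights)  # matches the final weights derived above
```

Running this prints the same BMU choices (Cluster 2, 1, 2, 1) and the same final weight vectors as the step-by-step derivation.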
