truthxify

Phase 2 — Classical ML

May 26, 2026

Start Course 3 of the Machine Learning Specialization

What I Did

Implemented a k-means algorithm from scratch and applied it to image compression

What I Learned

Clustering looks at a dataset and tries to find points that are similar to each other and group them together

This can be applied in many places, such as astronomy (grouping astronomical objects together), market segmentation, and analyzing DNA microarray data

The way it works is that we pick K points at random and call them centroids, then assign each training example to its closest cluster centroid. Then we move each centroid to the average of the points assigned to it. We repeat these two steps iteratively until the assignments stop changing (convergence). This finds a local optimum, which is not necessarily the best possible clustering
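The loop described above can be sketched in plain Python. This is a minimal illustration, assuming points are tuples of floats; the function names (`assign_points`, `move_centroids`, `kmeans`) and the rule for empty clusters are assumptions made for this sketch, not the course's code.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_points(points, centroids):
    """Assignment step: index of the closest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def move_centroids(points, assignments, centroids):
    """Move step: each centroid becomes the mean of its assigned points."""
    moved = []
    for k, old in enumerate(centroids):
        members = [p for p, c in zip(points, assignments) if c == k]
        if members:
            moved.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            # Assumption: a centroid with no points keeps its old position.
            moved.append(old)
    return moved


def kmeans(points, initial_centroids, max_iters=100):
    """Alternate the two steps until the assignments stop changing."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iters):
        new_assignments = assign_points(points, centroids)
        if new_assignments == assignments:
            break  # converged: nothing moved between clusters
        assignments = new_assignments
        centroids = move_centroids(points, assignments, centroids)
    return centroids, assignments


points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
initial = random.Random(0).sample(points, 2)  # random starting centroids
centroids, assignments = kmeans(points, initial)
print(centroids, assignments)
```

Passing the initial centroids in as an argument keeps the loop itself deterministic, which makes it easier to test separately from the random initialization.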

The optimization objective is to minimize a specific cost function called the distortion function:

$$J(c^{(1)}, \dots, c^{(m)}, \mu_1, \dots, \mu_K) = \frac{1}{m}\sum_{i=1}^{m}\left\| x^{(i)} - \mu_{c^{(i)}} \right\|^2$$

The distortion function is the average squared distance from each training example to the centroid it has been assigned to
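As a rough sketch, the distortion can be computed directly from its definition. The function name `distortion` and the tiny dataset are made up for illustration; `assignments[i]` plays the role of $c^{(i)}$.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(points, assignments, centroids):
    """Average squared distance from each point to its assigned centroid."""
    total = sum(
        squared_distance(p, centroids[c]) for p, c in zip(points, assignments)
    )
    return total / len(points)


# Hypothetical toy data: two points near the first centroid, one on the second.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
centroids = [(1.0, 0.0), (10.0, 0.0)]
assignments = [0, 0, 1]

# Squared distances are 1, 1 and 0, so J is (1 + 1 + 0) / 3.
print(distortion(points, assignments, centroids))
```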

Each iteration of k-means lowers this cost in two steps:

  • The assignment step minimizes $J$ over $c^{(i)}$ with the $\mu_k$ held fixed. Assigning each point to its closest centroid is the best choice
  • The move step minimizes $J$ over $\mu_k$ with the $c^{(i)}$ held fixed. The mean is the point that minimizes the sum of squared distances to a set of points
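
The claim about the mean can be checked with a short derivation for a single cluster. Here $S_k$ is notation introduced only for this sketch: the set of indices $i$ with $c^{(i)} = k$. Setting the gradient with respect to $\mu_k$ to zero gives:

$$\frac{\partial}{\partial \mu_k} \sum_{i \in S_k} \left\| x^{(i)} - \mu_k \right\|^2 = -2 \sum_{i \in S_k} \left( x^{(i)} - \mu_k \right) = 0 \quad\Longrightarrow\quad \mu_k = \frac{1}{|S_k|} \sum_{i \in S_k} x^{(i)}$$

The $\frac{1}{m}$ factor in $J$ does not change where the minimum is, so it is left out here.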

$J$ should always go down or stay flat (converged); it should never increase. If $J$ ever goes up, that means there is a bug in the implementation
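One way to use this as a sanity check is to record $J$ after every iteration and verify that the sequence never rises. The sketch below is an assumption about how such a check could look, with illustrative names (`kmeans_cost_history`, `is_non_increasing`) and a small tolerance for floating-point noise.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(points, assignments, centroids):
    total = sum(squared_distance(p, centroids[c]) for p, c in zip(points, assignments))
    return total / len(points)


def assign_points(points, centroids):
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def move_centroids(points, assignments, centroids):
    moved = []
    for k, old in enumerate(centroids):
        members = [p for p, c in zip(points, assignments) if c == k]
        # Assumption: an empty cluster keeps its old centroid.
        moved.append(
            tuple(sum(col) / len(members) for col in zip(*members)) if members else old
        )
    return moved


def kmeans_cost_history(points, centroids, iters=10):
    """Run a fixed number of iterations and record J after each one."""
    history = []
    for _ in range(iters):
        assignments = assign_points(points, centroids)
        centroids = move_centroids(points, assignments, centroids)
        history.append(distortion(points, assignments, centroids))
    return history


def is_non_increasing(values, tol=1e-12):
    """True if no value is larger than the one before it (up to tol)."""
    return all(later <= earlier + tol for earlier, later in zip(values, values[1:]))


# Hypothetical 1-D data with a deliberately poor starting point.
points = [(0.0,), (1.0,), (2.0,), (9.0,), (10.0,)]
history = kmeans_cost_history(points, [(0.0,), (1.0,)])
print(history)
assert is_non_increasing(history), "J went up: likely a bug in one of the steps"
```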

The way we initialize the centroids is:

  • Choose $K < m$, where $m$ is the number of training examples
  • Randomly pick $K$ distinct training examples
  • Set $\mu_1, \mu_2, \dots, \mu_K$ equal to those $K$ examples
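
The initialization steps above can be sketched with the standard library's `random.sample`, which draws without replacement. The helper name `init_centroids` is illustrative. Note that sampling distinct indices only guarantees distinct examples, so duplicated rows in the data could still yield identical centroids.

```python
import random


def init_centroids(points, k, rng=None):
    """Pick k distinct training examples to serve as the initial centroids."""
    if not 0 < k < len(points):
        raise ValueError("K must satisfy 0 < K < m")
    if rng is None:
        rng = random.Random()
    indices = rng.sample(range(len(points)), k)  # k distinct positions
    return [points[i] for i in indices]


# Hypothetical toy dataset with m = 5 examples.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0), (9.0, 1.0)]
print(init_centroids(points, 3, random.Random(42)))
```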

We run this multiple times, about 50-1000 (anything past that gives diminishing returns), compute $J$ for each run, and pick the run with the smallest $J$
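
A minimal sketch of the restart idea, assuming a compact k-means run that returns its final cost. The names `run_kmeans` and `best_of_restarts`, the default of 50 restarts, and the fixed seed are choices made for this example only.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def run_kmeans(points, initial_centroids, max_iters=100):
    """One k-means run; returns (centroids, assignments, J)."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iters):
        new_assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break
        assignments = new_assignments
        for k in range(len(centroids)):
            members = [p for p, c in zip(points, assignments) if c == k]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    cost = sum(
        squared_distance(p, centroids[c]) for p, c in zip(points, assignments)
    ) / len(points)
    return centroids, assignments, cost


def best_of_restarts(points, k, n_restarts=50, seed=0):
    """Run k-means from several random initializations; keep the lowest J."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        result = run_kmeans(points, rng.sample(points, k))
        if best is None or result[2] < best[2]:
            best = result
    return best


# Hypothetical 1-D data with two obvious groups.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
centroids, assignments, cost = best_of_restarts(points, k=2)
print(centroids, cost)
```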

The number of clusters is up to us: we pick the value of $K$ that best serves the purpose of what we want to do

Bugs & Blockers

N/A

Concepts That Need More Time

How k-means can be used for image compression. It seemed a bit odd to me at first, but I figured it out. I'm currently thinking about more things it can be applied to and would need some time to find a natural fit for it.
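
For reference, here is a rough sketch of how the compression idea can work, assuming the usual color-quantization approach: cluster the pixel colors, store the K centroid colors as a palette, and store only a palette index per pixel. The function names and the four-pixel "image" are invented for illustration and are not the notebook's code.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_palette(pixels, initial_palette, max_iters=50):
    """k-means over pixel colors: returns (palette, palette index per pixel)."""
    palette = list(initial_palette)
    indices = None
    for _ in range(max_iters):
        new_indices = [
            min(range(len(palette)), key=lambda k: squared_distance(px, palette[k]))
            for px in pixels
        ]
        if new_indices == indices:
            break
        indices = new_indices
        for k in range(len(palette)):
            members = [px for px, i in zip(pixels, indices) if i == k]
            if members:  # assumption: an unused color keeps its old value
                palette[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return palette, indices


def compress(pixels, initial_palette):
    """Return a small palette of integer colors plus one index per pixel."""
    palette, indices = fit_palette(pixels, initial_palette)
    palette = [tuple(round(ch) for ch in color) for color in palette]
    return palette, indices


def decompress(palette, indices):
    """Rebuild an approximate image by looking up each pixel's palette color."""
    return [palette[i] for i in indices]


# Toy "image": a flat list of RGB pixels (two reddish, two bluish).
pixels = [(250, 10, 10), (240, 20, 20), (10, 10, 250), (20, 20, 240)]
palette, indices = compress(pixels, initial_palette=[pixels[0], pixels[2]])
restored = decompress(palette, indices)
print(palette, indices, restored)
```

The saving comes from the index: with, say, K = 16 colors, each pixel needs only 4 bits for its palette index instead of 24 bits of RGB, plus the small fixed cost of storing the palette.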

Tomorrow

Continue Course 3 of the Machine Learning Specialization

Wins

Implemented the K-Means algorithm and applied it to image compression (Google Colab)