
Unsupervised ML
K-Means, hierarchical clustering, DBSCAN, PCA, t-SNE, UMAP, silhouette score, elbow method
1. What is the main difference between supervised and unsupervised learning?
Answer
Supervised learning trains on labeled data to predict a known target value (the label). Unsupervised learning works with unlabeled data and seeks to discover hidden structure or patterns without a predefined target variable: it explores the data to find natural groups, reduce dimensionality, or detect anomalies. K-Means, PCA, and DBSCAN are typical unsupervised algorithms.
2. How does the K-Means algorithm work to partition data?
Answer
K-Means is an iterative algorithm that partitions data into K clusters. It randomly initializes K centroids, then alternates between two steps: assigning each point to its nearest centroid (assignment step) and recomputing each centroid as the mean of the points assigned to it (update step). The algorithm has converged when assignments no longer change; in practice it also stops after a maximum number of iterations if convergence has not been reached.
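The two alternating steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the function name `k_means`, its parameters, and the toy points are invented for the example, and the random initialization is done by sampling K of the data points as starting centroids, which is one common choice among several.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Hypothetical helper: returns (centroids, labels) for a list of points."""
    rng = random.Random(seed)
    # Initialization (assumption): use k distinct data points as centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Convergence check: stop when no assignment changed.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


# Toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = k_means(points, k=2)
print(labels)
print(centroids)
```

On data this well separated, the first three points should share one label and the last three the other; which group gets label 0 depends on the random initialization.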
3. Which method should be used to determine the optimal number of clusters K in K-Means?
Answer
The elbow method plots inertia (the sum of squared distances between each point and its assigned centroid) against K. The point where the curve bends into an elbow suggests a good value of K: beyond it, adding clusters no longer reduces inertia significantly. Because the elbow is not always clear-cut, the method is usually complemented by the silhouette score to validate cluster quality.
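As a rough sketch of the elbow method, the snippet below computes inertia for K = 1 to 5 on a toy dataset of three tight groups. Everything named here (`sq_dist`, `farthest_first_init`, `k_means_inertia`, the data) is an assumption for illustration. To keep the output reproducible, it swaps the usual random initialization for a deterministic farthest-point initialization; this is a different technique from the random start described above, chosen only so the numbers do not vary between runs.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic init (assumption): start at the first point, then keep
    adding the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def k_means_inertia(points, k, max_iter=100):
    """Run K-Means and return the final inertia (sum of squared distances
    from each point to its assigned centroid)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Toy data: three tight groups of four points around hypothetical centers.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
offsets = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# Inertia for each candidate K; the elbow is where the drop flattens out.
inertias = {k: k_means_inertia(points, k) for k in range(1, 6)}
for k, inertia in inertias.items():
    print(f"K={k}: inertia={inertia:.2f}")
```

On this data the inertia falls steeply up to K=3 and changes very little afterwards, which is the elbow. On real data the bend is often less pronounced, which is why a second criterion such as the silhouette score is useful.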
What does the silhouette score measure in the context of clustering?
What is the range of silhouette score values, and how should a score of 0.7 be interpreted?
What major limitation of K-Means makes the algorithm unsuitable for non-spherical cluster shapes?