Unsupervised Learning Flashcards
(20 cards)
What is unsupervised learning?
A type of machine learning where the model learns patterns from data without labeled outputs.
What is clustering?
Grouping similar data points together based on feature similarity.
Which library provides KMeans in Python?
scikit-learn. Import it with: from sklearn.cluster import KMeans
How do you create a KMeans model with 3 clusters?
KMeans(n_clusters=3)
How do you fit and predict cluster labels in one step?
kmeans.fit_predict(X)
How do you access cluster labels after prediction?
Use the output of fit_predict(), or kmeans.labels_
How do you view the coordinates of cluster centers?
kmeans.cluster_centers_
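The last few cards can be tied together in one short sketch. The toy array X and the n_init and random_state settings are assumptions made for illustration; they are not part of the cards.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: three well-separated groups of 2D points.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],
    [9.0, 1.0], [9.2, 0.9], [8.9, 1.2],
])

# n_init and random_state are set explicitly so the run is repeatable.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)   # fit the model and return one label per row

print(labels)                    # cluster index for each point
print(kmeans.labels_)            # the same labels, stored on the fitted model
print(kmeans.cluster_centers_)   # array of shape (3, 2): one centroid per cluster
```

The numeric label given to each group is arbitrary, so it can differ between runs or seeds even when the grouping itself is the same.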
What is inertia_ in KMeans?
The sum of squared distances of samples to their closest cluster center.
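As a rough check of that definition, the sketch below recomputes inertia by hand and compares it with kmeans.inertia_. The data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two loose groups of 2D points.
X = np.array([
    [0.0, 0.0], [0.5, 0.2], [0.1, 0.6],
    [4.0, 4.0], [4.4, 3.8], [3.9, 4.5],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Look up the centroid assigned to each sample, then sum the squared distances.
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]
manual_inertia = np.sum((X - assigned_centers) ** 2)

print(manual_inertia, kmeans.inertia_)  # the two values should agree up to rounding
```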
What is the Elbow Method?
A technique to choose the number of clusters by plotting inertia vs. number of clusters and picking the k at the bend (the "elbow"), beyond which adding clusters reduces inertia only slightly.
How do you scale features before clustering?
Use MinMaxScaler or StandardScaler from sklearn.preprocessing
How do you plot an elbow curve?
Loop over candidate k values, fit a KMeans model for each, store km.inertia_ (the SSE), then plot SSE vs. k using matplotlib.
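A minimal sketch of that loop, assuming synthetic data with three blobs and a k range of 1 to 7 (both chosen only for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a plot window
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: three Gaussian blobs around made-up centers.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 6.0], [0.0, 6.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

k_values = list(range(1, 8))
sse = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(X)
    sse.append(km.inertia_)  # sum of squared distances for this k

plt.figure()
plt.plot(k_values, sse, marker="o")
plt.xlabel("Number of clusters (k)")
plt.ylabel("Inertia (SSE)")
plt.title("Elbow curve")
# Call plt.show() in an interactive session, or plt.savefig(...) to write a file.
```

With data like this the curve should drop steeply up to k=3 and flatten afterwards; on real data the bend is often less obvious and needs judgment.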
Why do we use feature scaling before clustering?
To ensure all features contribute equally to distance calculations.
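To see the effect of scaling, the sketch below applies both scalers to a small made-up table in which one column (income) is thousands of times larger than the other (age). The column meanings are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical data: column 0 is age in years, column 1 is income in dollars.
X = np.array([
    [25, 40_000],
    [35, 60_000],
    [45, 80_000],
    [55, 100_000],
], dtype=float)

X_minmax = MinMaxScaler().fit_transform(X)      # each column rescaled to [0, 1]
X_standard = StandardScaler().fit_transform(X)  # each column: mean 0, std 1

print(X_minmax)
print(X_standard.mean(axis=0), X_standard.std(axis=0))
```

Without scaling, the Euclidean distances KMeans relies on would be dominated almost entirely by the income column.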
How do you visualize clusters in 2D using matplotlib?
Use plt.scatter() for each cluster and plot centroids with 'x' markers.
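One way this could look, assuming two synthetic 2D blobs and arbitrary styling choices (marker size, colors, labels):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a plot window
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: two Gaussian blobs in 2D.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.6, size=(25, 2)) for c in centers])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

plt.figure()
# One scatter call per cluster so each gets its own color and legend entry.
for cluster_id in np.unique(labels):
    points = X[labels == cluster_id]
    plt.scatter(points[:, 0], points[:, 1], label=f"cluster {cluster_id}")

# Centroids drawn on top with 'x' markers.
centroids = kmeans.cluster_centers_
plt.scatter(centroids[:, 0], centroids[:, 1], marker="x", color="black", s=100, label="centroids")
plt.legend()
# Call plt.show() in an interactive session, or plt.savefig(...) to write a file.
```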
How do you assign cluster labels back to the DataFrame?
df['cluster'] = kmeans.fit_predict(X)
How do you print the unique cluster labels in a DataFrame?
df['cluster'].unique()
How do you check the number of records in each cluster?
df['cluster'].value_counts()
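The three DataFrame cards above can be sketched together. The column names x and y and the values are invented for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical DataFrame with two numeric feature columns.
df = pd.DataFrame({
    "x": [1.0, 1.1, 0.9, 8.0, 8.2, 7.9, 8.1],
    "y": [1.0, 0.9, 1.2, 8.0, 7.9, 8.1, 8.3],
})

# Select the features before adding the label column,
# so the labels are never fed back in as a feature.
X = df[["x", "y"]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
df["cluster"] = kmeans.fit_predict(X)

print(df["cluster"].unique())        # the distinct cluster labels
print(df["cluster"].value_counts())  # number of rows in each cluster
```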
How do you train a clustering model?
Create the KMeans model, then call fit() or fit_predict() on the data.
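When fit() is used instead of fit_predict(), the fitted model can label points it has not seen via predict(). A small sketch, with made-up training and new points:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical training data: two tight groups.
X_train = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [4.0, 4.0], [4.1, 3.9], [3.9, 4.2],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X_train)               # training labels are now in kmeans.labels_

# Hypothetical new points: each is assigned to its nearest learned centroid.
X_new = np.array([[0.1, 0.1], [4.0, 4.1]])
new_labels = kmeans.predict(X_new)

print(kmeans.labels_)
print(new_labels)
```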
How do you remember how to train KMeans?
Think 'Create → Fit → Predict' (C → F → P) using fit_predict()
How do you create an elbow plot to choose optimal k?
Loop through k-values, fit model, record inertia, then plot.