In unsupervised learning, clustering is the process of grouping unlabelled data points so that similar points end up in the same cluster. The goal of clustering is to identify patterns and relationships in the data without any labels or other prior knowledge of the data’s meaning.
Formal definition
Given: a collection of input vectors; no target outputs are given.
Task: group the input examples into a finite number of clusters so that the examples:
- From the same cluster are similar to each other
- From different clusters are dissimilar to each other
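To make the definition concrete, here is a minimal sketch of one common clustering algorithm, k-means (Lloyd's algorithm). The definition above does not prescribe any particular method, so treat this only as an illustration. It assumes that "similar" means "close in Euclidean distance" and that the number of clusters `k` is chosen in advance. The function name `kmeans`, the evenly spaced initialisation, and the sample points are all hypothetical choices made for this example.

```python
import math

def kmeans(points, k, n_iter=100):
    """Group points into k clusters; returns (labels, centroids)."""
    n = len(points)
    # Assumption: start from k evenly spaced input points (simple, deterministic).
    centroids = [tuple(points[i * n // k]) for i in range(k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Six unlabelled 2-D input vectors forming two visibly separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No target outputs are passed in: the labels come only from the distances between the input vectors. Points that share a label lie close together, and points with different labels lie far apart, which is the pair of conditions in the task above. Other similarity measures or algorithms would fit the same definition.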