Cluster analysis is a grouping technique. It rests on the assumption that similarity between objects depends on several variables at once, and it measures how close the objects under study are to one another on those variables. The groups that emerge are homogeneous internally and heterogeneous when compared with one another. Almost anything can be grouped: objects, individuals, products or other entities. The researcher first identifies a set of clustering variables, that is, the variables that play a significant role in classifying the objects into groups. For this reason cluster analysis is also called a classification or grouping technique. It is widely used in the social sciences, particularly psychology and sociology, as well as in management and engineering.
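To make the idea of proximity concrete, here is a minimal sketch in Python. It assumes Euclidean distance as the proximity measure, which the text does not specify, and the four objects and their values on two clustering variables are made up for illustration.

```python
import math

# Hypothetical objects, each measured on two clustering variables.
# The names and values are invented for illustration only.
objects = {
    "A": (1.0, 2.0),
    "B": (1.5, 1.8),
    "C": (8.0, 8.0),
    "D": (8.5, 7.5),
}

def nearest_neighbour(name):
    """Return the other object closest to `name` (Euclidean distance, an assumption)."""
    return min(
        (other for other in objects if other != name),
        key=lambda other: math.dist(objects[name], objects[other]),
    )

# Each object paired with its closest neighbour.
pairs = {name: nearest_neighbour(name) for name in objects}

# Distance inside an apparent group versus across groups.
within = math.dist(objects["A"], objects["B"])
between = math.dist(objects["A"], objects["C"])
```

In this toy data A and B sit close together, as do C and D, while the distance between the two pairs is much larger. That pattern is what homogeneous within groups and heterogeneous between groups means in practice.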
Cluster analysis differs from other data reduction techniques. Like factor analysis, it analyses several variables together. In factor analysis, however, the original correlated variables are reduced to a more manageable number, so the data reduction is carried out on the columns of the data matrix. In cluster analysis the focus is on the rows, which may be individuals, entities, products or any other object of study.
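The row-versus-column distinction can be sketched with a small data matrix. The example below is illustrative only: the data are invented, and Euclidean distance and Pearson correlation are assumed as the usual starting points for the row-wise and column-wise analyses respectively. It performs neither a full cluster analysis nor a full factor analysis.

```python
import math

# Hypothetical data matrix: 4 rows (objects) by 3 columns (variables).
data = [
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 1.5],
    [9.0, 1.0, 7.0],
    [8.0, 2.0, 6.5],
]

def row_distances(matrix):
    """Distance between every pair of rows: the input to a cluster analysis."""
    return [[math.dist(r1, r2) for r2 in matrix] for r1 in matrix]

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def column_correlations(matrix):
    """Correlation between every pair of columns: the input to a factor analysis."""
    columns = list(zip(*matrix))
    return [[pearson(c1, c2) for c2 in columns] for c1 in columns]

distances = row_distances(data)            # 4 x 4, one entry per pair of objects
correlations = column_correlations(data)   # 3 x 3, one entry per pair of variables
```

The row-wise matrix has one entry per pair of objects, and the column-wise matrix has one entry per pair of variables. Cluster analysis works from the former to group objects. Factor analysis works from the latter to combine variables.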
Another technique that can be confused with cluster analysis is discriminant analysis. In discriminant analysis the groups must be defined beforehand: the classification and the rules of similarity are a prerequisite and have to be stated before the analysis begins. In cluster analysis the whole population starts out undifferentiated. The analysis looks for similarity in the responses to the variables, and the grouping emerges as its outcome.
Cluster analysis is used widely across many fields. It is well suited to classification when the objects are described by many variables. Its main use is in market segmentation, where the task is to split the potential customers in a market into distinct groups. It is in segmentation that the output of cluster analysis has proved most informative.
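As an illustration of segmentation, the sketch below splits a handful of customers into two groups. It uses k-means, one common clustering algorithm chosen here as an assumption, since the text does not name a method. The customer data, the two variables (monthly spend and store visits) and the starting centroids are all hypothetical.

```python
import math

# Hypothetical customers: (monthly spend, store visits per month).
customers = [
    (20.0, 2.0), (22.0, 3.0), (25.0, 2.5),     # low spend, few visits
    (80.0, 10.0), (85.0, 12.0), (90.0, 11.0),  # high spend, many visits
]

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            new_centroids.append(mean_point(members) if members else centroids[k])
        if new_centroids == centroids:  # converged, nothing moved
            break
        centroids = new_centroids
    return labels, centroids

# Starting centroids are picked by hand here; real use would try several starts.
labels, centroids = k_means(customers, [customers[0], customers[3]])
```

Here `labels` gives the segment each customer falls into and `centroids` describes the typical customer of each segment. In a real study the variables would normally be standardised first, so that one variable with a large scale does not dominate the distances.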