Tuesday, September 4, 2012

Day 2 - Team H


On Day 2 of the Business Analysts class, we were taught the purpose of cluster analysis, how to use SPSS to perform it, and how to interpret the SPSS output.


In a nutshell:

You start out with a number of cases and want to subdivide them into homogeneous groups. First, you choose the variables on which you want the groups to be similar.
Next, you must decide whether to standardize the variables in some way so that they all contribute equally to the distance or similarity between cases. Finally, you have to decide which clustering procedure to use, based on the number of cases and types of variables that you want to use for forming clusters.
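To make the standardization step concrete, here is a minimal Python sketch (not SPSS syntax). It converts each variable to z-scores so that a variable measured in large units, such as income, does not dominate the distance. The customer data and the `standardize` helper are made up for illustration, and using the sample standard deviation is an assumption; other standardizations (for example, rescaling to a 0 to 1 range) are also possible.

```python
from statistics import mean, stdev

def standardize(rows):
    """Convert each column to z-scores: (value - column mean) / column std dev."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stdevs = [stdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

# Hypothetical customers: (age in years, annual income in dollars)
customers = [(25, 30000), (35, 60000), (45, 90000)]
z = standardize(customers)
for row in z:
    print(row)
```

After standardizing, age and income are on the same scale, so both contribute equally to the distance between two customers.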
For hierarchical clustering, you choose a statistic that quantifies how far apart (or similar) two cases are. Then you select a method for forming the groups. Because you can have as many clusters as you do cases (not a useful solution!), your last step is to determine how many clusters you need to represent your data. You do this by looking at how similar clusters are when you create additional clusters or collapse existing ones.
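The hierarchical idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Euclidean distance as the statistic and single linkage (nearest neighbor) as the method for forming groups, which are just one of several possible choices, and the four sample points and the function names are invented for the example.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(c1, c2, points):
    """Distance between two clusters = distance between their closest members."""
    return min(euclidean(points[i], points[j]) for i in c1 for j in c2)

def agglomerate(points):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per case
    history = []  # (distance at which the merge happened, clusters after the merge)
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, a, b = min(
            (single_linkage(clusters[a], clusters[b], points), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        history.append((d, sorted(sorted(c) for c in clusters)))
    return history

# Hypothetical cases: two tight pairs that are far apart from each other.
points = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0), (8.5, 8.0)]
history = agglomerate(points)
for distance, remaining in history:
    print(f"{distance:.2f}  {remaining}")
```

In this toy data the first two merges happen at a distance of 0.5, while the final merge happens at roughly 9.55. A large jump like that in the merge distance is the kind of signal you look for when deciding how many clusters to keep (here, two).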
In k-means clustering, you select the number of clusters you want. The algorithm iteratively estimates the cluster means and assigns each case to the cluster for which its distance to the cluster mean is the smallest.
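As a rough illustration of the k-means loop described above, here is a small Python sketch. It is not the SPSS implementation: the starting centers are simply supplied by the caller, squared Euclidean distance is assumed, and the sample points and the `kmeans` function are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance (enough for finding the nearest mean)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, max_iter=100):
    """Alternate between assigning cases to the nearest mean and updating the means."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each case goes to the cluster with the nearest mean.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: recompute each cluster mean from its current members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # empty cluster: keep its old center
        if new_centers == centers:  # means stopped moving, so we are done
            break
        centers = new_centers
    return labels, centers

# Hypothetical cases and k = 2 starting centers picked from the data.
points = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0), (8.5, 8.0)]
labels, centers = kmeans(points, centers=[points[0], points[2]])
print(labels)
print(centers)
```

The number of clusters is fixed by how many starting centers you pass in, which mirrors the point that in k-means you choose the number of clusters up front.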
In two-step clustering, to make large problems tractable, in the first step, cases are assigned to “pre-clusters.” In the second step, the pre-clusters are clustered using the hierarchical clustering algorithm. You can specify the number of clusters you want or let the algorithm decide based on preselected criteria.

Cluster analysis is a major technique for classifying a ‘mountain’ of information into manageable, meaningful piles. It is a data reduction tool that creates subgroups that are more manageable than individual data points. Like factor analysis, it examines the full complement of inter-relationships between variables. Both cluster analysis and discriminant analysis are concerned with classification. However, discriminant analysis requires prior knowledge of the group membership of each case in order to classify new cases. In cluster analysis there is no prior knowledge about which elements belong to which clusters. The groups, or clusters, are defined through an analysis of the data. Subsequent multivariate analyses can be performed on the clusters as groups.


Cluster: A group of relatively homogeneous cases or observations.

Cluster analysis: The statistical method of partitioning a sample into homogeneous classes to produce an operational classification.


Purpose of cluster analysis:

Clustering occurs in almost every aspect of daily life. A factory’s Health and Safety Committee may be regarded as a cluster of people. Supermarkets display items of a similar nature, such as types of meat or vegetables, in the same or nearby locations. Biologists have to organize the different species of animals before a meaningful description of the differences between animals is possible. In medicine, the clustering of symptoms and diseases leads to taxonomies of illnesses. In the field of business, clusters of consumer segments are often sought for successful marketing strategies.

Using cluster analysis, a customer ‘type’ can represent a homogeneous market segment. Identifying the particular needs of that segment allows products to be designed with greater precision and direct appeal within the segment. Targeting specific segments is often cheaper and more accurate than broad-scale marketing. Customers tend to respond better to segment marketing that addresses their specific needs, which can lead to increased market share and customer retention.


Author:
Akanksha Durgvanshi
