Business Analytics Workshop SIBM 2011 Marketing : Day 9

Saturday, September 15, 2012

Day 9 - Team H

Discriminant Analysis

Discriminant Analysis is used primarily to predict membership in two or more mutually exclusive groups. It tells us to which group each member most probably belongs, so it can be used to assign individuals to groups on the basis of their scores on two or more measures. From those scores, the best composite score is calculated as a weighted combination of the measures, with the weights chosen on a least-squares basis. The higher the R² of this composite score, the better it predicts group membership.

The major application area for this technique is when we want to distinguish between two or three sets of objects or people, based on knowledge of some of their characteristics. Generally, we can use linear discriminant analysis when we have to classify objects into two or more groups based on the knowledge of some variables (characteristics) related to them. Typically, these groups would be users/non-users, potentially successful/potentially unsuccessful salesmen, high-risk/low-risk consumers, or groups along similar lines.
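The idea of a composite score that separates two groups can be sketched in a few lines of Python. This is only a minimal illustration of a two-group linear discriminant function on two predictors, not the procedure used in the workshop. The data, the group labels and all function names (`fit_discriminant`, `classify`, and so on) are hypothetical, and the cutoff assumes equal prior probabilities for the two groups.

```python
# Two-group linear discriminant function on two predictors, standard library only.
# All data below are invented for illustration.

def mean_vector(rows):
    """Column means of a list of (x1, x2) tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def pooled_within_cov(group_a, group_b):
    """Pooled within-group covariance matrix (2x2) of the two groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]

def fit_discriminant(group_a, group_b):
    """Return (weights, cutoff); group_a scores above the cutoff on average."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    s = pooled_within_cov(group_a, group_b)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Raw discriminant weights: inverse pooled covariance times mean difference.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff: midpoint of the two group centroids on the discriminant axis
    # (assumption: equal priors and equal misclassification costs).
    score_a = w[0] * ma[0] + w[1] * ma[1]
    score_b = w[0] * mb[0] + w[1] * mb[1]
    return w, (score_a + score_b) / 2

def classify(point, w, cutoff):
    """Assign a case to a group from its composite (discriminant) score."""
    score = w[0] * point[0] + w[1] * point[1]
    return "user" if score > cutoff else "non-user"

# Hypothetical scores on two measures for known users and non-users.
users = [(8, 7), (9, 6), (7, 8), (8, 9), (9, 8)]
non_users = [(3, 4), (4, 2), (2, 3), (3, 2), (4, 4)]

weights, cutoff = fit_discriminant(users, non_users)
print(classify((8, 8), weights, cutoff))
print(classify((3, 3), weights, cutoff))
```

In practice a statistics package would do this fitting and would also handle more than two groups and more than two predictors; the sketch only shows where the weights and the cutoff come from.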

Inputs required

The model requires values for the independent variables (metric) and for the dependent variable (non-metric, i.e. the group each case belongs to).

Outputs obtained

The analysis provides the characteristics of each discriminant function, such as the variables that contribute to it (through the discriminant loadings). The significance of each function is also given. The raw and standardized discriminant weights are reported to assist in the classification of objects. Finally, the usefulness of the discriminant analysis for classification is evaluated through the hit ratio, the proportion of cases that are classified correctly.
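The hit ratio is simple enough to compute by hand. The following is a small sketch, assuming we already have the actual and the predicted group of each case; the labels, the data and the function names are made up for illustration.

```python
from collections import Counter

def classification_table(actual, predicted):
    """Count (actual, predicted) pairs: the cells of the classification matrix."""
    return Counter(zip(actual, predicted))

def hit_ratio(actual, predicted):
    """Share of cases whose predicted group equals the actual group."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)

# Hypothetical results for eight cases.
actual = ["user", "user", "user", "non-user",
          "non-user", "non-user", "user", "non-user"]
predicted = ["user", "user", "non-user", "non-user",
             "non-user", "user", "user", "non-user"]

table = classification_table(actual, predicted)
for (a, p), count in sorted(table.items()):
    print(f"actual={a:9s} predicted={p:9s} count={count}")
print("hit ratio:", hit_ratio(actual, predicted))
```

The off-diagonal cells of the table (actual group different from predicted group) show which kind of misclassification is more frequent, which the single hit ratio figure hides.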

Assumptions

• The observations are a random sample;

• Each predictor variable is normally distributed;

• Each case in the initial classification has been allocated to the correct category of the dependent variable;

• There must be at least two groups or categories, with each case belonging to only one group, so that the groups are mutually exclusive and collectively exhaustive (all cases can be placed in a group);

• Each group or category must be well defined, clearly differentiated from any other group(s) and natural. Putting a median split on an attitude scale is not a natural way to form groups. Partitioning quantitative variables is only justifiable if there are easily identifiable gaps at the points of division;

• Group sizes of the dependent variable should not be grossly different, and each group should contain at least five times as many cases as there are independent variables.
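The group-size assumption can be checked before running the analysis. Below is a rough sketch of such a check. The rule of five cases per independent variable comes from the list above; the `max_ratio` threshold for "grossly different" group sizes is purely an assumption of this example, since the notes give no number, and the function name is hypothetical.

```python
def check_group_sizes(group_sizes, n_predictors, max_ratio=3.0):
    """Return a list of warnings about the group sizes (empty if none).

    group_sizes  : dict mapping group name -> number of cases
    n_predictors : number of independent variables in the model
    max_ratio    : ASSUMED limit for largest/smallest group size;
                   the source only says "not grossly different"
    """
    problems = []
    if len(group_sizes) < 2:
        problems.append("need at least two groups")
    minimum = 5 * n_predictors  # at least five cases per independent variable
    for name, size in group_sizes.items():
        if size < minimum:
            problems.append(
                f"group '{name}' has {size} cases, fewer than {minimum}")
    sizes = list(group_sizes.values())
    if sizes and min(sizes) > 0 and max(sizes) / min(sizes) > max_ratio:
        problems.append("group sizes differ by more than the assumed ratio")
    return problems

# Hypothetical samples.
print(check_group_sizes({"users": 40, "non-users": 35}, n_predictors=4))
print(check_group_sizes({"users": 60, "non-users": 12}, n_predictors=3))
```

The other assumptions in the list (random sampling, natural and well-defined groups) are matters of research design and cannot be verified by a check like this.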

Limitations

• Inter-variable correlations in the model;

• Correlation of the included variables with omitted variables;

• Changes in environmental conditions.
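The first limitation, correlation between the independent variables, can at least be screened for in advance. The sketch below computes Pearson correlations between every pair of predictors and flags the strongly correlated ones. The 0.8 threshold, the variable names and the data are assumptions made for this example only.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlated_pairs(predictors, threshold=0.8):
    """List predictor pairs whose absolute correlation reaches the threshold.

    threshold is an ASSUMED cut-off, not a value given in the notes.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical predictor values for five cases.
predictors = {
    "income": [20, 30, 40, 50, 60],
    "spend": [11, 14, 21, 24, 31],
    "age": [45, 23, 51, 30, 38],
}

flagged = correlated_pairs(predictors)
print(flagged)
```

When two predictors are flagged like this, their individual discriminant weights become hard to interpret, so one option is to drop or combine one of them before fitting the model.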

Author: Ruhi Singla (14103)
