Monday, September 17, 2012

Day 10 - Team F (Rachit)


Discriminant analysis
Discriminant Function Analysis (DA) undertakes the same task as multiple linear regression:
predicting an outcome. However, multiple linear regression is limited to cases where the
dependent variable on the Y axis is an interval variable, so that the regression equation
produces estimated mean numerical Y values in the population for given weighted
combinations of X values. But many interesting variables are categorical, such as political
party voting intention, migrant/non-migrant status, making a profit or not, holding a
particular credit card, owning, renting or paying a mortgage for a house, being employed or
unemployed, being a satisfied or dissatisfied employee, or being a customer who is likely to
buy a product or not.
DA is used when:
·         the dependent variable (DV) is categorical and the predictor independent variables (IVs) are at interval level, such as age, income, attitudes, perceptions, and years of education, although dummy variables can be used as predictors, as in multiple regression. (In logistic regression, by contrast, the IVs can be of any level of measurement.)
·         there are more than two DV categories, unlike binary logistic regression, which is limited to a dichotomous dependent variable.
Assumptions of discriminant analysis
The major underlying assumptions of DA are:
·         the observations are a random sample;
·         each predictor variable is normally distributed;
·         each of the allocations for the dependent categories in the initial classification is correctly classified;
·         there must be at least two groups or categories, with each case belonging to only one group, so that the groups are mutually exclusive and collectively exhaustive (all cases can be placed in a group);
·         each group or category must be well defined, clearly differentiated from any other group(s), and natural. Putting a median split on an attitude scale is not a natural way to form groups. Partitioning quantitative variables is only justifiable if there are easily identifiable gaps at the points of division, for instance, three groups taking three available levels of amounts of housing loan;
·         the groups or categories should be defined before collecting the data;
·         the attribute(s) used to separate the groups should discriminate quite clearly between the groups, so that group or category overlap is non-existent or minimal;
·         group sizes of the dependent should not be grossly different and should be at least five times the number of independent variables.
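
The last assumption is easy to check before running the analysis. Here is a minimal Python sketch of such a check. The function name, the example housing labels, and the `max_imbalance` threshold of 3 are assumptions made for illustration; the text only says group sizes should not be "grossly different" and gives no number for that.

```python
from collections import Counter

def check_group_sizes(groups, n_predictors, min_ratio=5, max_imbalance=3.0):
    """Check the group-size rules of thumb for discriminant analysis.

    groups        : list of group labels, one per case
    n_predictors  : number of independent variables
    min_ratio     : smallest group needs at least min_ratio * n_predictors cases
    max_imbalance : assumed cut-off for "grossly different" group sizes
                    (largest group size divided by smallest group size)
    """
    sizes = Counter(groups)
    smallest = min(sizes.values())
    largest = max(sizes.values())
    return {
        "sizes": dict(sizes),
        "enough_groups": len(sizes) >= 2,
        "enough_cases": smallest >= min_ratio * n_predictors,
        "balanced": largest / smallest <= max_imbalance,
    }

# Made-up example: three housing groups and three predictors.
labels = ["own"] * 20 + ["rent"] * 15 + ["mortgage"] * 18
report = check_group_sizes(labels, n_predictors=3)
print(report)
```

With three predictors, the smallest group needs at least 15 cases, so the made-up example just meets the rule.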

There are several purposes of DA:
·         To investigate differences between groups on the basis of the attributes of the cases, indicating which attributes contribute most to group separation. The descriptive technique successively identifies the linear combinations of attributes, known as canonical discriminant functions (equations), which contribute maximally to group separation.
·         To assign new cases to groups (predictive DA). The DA function uses a person's scores on the predictor variables to predict the category to which the individual belongs.
·         To determine the most parsimonious way to distinguish between groups.
·         To classify cases into groups. Statistical significance tests using chi square enable you to see how well the function separates the groups.
·         To test theory by checking whether cases are classified as predicted.
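
As a rough illustration of predictive DA, here is a minimal Python sketch of a discriminant function for two groups and two predictors. The data (age, income in thousands), the group labels "A" and "B", and all function names are made up for illustration. It computes Fisher's linear discriminant weights from the pooled within-groups covariance and uses the midpoint between the two group means as the cut-off, which assumes equal prior probabilities for the groups.

```python
def mean_vector(rows):
    """Mean of each of the two predictors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mean):
    """2x2 sums of squares and cross-products around the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_two_group_da(group_a, group_b):
    """Return (weights, cutoff) of the linear discriminant function."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    # Pooled within-groups covariance matrix.
    p = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    inv = [[p[1][1] / det, -p[0][1] / det],
           [-p[1][0] / det, p[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Weights = inverse pooled covariance times difference of group means.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cut-off = discriminant score at the midpoint between the group means.
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff

def classify(case, w, cutoff):
    """Assign a case to group A if its discriminant score exceeds the cut-off."""
    score = w[0] * case[0] + w[1] * case[1]
    return "A" if score > cutoff else "B"

# Made-up cases: (age, income in thousands).
buyers = [(45, 60), (50, 72), (48, 65), (52, 70)]      # group A
non_buyers = [(25, 30), (30, 38), (28, 33), (35, 41)]  # group B

weights, cutoff = fit_two_group_da(buyers, non_buyers)
print(classify((47, 68), weights, cutoff))
print(classify((27, 32), weights, cutoff))
```

The weights play the role of the unstandardized discriminant function coefficients: a new person's scores are multiplied by them and the total is compared with the cut-off.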


Steps to be followed

1 Click Analyze >> Classify >> Discriminant.
2 Select the grouping variable and transfer it to the Grouping Variable box. Then click the
Define Range button and enter the lowest and highest codes for your grouping variable.
3 Click Continue, then select the predictors and enter them into the Independents box. Then
click Use Stepwise Methods. This is the important difference from the previous example
(Fig. 25.12).
4 Click Statistics and select Means, Univariate ANOVAs, Box's M, Unstandardized and
Within-Groups Correlation.
5 Click Classify. Select Compute From Group Sizes, Summary Table, Leave-One-Out
Classification, Within-Groups, and all Plots.
6 Click Continue >> Save and select Predicted Group Membership and Discriminant Scores.
7 Click OK.
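
To get a feel for what the Classify options in step 5 produce, here is a minimal Python sketch of a leave-one-out classification summary table with prior probabilities computed from group sizes. It is only an illustration and not what the statistics package computes internally: it uses a single predictor with a pooled within-groups variance, the data (years of education for three groups coded 1 to 3) are invented, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def fit(xs, labels):
    """Estimate group means, pooled within-groups variance, and priors."""
    by_group = defaultdict(list)
    for x, g in zip(xs, labels):
        by_group[g].append(x)
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    ss = sum((x - means[g]) ** 2 for x, g in zip(xs, labels))
    pooled_var = ss / (len(xs) - len(by_group))
    # Priors computed from group sizes.
    priors = {g: len(v) / len(xs) for g, v in by_group.items()}
    return means, pooled_var, priors

def predict(x, means, pooled_var, priors):
    """Pick the group with the highest linear classification score."""
    def score(g):
        m = means[g]
        return x * m / pooled_var - m * m / (2 * pooled_var) + math.log(priors[g])
    return max(means, key=score)

def leave_one_out_table(xs, labels):
    """Classify each case with a function fitted on all the other cases."""
    table = Counter()
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], labels[:i] + labels[i + 1:])
        table[(labels[i], predict(xs[i], *model))] += 1
    return table

# Invented data: years of education, grouping variable coded 1 to 3.
education = [8, 9, 10, 9, 11, 12, 13, 14, 13, 12, 16, 17, 18, 17, 16]
group = [1] * 5 + [2] * 5 + [3] * 5

table = leave_one_out_table(education, group)
for actual in (1, 2, 3):
    print(actual, [table[(actual, predicted)] for predicted in (1, 2, 3)])
hit_rate = sum(table[(g, g)] for g in (1, 2, 3)) / len(group)
print(round(hit_rate, 3))
```

Each printed row is an actual group and each column a predicted group, so correctly classified cases sit on the diagonal. Because every case is classified by a function that did not see it, the hit rate is a more honest estimate than one based on reclassifying the original cases.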


- By Rachit
Team F
