Sunday, September 16, 2012

Day 10 Team B (2)


Discriminant analysis is used to tell apart distinct sets of observations and to allocate new observations to previously defined groups. The method is commonly used in biology to classify animal species and in medicine to classify tumor types. It is also used in facial recognition technologies for classifying pixel values, and in the credit and insurance industries for classifying risk.


Discriminant analysis has two main goals:

Discrimination
Construct a classifier that separates the distinct sets of observations in a population whose group memberships are known.
Classification
Assign unlabeled observations to the labeled groups using the classifier.

For discriminant analysis, Origin provides two different prior probability settings:

Equal
Proportional to group size
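
To make the two settings concrete, here is a minimal Python sketch of how such prior probabilities could be computed from a list of group labels. The function name prior_probabilities and the mode strings are made up for illustration; this is not Origin's interface.

```python
from collections import Counter

def prior_probabilities(labels, mode="equal"):
    """Return a dict mapping each group label to its prior probability.

    Hypothetical helper for illustration only (not an Origin function).
    """
    counts = Counter(labels)
    if mode == "equal":
        # Every group gets the same prior, regardless of its size.
        return {g: 1.0 / len(counts) for g in counts}
    if mode == "proportional":
        # Each prior is the group's share of all observations.
        n = len(labels)
        return {g: c / n for g, c in counts.items()}
    raise ValueError("mode must be 'equal' or 'proportional'")

labels = ["A", "A", "A", "B"]
equal_priors = prior_probabilities(labels, "equal")
proportional_priors = prior_probabilities(labels, "proportional")
print(equal_priors)
print(proportional_priors)
```

With three observations in group A and one in group B, the equal setting gives each group a prior of 0.5, while the proportional setting gives A 0.75 and B 0.25.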

Origin provides two methods for computing discriminant functions:

Linear
Quadratic
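
The difference between the two methods is easiest to see with a single variable. The sketch below uses the textbook discriminant scores under equal priors: the linear score shares one pooled variance across groups, while the quadratic score uses each group's own variance. The function names and the toy data are assumptions for illustration, and this is not necessarily how Origin computes its functions.

```python
import math
from statistics import mean, variance

def fit(groups):
    """groups: dict label -> list of values for one variable."""
    stats = {g: (mean(v), variance(v), len(v)) for g, v in groups.items()}
    n_total = sum(n for _, _, n in stats.values())
    # Pooled within-group variance: weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * var for _, var, n in stats.values()) / (n_total - len(stats))
    return stats, pooled

def linear_score(x, m, pooled_var, prior):
    # Linear in x: all groups share the pooled variance.
    return x * m / pooled_var - 0.5 * m * m / pooled_var + math.log(prior)

def quadratic_score(x, m, var, prior):
    # Quadratic in x: each group keeps its own variance.
    return -0.5 * math.log(var) - 0.5 * (x - m) ** 2 / var + math.log(prior)

def classify(x, groups, method="linear"):
    stats, pooled = fit(groups)
    prior = 1.0 / len(stats)  # equal priors assumed for this sketch
    scores = {}
    for g, (m, var, _) in stats.items():
        if method == "linear":
            scores[g] = linear_score(x, m, pooled, prior)
        else:
            scores[g] = quadratic_score(x, m, var, prior)
    # The observation goes to the group with the highest score.
    return max(scores, key=scores.get)

# Toy data: group A is tight around 1, group B is widely spread around 5.
groups = {"A": [0.0, 1.0, 2.0], "B": [-4.0, 5.0, 14.0]}
print(classify(-6.0, groups, "linear"))
print(classify(-6.0, groups, "quadratic"))
```

For the value -6, the linear rule picks A because -6 lies on A's side of the midpoint between the two means. The quadratic rule picks B, because a value that far from A's mean is very unlikely given A's small variance.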


The discriminant model has the following assumptions:

Multivariate Normality
Data values are drawn from a normal distribution. A normality test can be used to verify this. Note, however, that violations of the normality assumption are usually not "fatal": the resulting significance tests may still be reliable [2].

Equality of variance-covariance within groups
The covariance matrices of the groups should be equal. The Equality Test of Covariance Matrices can be used to verify this. When in doubt, try re-running the analysis using the Quadratic method, adding more observations, or excluding one or two groups.

Low multicollinearity of the variables
When two or more variables are highly multicollinear, the discriminant function coefficients will not reliably predict group membership. The pooled within-groups correlation matrix can be used to detect multicollinearity. If any correlation coefficient is larger than 0.8, exclude some of the variables or run Principal Component Analysis first.
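
The multicollinearity check can be sketched in plain Python as follows. The code centers each variable on its own group mean, pools the sums of squares and cross-products over the groups, and converts them to correlations. The function names, variable names and toy data are all made up for illustration, and comparing the absolute value of each coefficient against 0.8 is an assumption of this sketch.

```python
from itertools import combinations

def pooled_within_correlation(groups):
    """groups: dict label -> list of rows (each row is a list of variable values).

    Returns the pooled within-groups correlation matrix as nested lists.
    """
    p = len(next(iter(groups.values()))[0])
    # Pooled within-group sums of squares and cross-products (SSCP).
    sscp = [[0.0] * p for _ in range(p)]
    for rows in groups.values():
        n = len(rows)
        means = [sum(r[j] for r in rows) / n for j in range(p)]
        for r in rows:
            d = [r[j] - means[j] for j in range(p)]  # deviation from the group mean
            for i in range(p):
                for j in range(p):
                    sscp[i][j] += d[i] * d[j]
    # The degrees-of-freedom divisor cancels when forming correlations.
    return [[sscp[i][j] / (sscp[i][i] * sscp[j][j]) ** 0.5 for j in range(p)]
            for i in range(p)]

def flag_collinear(corr, names, threshold=0.8):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    return [(names[i], names[j], corr[i][j])
            for i, j in combinations(range(len(names)), 2)
            if abs(corr[i][j]) > threshold]

# Toy data: within each group, x2 is exactly twice the deviation of x1.
groups = {
    "A": [[1.0, 2.0, 5.0], [2.0, 4.0, 3.0], [3.0, 6.0, 4.0]],
    "B": [[10.0, 1.0, 7.0], [11.0, 3.0, 9.0], [12.0, 5.0, 8.0]],
}
names = ["x1", "x2", "x3"]
corr = pooled_within_correlation(groups)
flagged = flag_collinear(corr, names)
print(flagged)
```

Here only the pair x1 and x2 is flagged, so one of those two variables would be a candidate for exclusion.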

