The class started with the concept of factor analysis. Factor analysis is used
to reduce the number of variables by combining correlated variables into one
group.
Factor analysis is a statistical method used to describe variability among
observed, correlated variables in terms of a potentially lower number of
unobserved variables called factors. In other words, it is possible, for
example, that variations in three or four observed variables mainly reflect the
variations in fewer unobserved variables. Factor analysis searches for such
joint variations in response to unobserved latent variables.[1]
Factor analysis is used when there are a lot of variables and some of them are
correlated with one another. String and nominal variables cannot be used in
factor analysis; only scale variables are used.
Factor analysis originated in psychometrics and is used in the behavioral
sciences, social sciences, marketing, product management, operations research,
and other applied sciences that deal with large quantities of data.[2]
The idea is to put all correlated variables on one common measurement scale.
For this purpose, the z-score is used. The z-score of a value is its deviation
from the mean divided by the standard deviation:
z = (x - mean) / standard deviation. A z-scored variable has a mean of 0 and a
standard deviation of 1.
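As a quick illustration of standardization, here is a minimal sketch that subtracts the mean from each value and divides by the standard deviation. The function name `z_scores` and the height data are made up for this example, and the population standard deviation is an assumption (the sample standard deviation would work similarly).

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


# Hypothetical data: heights in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z = z_scores(heights_cm)

# After standardization the variable has mean 0 and standard deviation 1,
# so variables measured in different units become comparable.
print(round(mean(z), 6), round(pstdev(z), 6))
```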
‘Extraction’ is the communality of a variable: the share of its variance that
is explained by the extracted factors. It should be greater than 0.5 for the
variable to be considered well represented by the factors.
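One common way to compute a communality, assuming uncorrelated factors, is to sum a variable's squared loadings across the retained factors. The sketch below uses a made-up loading matrix and a hypothetical helper `communalities`; the 0.5 cutoff is the rule of thumb from these notes.

```python
def communalities(loadings):
    """Sum of squared loadings per variable (one row per variable)."""
    return [sum(value * value for value in row) for row in loadings]


# Hypothetical loadings: 4 variables (rows) on 2 factors (columns).
loadings = [
    [0.80, 0.10],
    [0.75, 0.20],
    [0.15, 0.85],
    [0.30, 0.25],
]

h2 = communalities(loadings)

# Flag variables whose communality falls below the 0.5 rule of thumb.
weak = [index for index, value in enumerate(h2) if value < 0.5]
print([round(value, 4) for value in h2], weak)
```

Here only the last variable falls below the cutoff, which suggests the two factors explain little of its variance.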
In the scree plot, the components that come before the large drop (the
"elbow") in the graph are retained for the final analysis.
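The scree rule can be sketched in a few lines. The eigenvalues below are invented for illustration (they are not from the class data), and "keep everything before the single largest drop" is just one simple reading of the elbow; in practice the plot is usually judged by eye.

```python
# Hypothetical eigenvalues for 6 standardized variables, largest first.
# For a correlation matrix they sum to the number of variables.
eigenvalues = [2.6, 2.4, 0.4, 0.3, 0.2, 0.1]

# Drop in eigenvalue between each component and the next one.
drops = [eigenvalues[i] - eigenvalues[i + 1] for i in range(len(eigenvalues) - 1)]

# Keep the components that come before the largest drop.
n_keep = drops.index(max(drops)) + 1

# Cumulative share of total variance explained by the retained components.
explained = sum(eigenvalues[:n_keep]) / sum(eigenvalues)
print(n_keep, round(explained, 3))
```

With these numbers the largest drop is between the second and third components, so two components are retained.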
‘Rotated component matrix’ is the critical element of factor analysis. Rotation
redistributes the explained variance more evenly across the factors, which
makes it easier to see which variables dominate each factor. Here, it should be
noted that the cumulative variance explained remains the same.
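The effect of rotation on variance can be checked with a small two-factor sketch. The loadings and the 45 degree angle are chosen by hand for illustration; statistical packages pick the rotation by optimizing a criterion such as varimax, which is not implemented here.

```python
import math


def rotate(loadings, angle_deg):
    """Apply an orthogonal rotation to a two-factor loading matrix."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def factor_variances(loadings):
    """Variance explained by each factor: column sums of squared loadings."""
    return [sum(row[j] ** 2 for row in loadings) for j in range(len(loadings[0]))]


# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [[0.7, 0.5], [0.7, 0.5], [0.7, -0.5], [0.7, -0.5]]
rotated = rotate(unrotated, 45)

before = factor_variances(unrotated)  # uneven split between the two factors
after = factor_variances(rotated)     # even split, same total
print([round(v, 2) for v in before], [round(v, 2) for v in after])
```

After rotation each variable loads mainly on one factor, the variance is shared equally between the two factors, and the total is unchanged.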
Applications of Factor Analysis
1. Identification of Underlying Factors:
– clusters variables into homogeneous sets
– creates new variables (i.e. factors)
– allows us to gain insight into categories
2. Screening of Variables:
– identifies groupings to allow us to select one variable to represent many
– useful in regression (recall collinearity)
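As a rough sketch of the screening idea, one could pick, for each factor, the variable with the largest absolute loading and carry only those into a regression. The variable names, loadings, and the helper `representatives` are all hypothetical, and this selection rule is just one simple option.

```python
def representatives(names, loadings):
    """For each factor, pick the variable with the largest absolute loading."""
    n_factors = len(loadings[0])
    picks = []
    for j in range(n_factors):
        best = max(range(len(names)), key=lambda i: abs(loadings[i][j]))
        picks.append(names[best])
    return picks


# Hypothetical rotated loadings: 4 variables (rows) on 2 factors (columns).
names = ["price", "discount", "quality", "durability"]
rotated_loadings = [
    [0.88, 0.10],
    [0.81, 0.05],
    [0.12, 0.90],
    [0.20, 0.76],
]

chosen = representatives(names, rotated_loadings)
print(chosen)
```

Using one representative per factor avoids putting several highly correlated predictors into the same regression.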
Reference: http://en.wikipedia.org/wiki/Factor_analysis
Some Examples of Factor-Analysis Problems
3. Suppose many species
of animal (rats, mice, birds, frogs, etc.) are trained that food will appear at
a certain spot whenever a noise--any kind of noise--comes from that spot. You
could then tell whether they could detect a particular sound by seeing whether
they turn in that direction when the sound appears. Then if you studied many
sounds and many species, you might want to know on how many different
dimensions of hearing acuity the species vary. One hypothesis would be that
they vary on just three dimensions--the ability to detect high-frequency sounds,
ability to detect low-frequency sounds, and ability to detect intermediate
sounds. On the other hand, species might differ in their auditory capabilities
on more than just these three dimensions. For instance, some species might be
better at detecting sharp click-like sounds while others are better at
detecting continuous hiss-like sounds.