Business Analytics Workshop SIBM 2011 Marketing

FACTOR ANALYSIS - Team E

SPSS has a procedure that conducts exploratory factor analysis. Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. In other words, it is possible, for example, that variations in three or four observed variables mainly reflect the variations in fewer unobserved variables. Factor analysis searches for such joint variations in response to unobserved latent variables; that is, it attempts to identify underlying factors that explain the pattern of correlations within a set of observed variables.

Factor analysis is often used in data reduction to identify a small number of factors that explain most of the variance observed in a much larger number of manifest variables. It can also be used to generate hypotheses regarding causal mechanisms or to screen variables for subsequent analysis (for example, to identify collinearity prior to performing a linear regression analysis). Its main data reduction uses are:

– To get a small set of variables (preferably uncorrelated) from a large set of variables

– To create indexes with variables that measure similar things
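The idea that a few correlated observed variables can reflect a single unobserved factor can be illustrated with a small simulation. This is only a sketch: the loadings (0.9, 0.8, 0.7), the sample size and the random seed are made-up values, not taken from any data set in this post.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)  # fixed seed so the sketch is reproducible
n = 2000
factor = [random.gauss(0, 1) for _ in range(n)]  # the unobserved latent variable

# Hypothetical loadings: each observed variable = loading * factor + unique noise,
# scaled so that every observed variable has variance close to 1.
loadings = [0.9, 0.8, 0.7]
observed = [
    [lam * f + (1 - lam ** 2) ** 0.5 * random.gauss(0, 1) for f in factor]
    for lam in loadings
]

# With one common factor, the correlation between two variables
# should be close to the product of their loadings.
r12 = pearson(observed[0], observed[1])  # theory: 0.9 * 0.8 = 0.72
r13 = pearson(observed[0], observed[2])  # theory: 0.9 * 0.7 = 0.63
r23 = pearson(observed[1], observed[2])  # theory: 0.8 * 0.7 = 0.56
print(round(r12, 2), round(r13, 2), round(r23, 2))
```

The three variables never "see" each other in the simulation, yet they are correlated because they share the same factor. Factor analysis works in the opposite direction: it starts from such correlations and tries to recover the factor.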

Types of factor analysis

Exploratory factor analysis (EFA) is used to uncover the underlying structure of a relatively large set of variables. The researcher's a priori assumption is that any indicator may be associated with any factor. This is the most common form of factor analysis. There is no prior theory and one uses factor loadings to intuit the factor structure of the data.

Confirmatory factor analysis (CFA) seeks to determine if the number of factors and the loadings of measured (indicator) variables on them conform to what is expected on the basis of pre-established theory. Indicator variables are selected on the basis of prior theory and factor analysis is used to see if they load as predicted on the expected number of factors. The researcher's a priori assumption is that each factor (the number and labels of which may be specified a priori) is associated with a specified subset of indicator variables. A minimum requirement of confirmatory factor analysis is that one hypothesizes beforehand the number of factors in the model, but usually also the researcher will posit expectations about which variables will load on which factors. The researcher seeks to determine, for instance, if measures created to represent a latent variable really belong together.

Advantages

Both objective and subjective attributes can be used provided the subjective attributes can be converted into scores.
Factor analysis can identify latent dimensions or constructs that direct analysis may not.
It is easy and inexpensive.

Disadvantages

Usefulness depends on the researchers' ability to collect a sufficient set of product attributes. If important attributes are excluded or neglected, the value of the procedure is reduced.
If sets of observed variables are highly similar to each other and distinct from other items, factor analysis will assign a single factor to them. This may obscure factors that represent more interesting relationships.
Naming factors may require knowledge of theory because seemingly dissimilar attributes can correlate strongly for unknown reasons.

Communalities - This is the proportion of each variable's variance that can be explained by the factors. For a given variable, it can be defined as the sum of its squared factor loadings across the factors.
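As a small worked example of this definition, the sketch below computes the communality of each variable from a loading matrix. The loading values are hypothetical, and the calculation assumes uncorrelated (orthogonal) factors.

```python
# Hypothetical loading matrix: rows are variables, columns are retained factors.
loadings = [
    [0.80, 0.10],
    [0.70, 0.20],
    [0.15, 0.75],
    [0.10, 0.60],
]

# Communality of a variable = sum of its squared loadings across the factors.
communalities = [sum(value ** 2 for value in row) for row in loadings]

# Uniqueness is the share of variance the factors do not explain.
uniqueness = [1 - h2 for h2 in communalities]

for i, (h2, u2) in enumerate(zip(communalities, uniqueness), start=1):
    print(f"variable {i}: communality={h2:.3f} uniqueness={u2:.3f}")
```

Here the first variable has a communality of 0.80² + 0.10² = 0.65, so the two factors explain about 65% of its variance.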

Initial - With principal axis factoring, the initial values on the diagonal of the correlation matrix are determined by the squared multiple correlation of the variable with the other variables. For example, if you regressed item 13 on items 14 through 24, the squared multiple correlation coefficient would be .564.
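The squared multiple correlation (SMC) of each variable with all the others can be read off the inverse of the correlation matrix, using the standard identity SMC_i = 1 - 1 / (i-th diagonal element of the inverse). The sketch below shows this on a hypothetical 3 x 3 correlation matrix. The matrix and the helper `invert` are illustrations only, not SPSS output or SPSS code.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical correlation matrix: every pair of variables correlates 0.5.
R = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

R_inv = invert(R)
# Initial communality estimate for each variable: its SMC with the others.
smc = [1 - 1 / R_inv[i][i] for i in range(len(R))]
print([round(v, 3) for v in smc])
```

For this matrix each SMC is 1/3, which matches the R-squared obtained by regressing any one variable on the other two.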

Extraction - The values in this column indicate the proportion of each variable's variance that can be explained by the retained factors. Variables with high values are well represented in the common factor space, while variables with low values are not well represented.

Factor - The initial number of factors is the same as the number of variables used in the factor analysis.

Initial Eigenvalues - Eigenvalues are the variances of the factors.

% of Variance - This column contains the percentage of total variance accounted for by each factor.
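The link between the eigenvalue and percentage columns is simple arithmetic, sketched below. The eigenvalues are hypothetical values for six variables; when a correlation matrix is analyzed, the eigenvalues sum to the number of variables, so the total variance here is 6.

```python
# Hypothetical eigenvalues for a six-variable correlation matrix (they sum to 6).
eigenvalues = [3.2, 1.4, 0.6, 0.4, 0.25, 0.15]

total = sum(eigenvalues)

# "% of Variance": each eigenvalue as a share of the total variance.
pct = [100 * ev / total for ev in eigenvalues]

# "Cumulative %": running total of the percentages.
cumulative = []
running = 0.0
for p in pct:
    running += p
    cumulative.append(running)

for i, (ev, p, c) in enumerate(zip(eigenvalues, pct, cumulative), start=1):
    print(f"factor {i}: eigenvalue={ev:.2f}  % of variance={p:.1f}  cumulative %={c:.1f}")
```

In this made-up example the first two factors together account for 4.6 / 6, or about 76.7%, of the total variance.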

Rotation is a method used to simplify interpretation of a factor analysis. It uses the “ambiguity” (non-uniqueness) of the factor solution to make interpretation simpler.

• In principal components, the first factor describes most of the variability.

• After choosing number of factors to retain, we want to spread variability more evenly among factors.

• To do this we “rotate” factors:

– redefine factors such that loadings on various factors tend to be very high (-1 or 1) or very low (0)

– Intuitively, it makes sharper distinctions in the meanings of the factors

Extraction

Method – specifies the method of extraction. The options are principal components, unweighted least squares, generalized least squares, maximum likelihood, principal axis factoring and image factoring.

Under Analyze, either the Correlation matrix or Covariance matrix may be selected.

Unrotated factor solution – prints out the unrotated pattern matrix.

Scree plot – plots the eigenvalues in descending order.

The number of factors extracted may be based on either Eigenvalues over a set number or a specified Number of Factors.
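The two ways of fixing the number of factors can be sketched as a small function. The function name, its parameters and the eigenvalues are hypothetical; the cutoff of 1 is used here as the default on the assumption that it matches the usual "eigenvalues greater than 1" rule.

```python
def n_factors_to_extract(eigenvalues, min_eigenvalue=1.0, fixed_number=None):
    """Return how many factors to keep.

    If fixed_number is given, use it (capped at the number of eigenvalues).
    Otherwise count the eigenvalues that exceed min_eigenvalue.
    """
    if fixed_number is not None:
        return min(fixed_number, len(eigenvalues))
    return sum(1 for ev in eigenvalues if ev > min_eigenvalue)

# Hypothetical eigenvalues from a six-variable correlation matrix.
eigenvalues = [3.2, 1.4, 0.6, 0.4, 0.25, 0.15]

by_cutoff = n_factors_to_extract(eigenvalues)                   # eigenvalues over 1
by_number = n_factors_to_extract(eigenvalues, fixed_number=3)   # fixed number of factors
print(by_cutoff, by_number)
```

With these values the eigenvalue rule keeps two factors, while the fixed-number option keeps whatever number the analyst asks for.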

The available rotations are None, Varimax, Quartimax, Equamax, Direct Oblimin and Promax.

With Direct Oblimin, when Delta equals 0 (the default), solutions are most oblique. As Delta becomes more negative, the factors become less oblique.
With Promax, Kappa specifies the power used in the algorithm. The default for Kappa is 4.
Rotated solution – prints out the rotated pattern matrix and factor transformation matrix for orthogonal rotations. For oblique rotations, the pattern, structure, and factor correlation matrices are displayed.
Loading plot(s) – displays the three-dimensional factor loading plot of the first three factors. For a two-factor solution, a two-dimensional plot is shown. The plot is not displayed if only one factor is extracted. Plots display rotated solutions if rotation is requested.

Perhaps the most widely used of these is the Varimax criterion. It seeks the rotated loadings that maximize the variance of the squared loadings for each factor; the goal is to make some of these loadings as large as possible, and the rest as small as possible in absolute value. The Varimax method encourages the detection of factors each of which is related to few variables. It discourages the detection of factors influencing all variables.

The Quartimax criterion, on the other hand, seeks to maximize the variance of the squared loadings for each variable, and tends to produce factors with high loadings for all variables.
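To make the Varimax idea concrete, the sketch below rotates a two-factor loading matrix through a grid of angles and keeps the angle with the largest Varimax criterion (the variance of the squared loadings within each factor, summed over factors). This is a simplified illustration, not the algorithm SPSS uses: it handles only two factors, skips the Kaiser normalization step, and the loadings are made up.

```python
import math

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings in that column."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squared = [row[j] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total

def rotate(loadings, angle):
    """Orthogonally rotate a two-factor loading matrix by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [
    [0.6, 0.6],
    [0.7, 0.5],
    [0.6, -0.6],
    [0.5, -0.7],
]

# Search 0 to 90 degrees in steps of 0.1; a 90 degree rotation only swaps
# the two factors (with a sign change), so this range covers every solution.
angles = [math.radians(d / 10) for d in range(900)]
best = max(angles, key=lambda a: varimax_criterion(rotate(unrotated, a)))
rotated = rotate(unrotated, best)

print("criterion before:", round(varimax_criterion(unrotated), 4))
print("criterion after: ", round(varimax_criterion(rotated), 4))
for row in rotated:
    print([round(v, 2) for v in row])
```

After rotation each variable loads mainly on one factor, while its communality (the sum of its squared loadings) is unchanged, because an orthogonal rotation only redistributes the explained variance among the factors.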

Save as variables – creates a new variable in the working data file with factor scores derived from the chosen method.

The three methods to obtain factor scores are Regression, Bartlett and Anderson-Rubin.

Display factor score coefficient matrix – shows the coefficients by which variables are multiplied to obtain factor scores. Also shows the correlations between factor scores.
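The way a coefficient matrix turns variables into factor scores can be sketched as follows. The data and the coefficients are hypothetical, and the sketch assumes that the coefficients are applied to standardized (z-scored) variables, as with regression-type scores.

```python
def standardize(column):
    """Convert a list of raw values to z-scores (sample standard deviation)."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in column]

# Hypothetical raw data: rows are cases, columns are three observed variables.
data = [
    [2, 3, 5],
    [4, 5, 4],
    [6, 7, 3],
    [8, 9, 2],
    [10, 11, 1],
]

# Hypothetical factor score coefficient matrix: rows are variables, columns are factors.
coefficients = [
    [0.4, 0.1],
    [0.4, 0.1],
    [0.1, 0.8],
]

# Standardize column by column, then turn the result back into rows of cases.
z_columns = [standardize(list(col)) for col in zip(*data)]
z_rows = [list(row) for row in zip(*z_columns)]

# Factor score = sum over variables of (z-score * coefficient), per factor.
n_factors = len(coefficients[0])
scores = [
    [sum(z * coefficients[i][j] for i, z in enumerate(row)) for j in range(n_factors)]
    for row in z_rows
]

for case, row in enumerate(scores, start=1):
    print(case, [round(v, 3) for v in row])
```

Each case gets one score per factor; these are the values that the Save as variables option would add to the data file as new columns.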

The scree plot graphs the eigenvalue against the factor number.

Scree plot: The Cattell scree test plots the components as the X axis and the corresponding eigenvalues as the Y-axis. As one moves to the right, toward later components, the eigenvalues drop. When the drop ceases and the curve makes an elbow toward less steep decline, Cattell's scree test says to drop all further components after the one starting the elbow.
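The scree test is a visual judgment, but a rough numerical stand-in can help explain what "the elbow" means. The sketch below picks the component where the drop between successive eigenvalues shrinks the most. This heuristic, the function name and the eigenvalues are all assumptions made for illustration; it is not Cattell's procedure and it can disagree with what the eye sees on a real plot.

```python
def elbow_component(eigenvalues):
    """Return the 1-based number of the component where the curve bends most."""
    # Drop in eigenvalue from each component to the next.
    drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
    # How much smaller each drop is than the one before it.
    bends = [d1 - d2 for d1, d2 in zip(drops, drops[1:])]
    sharpest = max(range(len(bends)), key=lambda i: bends[i])
    # bends[i] compares the drop into component i + 2 with the drop out of it.
    return sharpest + 2

# Hypothetical eigenvalues for eight variables, already in descending order.
eigenvalues = [3.5, 2.0, 0.6, 0.5, 0.45, 0.4, 0.3, 0.25]

elbow = elbow_component(eigenvalues)
print("elbow starts at component", elbow)
```

In this example the steep decline ends at the third component, so by the rule described above the components after it would be dropped.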

- - Ankita Kunwar

(Team E)


Friday, September 14, 2012
