Friday, September 14, 2012

Day 8 Team C Factor Analysis

Day 8 Business Analytics
Afternoon lecture: 12 pm to 1.15 pm, 2.15 pm to 3.30 pm


Factor Analysis
Factor analysis is used to reduce the number of variables by grouping related variables together.

For example, the sales of a company depend on customer satisfaction, market segmentation, product, sales force, competition and demand. We can combine a few of them to form groups of two, as follows:
1. Satisfaction & Market segmentation.
2. Product & sales force.
3. Competition & demand.
We find the correlations among the variables and combine the ones that move together. The resulting groups are called factors. We look for the common factors and give each one a label.
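As a rough illustration of the correlation step, the sketch below computes the Pearson correlation between pairs of variables. The scores are made-up placeholders, not data from the lecture, and the `pearson` helper is only an illustrative name.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical survey scores (placeholders, not from the lecture)
satisfaction = [3, 4, 5, 2, 4, 5]
segmentation = [2, 4, 5, 1, 3, 5]
demand = [5, 2, 4, 3, 1, 3]

# Variables that belong to the same factor should correlate strongly
r_same_group = pearson(satisfaction, segmentation)
# Variables from different factors should correlate weakly
r_other_group = pearson(satisfaction, demand)

print(round(r_same_group, 3), round(r_other_group, 3))
```

In this toy data, satisfaction and market segmentation correlate strongly and would be candidates for one factor, while satisfaction and demand barely correlate.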


Applications of Factor Analysis
1. Identification of underlying factors:
– clusters variables into homogeneous sets.
– creates new variables (i.e. factors).
– allows us to gain insight into categories.
2. Screening of variables:
– identifies groupings to allow us to select one variable to represent many.
– useful in regression (recall collinearity).
3. Summary:
– allows us to describe many variables using a few factors.
4. Sampling of variables:
– helps select a small group of representative variables from a larger set.
5. Clustering of objects:
– helps us to put objects (people) into categories depending on their factor scores.

We took the car sales example and calculated the average, standard deviation and z-score of each variable.
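The z-score step can be sketched as below. The prices are hypothetical placeholders rather than the actual car sales file, and the use of the sample standard deviation (n - 1) is an assumption, since the notes do not say which version was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values: subtract the mean, divide by the standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator), an assumption
    return [(v - m) / s for v in values]


# Hypothetical prices in thousands (placeholders, not the lecture data)
prices = [21.5, 28.4, 42.0, 23.99, 33.95]
z = z_scores(prices)

for price, score in zip(prices, z):
    print(f"{price:6.2f} -> z = {score:+.3f}")
```

After standardising, every variable has mean 0 and standard deviation 1, so variables measured in different units (price, length, weight) can be compared on the same scale.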



Initially, a value of 0.5 (50% overlap) and an eigenvalue greater than 1 were kept as the cut-offs.
The component matrix for the car sales data came out as below:



Component Matrix(a)

Variable              Component 1  Component 2  Extraction from 1  Extraction from 2  Total extraction
4-year resale value   0.558        0.771        0.311862           0.594347           0.90620969
Price in thousands    0.681        0.683        0.463963           0.466705           0.93066807
Engine size           0.881        0.169        0.776274           0.028447           0.80472092
Horsepower            0.808        0.476        0.65359            0.226449           0.88003938
Wheelbase             0.652        -0.642       0.424682           0.41269            0.83737201
Width                 0.800        -0.345       0.639429           0.118974           0.7584031
Length                0.712        -0.525       0.506776           0.275943           0.78271953
Curb weight           0.916        -0.175       0.839811           0.03055            0.87036126
Fuel capacity         0.839        -0.215       0.703106           0.046142           0.74924766
Fuel efficiency       -0.839       0.024        0.704411           0.000595           0.70500548
Column total                                    6.023905           2.200842           8.22474711

Extraction Method: Principal Component Analysis.
a. 2 components extracted.
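The extraction columns and column totals in the component matrix can be reproduced from the loadings, as in the sketch below. It uses the rounded loadings printed above, so the results match the table only to about three decimal places. Treating the column totals as the eigenvalues of the two components is the standard reading for principal component extraction.

```python
# Unrotated loadings copied from the component matrix above:
# variable -> (loading on component 1, loading on component 2)
loadings = {
    "4-year resale value": (0.558, 0.771),
    "Price in thousands": (0.681, 0.683),
    "Engine size": (0.881, 0.169),
    "Horsepower": (0.808, 0.476),
    "Wheelbase": (0.652, -0.642),
    "Width": (0.800, -0.345),
    "Length": (0.712, -0.525),
    "Curb weight": (0.916, -0.175),
    "Fuel capacity": (0.839, -0.215),
    "Fuel efficiency": (-0.839, 0.024),
}

# Extraction (communality) of a variable: sum of its squared loadings
extraction = {name: l1 ** 2 + l2 ** 2 for name, (l1, l2) in loadings.items()}

# Eigenvalue of a component: sum of squared loadings down its column
eigenvalues = [sum(row[i] ** 2 for row in loadings.values()) for i in range(2)]

# Cut-off from the lecture: keep components whose eigenvalue exceeds 1
kept = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1]

# With standardised variables the total variance equals the number of variables
explained_share = sum(eigenvalues) / len(loadings)

for name, value in extraction.items():
    print(f"{name:22s}{value:.3f}")
print("eigenvalues:", [round(ev, 3) for ev in eigenvalues])
print("components kept:", kept)
print("share of variance explained:", round(explained_share, 3))
```

Both components pass the eigenvalue cut-off, and together they account for roughly 82% of the variance in the ten variables.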

The components were named Specs and Value, and we got the following rotated component matrix:

Variable              Specs     Value
4-year resale value   -0.035    0.951
Price in thousands    0.115     0.958
Engine size           0.590     0.676
Horsepower            0.343     0.873
Wheelbase             0.909     -0.104
Width                 0.842     0.221
Length                0.884     0.025
Curb weight           0.829     0.427
Fuel capacity         0.793     0.348
Fuel efficiency       -0.676    -0.498

The coloured cells mark the component with which each variable is more strongly correlated. The part a variable has in common with the components is called the extraction; the higher the value, the more it has in common.
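One way to check the rotated matrix against the unrotated one: an orthogonal rotation such as varimax only redistributes the loadings between the components, so the extraction of each variable should stay the same. The sketch below compares the two sets of rounded loadings from the tables above. That the car sales rotation was varimax is an assumption here, since the notes name the rotation method only for the second example.

```python
# variable -> (component 1, component 2) before rotation
unrotated = {
    "4-year resale value": (0.558, 0.771),
    "Price in thousands": (0.681, 0.683),
    "Engine size": (0.881, 0.169),
    "Horsepower": (0.808, 0.476),
    "Wheelbase": (0.652, -0.642),
    "Width": (0.800, -0.345),
    "Length": (0.712, -0.525),
    "Curb weight": (0.916, -0.175),
    "Fuel capacity": (0.839, -0.215),
    "Fuel efficiency": (-0.839, 0.024),
}

# variable -> (Specs, Value) after rotation
rotated = {
    "4-year resale value": (-0.035, 0.951),
    "Price in thousands": (0.115, 0.958),
    "Engine size": (0.590, 0.676),
    "Horsepower": (0.343, 0.873),
    "Wheelbase": (0.909, -0.104),
    "Width": (0.842, 0.221),
    "Length": (0.884, 0.025),
    "Curb weight": (0.829, 0.427),
    "Fuel capacity": (0.793, 0.348),
    "Fuel efficiency": (-0.676, -0.498),
}


def extraction(row):
    """Sum of squared loadings of one variable across the components."""
    return sum(value ** 2 for value in row)


# Absolute change in extraction caused by the rotation, per variable
differences = {
    name: abs(extraction(unrotated[name]) - extraction(rotated[name]))
    for name in unrotated
}
largest_difference = max(differences.values())

print("largest change in extraction:", round(largest_difference, 4))
```

Any small difference that remains comes from the loadings being rounded to three decimals in the tables.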
We then plotted the components in rotated space, with Value against Specs on the x-y axes.



We then took another file, the gss93 subset, and followed the same procedure as above: we calculated the z-scores, drew the scree plot, reduced the variables, named the components once they were formed, and then obtained the rotated component matrix shown below.

Rotated Component Matrix(a)

Variable                Traditional  Soft       Country    Noise
Bigband Music           0.597        0.340      0.206      -0.189
Bluegrass Music         0.164        0.137      0.813      0.018
Country Western Music   -0.074       -0.045     0.825      -0.058
Blues or R & B Music    0.133        0.850      0.143      0.105
Broadway Musicals       0.764        0.190      0.033      -0.091
Classical Music         0.841        0.097      -0.072     0.046
Folk Music              0.604        -0.040     0.463      -0.012
Jazz Music              0.204        0.843      -0.086     0.099
Opera                   0.785        0.090      0.006      0.103
Rap Music               0.020        0.142      -0.027     0.793
Heavy Metal Music       -0.044       0.018      -0.012     0.822

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
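The rotated matrix can be read by assigning each music type to the component on which it loads most heavily. The sketch below does this with the loadings from the table. Reusing the 0.5 cut-off from the car sales example is an assumption, as the notes do not state the cut-off for this file.

```python
components = ["Traditional", "Soft", "Country", "Noise"]

# Rotated loadings copied from the table above, in the order of `components`
rotated = {
    "Bigband Music": (0.597, 0.340, 0.206, -0.189),
    "Bluegrass Music": (0.164, 0.137, 0.813, 0.018),
    "Country Western Music": (-0.074, -0.045, 0.825, -0.058),
    "Blues or R & B Music": (0.133, 0.850, 0.143, 0.105),
    "Broadway Musicals": (0.764, 0.190, 0.033, -0.091),
    "Classical Music": (0.841, 0.097, -0.072, 0.046),
    "Folk Music": (0.604, -0.040, 0.463, -0.012),
    "Jazz Music": (0.204, 0.843, -0.086, 0.099),
    "Opera": (0.785, 0.090, 0.006, 0.103),
    "Rap Music": (0.020, 0.142, -0.027, 0.793),
    "Heavy Metal Music": (-0.044, 0.018, -0.012, 0.822),
}

CUTOFF = 0.5  # assumed: same cut-off as in the car sales example

groups = {name: [] for name in components}
for genre, row in rotated.items():
    # Index of the component with the largest absolute loading
    best = max(range(len(row)), key=lambda i: abs(row[i]))
    if abs(row[best]) >= CUTOFF:
        groups[components[best]].append(genre)

for name, genres in groups.items():
    print(f"{name}: {', '.join(genres)}")
```

Every music type clears the cut-off on exactly one component, which is what makes the four labels easy to interpret.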

Then we drew a scatter plot of the result in Excel.



Submitted by:
Rohit Thorat 
Team C
