Day-3 Group-J
TOPIC – OLAP Cube and Hierarchical Clustering
Today we started with hierarchical clustering.
What is Hierarchical Clustering?
· Hierarchical clustering is a widely used data analysis tool.
· The idea is to build a binary tree of the data that successively merges similar groups of points.
· Visualizing this tree provides a useful summary of the data.
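The merging idea in the points above can be sketched in a few lines of Python. This is only an illustrative sketch, not what SPSS runs internally: the function name `average_linkage`, the sample `points`, and the choice of average (between-groups) linkage with Euclidean distance are assumptions made for the example.

```python
from itertools import combinations
from math import dist

def average_linkage(points):
    """Agglomerative clustering with average (between-groups) linkage.

    Returns the merge schedule as a list of (cluster_a, cluster_b, distance),
    where each cluster is a tuple of point indices.
    """
    clusters = [(i,) for i in range(len(points))]  # every point starts alone
    schedule = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average of all pairwise distances between the two clusters.
            d = sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        schedule.append((a, b, d))
    return schedule

# Hypothetical 2-D points, chosen only to make the groups obvious.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
schedule = average_linkage(points)
for a, b, d in schedule:
    print(a, "+", b, "at distance", round(d, 3))
```

Each entry in `schedule` is one merge step, so reading it from top to bottom traces the binary tree from the leaves up to the root.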
Then we did OLAP Cube.
What is OLAP?
OLAP stands for Online Analytical Processing. It is software designed to let users navigate, retrieve and present business or organizational data.
What is an OLAP cube?
OLAP data has a different architecture from other forms of data storage. There are two components: first, the cube itself, which houses the multidimensional data, and second, the access tools used to build and manipulate that data. A cube is a specialized data store designed to handle multidimensional data together with its aggregated numerical data.
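To make "multidimensional data plus aggregated numerical data" concrete, here is a minimal Python sketch of a cube stored as a dictionary. The rows, the dimension names (brand, region, quarter), and the `ALL` marker are hypothetical and are not taken from the class data set.

```python
from collections import defaultdict

# Hypothetical fact rows: (brand, region, quarter, units_sold).
rows = [
    ("BrandA", "North", "Q1", 10),
    ("BrandA", "South", "Q1", 5),
    ("BrandB", "North", "Q1", 7),
    ("BrandA", "North", "Q2", 12),
    ("BrandB", "South", "Q2", 3),
]

ALL = "*"  # marker meaning "aggregated over this dimension"

def build_cube(rows):
    """Pre-aggregate units for every combination of dimension values and ALL."""
    cube = defaultdict(int)
    for brand, region, quarter, units in rows:
        for b in (brand, ALL):
            for r in (region, ALL):
                for q in (quarter, ALL):
                    cube[(b, r, q)] += units
    return cube

cube = build_cube(rows)
print(cube[("BrandA", "North", "Q1")])  # a single detailed cell
print(cube[("BrandA", ALL, ALL)])       # BrandA rolled up over region and quarter
print(cube[(ALL, "North", ALL)])        # North rolled up over brand and quarter
print(cube[(ALL, ALL, ALL)])            # grand total
```

Because the totals are stored when the cube is built, moving between a detailed cell and a rolled-up view is just a lookup, which is the practical point of keeping the aggregates inside the cube.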
The approach used to get an output with both of these methods is as follows:
Case - Mobile phones
We had a .sav file containing 206 cases and 45 variables. As the number of variables was less than 50, hierarchical clustering was possible.
First, go to Analyze in Variable View and select Classify to reach Hierarchical Cluster.
After selecting Hierarchical Cluster, the screen appears as follows:
Select variables here from the list of variables. Also set the cluster type to Variables, because the number of variables is less than 50.
After selecting the cluster type, go to Statistics, select Agglomeration schedule and Proximity matrix, and click Continue.
The next step is to go to Plots, select Dendrogram, and click Continue.
The last step in hierarchical clustering is to go to Method and select Between-groups linkage along with the measure to be used for the analysis. Several measures are available, such as the Jaccard measure under Binary and Euclidean distance under Interval. Don't forget to set the Present and Absent values for the Jaccard method and the Power and Root values for the Euclidean method respectively.
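As a rough illustration of what the two measures compute when variables (rather than cases) are compared, here is a small Python sketch. The variable names and values are hypothetical, and the `present` argument is only an analogy for the Present value in the dialog, not a reproduction of the SPSS implementation.

```python
from math import sqrt

def euclidean(x, y):
    """Euclidean distance between two interval variables over the same cases."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def jaccard(x, y, present=1):
    """Jaccard similarity for binary variables; joint absences are ignored."""
    both = sum(1 for a, b in zip(x, y) if a == present and b == present)
    either = sum(1 for a, b in zip(x, y) if a == present or b == present)
    # Returning 0.0 when neither variable is ever present is a convention
    # chosen for this sketch.
    return both / either if either else 0.0

def proximity_matrix(variables, measure):
    """Pairwise proximities between variables, keyed by (name, name)."""
    names = list(variables)
    return {(p, q): measure(variables[p], variables[q]) for p in names for q in names}

# Hypothetical data: each list is one variable observed over the same 5 cases.
interval_vars = {"price": [1, 2, 3, 4, 5], "battery": [2, 2, 3, 5, 5], "camera": [5, 4, 3, 2, 1]}
binary_vars = {"has_nfc": [1, 1, 0, 0, 1], "has_5g": [1, 0, 0, 1, 1]}

distances = proximity_matrix(interval_vars, euclidean)
similarities = proximity_matrix(binary_vars, jaccard)
print(round(distances[("price", "battery")], 3))
print(round(distances[("price", "camera")], 3))
print(similarities[("has_nfc", "has_5g")])
```

Note that Euclidean distance is a dissimilarity (smaller means closer) while Jaccard is a similarity (larger means closer), so the two proximity matrices are read in opposite directions.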
After following all these steps, you will get an output containing the proximity matrix, agglomeration schedule and dendrogram for the selected variables, as follows:
In the same way, to reach OLAP Cubes, go to Analyze and select Reports; you will find OLAP Cubes there.
After selecting OLAP Cubes, the screen appears as follows. It has two types of variables: summary variables and grouping variables. Select variables for both from the list of variables and click OK.
Once you click OK, the output will be as follows: