Wednesday, September 5, 2012

Day-3 Group-J


TOPIC – OLAP Cube AND Hierarchical Clustering
Today we started with hierarchical clustering.
What is Hierarchical Clustering?
· Hierarchical clustering is a widely used data analysis tool.
· The idea is to build a binary tree of the data that successively merges similar groups of points.
· Visualizing this tree provides a useful summary of the data.
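To make the "successively merges similar groups" idea concrete, here is a minimal sketch in plain Python. It is not what SPSS runs internally; it assumes Euclidean distance and average (between-groups) linkage, and the points and the helper name `average_linkage` are made up for illustration.

```python
from itertools import combinations
from math import dist


def average_linkage(points):
    """Agglomerative clustering with average (between-groups) linkage.

    Returns the merge history as a list of (cluster_a, cluster_b, distance),
    where each cluster is a tuple of point indices.
    """

    def avg_dist(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Start with every point in its own cluster.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them into one.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((a, b, avg_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy data (made up): two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
merges = average_linkage(points)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.2f}")
```

Five points give four merges, which is the binary tree described above: each merge is an internal node, and the distance at which it happens is the height you would read off the tree.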
Then we moved on to the OLAP cube.
What is OLAP?
            OLAP stands for Online (OL) Analytical (A) Processing (P). It is software designed to let users navigate, retrieve and present business or organizational data.
What is an OLAP cube?
            The data in an OLAP cube is organized differently from other forms of data storage. There are two components: first, the cube itself, which houses the multidimensional data, and second, the access tools used to build and manipulate that data. A cube is a specialized data store designed to handle multidimensional data and aggregated numerical data.
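A rough way to picture a cube is a table of cells keyed by one value per dimension, with an aggregated number in each cell. The sketch below assumes three hypothetical dimensions (brand, region, quarter) and one measure (units sold); the rows and the helper names `build_cube` and `roll_up` are invented for illustration and are not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (brand, region, quarter, units_sold).
rows = [
    ("Alpha", "North", "Q1", 10),
    ("Alpha", "South", "Q1", 5),
    ("Beta", "North", "Q1", 7),
    ("Alpha", "North", "Q2", 12),
    ("Beta", "South", "Q2", 4),
]
DIMENSIONS = ("brand", "region", "quarter")


def build_cube(rows):
    """Aggregate the measure into cells keyed by (brand, region, quarter)."""
    cube = defaultdict(int)
    for *coords, units in rows:
        cube[tuple(coords)] += units
    return dict(cube)


def roll_up(cube, keep):
    """Sum the cube over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    out = defaultdict(int)
    for coords, units in cube.items():
        out[tuple(coords[i] for i in idx)] += units
    return dict(out)


cube = build_cube(rows)
by_brand = roll_up(cube, ["brand"])
by_region_quarter = roll_up(cube, ["region", "quarter"])
print(by_brand)
print(by_region_quarter)
```

Rolling up to fewer dimensions is the "navigate" part of OLAP: the same cube answers "units per brand" and "units per region and quarter" without going back to the raw rows.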

 The approach used to get output from both of these methods is as follows:
Case: Mobile phones
We had a .sav file containing 206 cases and 45 variables. As the number of variables was fewer than 50, hierarchical clustering is possible.
First, from Variable View, go to Analyze and select Classify to get to Hierarchical Cluster:


 After selecting Hierarchical Cluster, the following screen appears:
Select the variables here from the list of variables. Also set the cluster type to Variables, because the number of variables is fewer than 50.

 After selecting the cluster type, go to Statistics, select Agglomeration schedule and Proximity matrix, and click Continue.
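Because we are clustering variables rather than cases, the proximity matrix holds one distance for every pair of variables. A minimal sketch, assuming Euclidean distance and a made-up data set of three variables measured on four cases (the variable names are hypothetical, not the ones in our .sav file):

```python
from math import sqrt

# Made-up ratings: each key is a variable, each list holds one value per case.
data = {
    "price": [1, 2, 3, 4],
    "battery": [1, 2, 3, 5],
    "camera": [5, 4, 2, 1],
}


def proximity_matrix(data):
    """Euclidean distance between every pair of variables (columns)."""
    names = list(data)
    return {
        (a, b): sqrt(sum((x - y) ** 2 for x, y in zip(data[a], data[b])))
        for a in names
        for b in names
    }


prox = proximity_matrix(data)

# Print the matrix as a small table, one row per variable.
names = list(data)
print(" " * 8 + "".join(f"{n:>9}" for n in names))
for a in names:
    print(f"{a:<8}" + "".join(f"{prox[(a, b)]:9.3f}" for b in names))
```

The smallest off-diagonal entry marks the pair of variables that would be merged first in the agglomeration schedule.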

 The next step is to go to Plots, select Dendrogram and click Continue:


The last step in hierarchical clustering is to go to Method, select Between-groups linkage as the cluster method and choose the measure to be used for the analysis. Various measures are available, such as Jaccard for binary data and Euclidean distance for interval data. Don't forget to set the Present and Absent values for the Jaccard measure and the Power and Root values for the Euclidean measure.
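To see why those settings matter, here is a small sketch of the two kinds of measure. The function names `jaccard_similarity` and `power_distance` are hypothetical helpers, and the sketch assumes that any value other than `present` counts as absent and that the interval measure takes the form (sum of |x - y|^power)^(1/root), which gives the Euclidean distance when power and root are both 2.

```python
def jaccard_similarity(x, y, present=1):
    """Jaccard index for two binary vectors; joint absences are ignored."""
    both = sum(1 for a, b in zip(x, y) if a == present and b == present)
    either = sum(1 for a, b in zip(x, y) if a == present or b == present)
    return both / either if either else 0.0


def power_distance(x, y, power=2, root=2):
    """Generalised distance: (sum of |x - y| ** power) ** (1 / root)."""
    return sum(abs(a - b) ** power for a, b in zip(x, y)) ** (1 / root)


# Binary data coded 1 = present, 0 = absent.
owns_a = [1, 1, 0, 0]
owns_b = [1, 0, 1, 0]
print(jaccard_similarity(owns_a, owns_b))

# The same answers coded 2 = present, 1 = absent need present=2.
print(jaccard_similarity([2, 2, 1, 1], [2, 1, 2, 1], present=2))

# Interval data: power=2, root=2 is the Euclidean distance.
print(power_distance((0, 0), (3, 4)))
print(power_distance((0, 0), (3, 4), power=2, root=1))  # squared Euclidean
```

If the data is coded 2/1 but the measure is left at 1/0, the "present" count changes and the similarities come out wrong, which is why the Present and Absent values have to match the coding in the file.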

 After following all these steps, you will get output containing the proximity matrix, the agglomeration schedule and the dendrogram for the selected variables:


Similarly, to reach the OLAP cube, go to Analyze and select Reports; you will find OLAP Cubes there:


After selecting OLAP Cubes, the following screen appears, with two types of variables: summary and grouping. Select variables for both from the list of variables and click OK.
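The split between summary and grouping variables can be sketched as follows: the summary variable is the number being aggregated, and the grouping variable defines the categories it is aggregated within. The cases, the variable names and the helper `summarize` are made up for illustration, and the statistics shown (N, sum, mean, standard deviation) are only an assumption about what such a report typically contains.

```python
from statistics import mean, stdev

# Hypothetical cases: grouping variable "gender", summary variable "monthly_bill".
cases = [
    {"gender": "F", "monthly_bill": 300},
    {"gender": "F", "monthly_bill": 500},
    {"gender": "M", "monthly_bill": 200},
    {"gender": "M", "monthly_bill": 400},
    {"gender": "M", "monthly_bill": 600},
]


def summarize(cases, summary_var, grouping_var):
    """Summarise one numeric variable within each category, plus a Total row."""
    groups = {}
    for case in cases:
        groups.setdefault(case[grouping_var], []).append(case[summary_var])
    groups["Total"] = [case[summary_var] for case in cases]
    return {
        level: {
            "N": len(values),
            "Sum": sum(values),
            "Mean": mean(values),
            # The sample standard deviation needs at least two values.
            "Std. Deviation": stdev(values) if len(values) > 1 else None,
        }
        for level, values in groups.items()
    }


report = summarize(cases, "monthly_bill", "gender")
for level, stats in report.items():
    print(level, stats)
```

Adding a second grouping variable would simply mean keying the groups on a pair of values instead of one, which is the extra dimension of the cube.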

 Once you click OK, the output will be as follows:


