Wednesday, September 5, 2012

Day 3- Team F (Shishir)


The day started with another practical example. The given data was about the usage of different mobile phone features (such as SMS, Games, and Alarm).
We used hierarchical clustering to get an initial interpretation of the given data. Strategies for hierarchical clustering generally fall into two types:
  • Agglomerative: This is a "bottom up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
  • Divisive: This is a "top down" approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
The results of hierarchical clustering are usually presented in a dendrogram, which is the critical element of the analysis. It also helps us decide how many clusters there are in the data.
In general, we use hierarchical clustering when the number of elements to be grouped is fewer than 50. With a larger number of elements, the dendrogram becomes huge and very difficult to analyze. The given data has 206 cases and 45 variables, so we cluster the variables rather than the cases.
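
The "bottom up" strategy described above can be sketched in a few lines of Python. This is only an illustration, not the tool used in class: the function names, the choice of single linkage (distance between the closest members of two clusters), and the toy observations are all assumptions made for the example.

```python
from itertools import combinations


def single_linkage(points, distance):
    """Agglomerative clustering: start with one cluster per point and
    repeatedly merge the two closest clusters until one remains."""
    clusters = [[i] for i in range(len(points))]
    merges = []  # (members_a, members_b, distance), in merge order
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest members.
            d = min(distance(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def euclidean(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5


# Hypothetical toy observations (not the course data).
toy = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
merge_history = single_linkage(toy, euclidean)
for left, right, d in merge_history:
    print(left, "+", right, "at distance", d)
```

The merge history is exactly what a dendrogram draws: each merge is a join, and its distance is the height of that join. To cluster variables instead of cases, the same function can be given one vector per variable rather than one per case.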

So far we have only looked at clustering cases, and only at interval data. So, while we take a brief look at clustering variables, we also use a different kind of data, namely binary data. Just looking at a table that shows whether or not someone uses a feature does not convey much of an impression of how the variables may be related, if they are.

To see whether cluster analysis sheds some light on this, we use distance measures on the binary data. We looked at two measures: “JACCARD” and “EUCLIDEAN”.
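
To make the difference between the two measures concrete, here is a minimal sketch on made-up binary usage columns. The function names and the data are assumptions for illustration only, and the Jaccard distance is taken here as one minus the Jaccard similarity. The key difference is that Jaccard ignores cases where neither feature is used, while Euclidean distance on 0/1 data counts every mismatch and so treats joint non-use as agreement.

```python
def jaccard_distance(a, b):
    """1 - (cases using both) / (cases using at least one).
    Cases where neither feature is used are ignored."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def euclidean_distance(a, b):
    """On 0/1 data this is the square root of the number of mismatches."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical usage columns: one entry per respondent, 1 = uses the feature.
sms = [1, 1, 1, 0, 0, 0]
games = [1, 1, 0, 0, 0, 0]
alarm = [0, 0, 0, 0, 0, 1]

for name, other in (("games", games), ("alarm", alarm)):
    print("sms vs", name,
          "jaccard:", round(jaccard_distance(sms, other), 3),
          "euclidean:", round(euclidean_distance(sms, other), 3))
```

Either function can serve as the distance passed to a hierarchical clustering routine when the items being clustered are variables.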

An OLAP cube is an array of data that is understood in terms of its 0 or more dimensions. OLAP is an acronym for online analytical processing, a technique for quickly analyzing a measure (e.g. profit margin) by multiple categories or dimensions (e.g. customer, region, fiscal period and product line). Typically, the end-user software lets you drag categories to rows and columns and aggregates the measure at each intersection of a row and column (often called a cross-tab report).

This is similar to the familiar spreadsheet format. This numeric format can usually also be represented in the form of a chart or graph. The real power of OLAP is the ability to drill down on a category to see more details. For example, you might drill down on a state to see details by city. So here we can use the OLAP cube to answer various questions, such as:

  • Which age group spends more on a monthly basis
  • Whether educated people spend more on bills than uneducated people
  • Which gender spends more on a monthly basis, etc.
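
Questions like the ones above boil down to aggregating a measure over chosen dimensions, and drilling down just means adding one more dimension. A rough sketch, assuming a tiny invented record set (the field names, values, and the `aggregate` helper are all hypothetical and not taken from the course data):

```python
from collections import defaultdict

# Hypothetical records: (gender, age_group, state, city, monthly_bill)
FIELDS = ("gender", "age_group", "state", "city", "monthly_bill")
records = [
    ("F", "18-25", "Maharashtra", "Mumbai", 400),
    ("M", "18-25", "Maharashtra", "Pune", 300),
    ("F", "26-40", "Maharashtra", "Mumbai", 700),
    ("M", "26-40", "Karnataka", "Bengaluru", 600),
    ("F", "26-40", "Karnataka", "Mysuru", 500),
]


def aggregate(rows, dimensions, measure="monthly_bill"):
    """Sum the measure for every combination of the chosen dimensions."""
    idx = [FIELDS.index(d) for d in dimensions]
    m = FIELDS.index(measure)
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[m]
    return dict(totals)


by_age = aggregate(records, ["age_group"])             # one dimension
by_state = aggregate(records, ["state"])               # summary level
by_state_city = aggregate(records, ["state", "city"])  # drill down to city

print(by_age)
print(by_state)
print(by_state_city)
```

This sketch sums the bills; comparing average spend per person would need a count alongside each total.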

So, via various combinations and calibrations (known as storytelling), we derive relationships and develop a marketing mix, which is the real aim of all this research and analysis.

     BY - Shishir Borkar
             Team F


