DAY 3_TEAM I_ (PRISKILINA BASUMATARI)
There are two types of clustering:
1) Hierarchical
2) Non-hierarchical
Hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:
· Agglomerative: This is a "bottom up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
· Divisive: This is a "top down" approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
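The "bottom up" approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-dimensional sample points, the function names `single_linkage` and `agglomerate`, and the choice of single linkage (distance between the closest pair of points) are all made up for the example and are not taken from the class material.

```python
from itertools import combinations

def single_linkage(a, b):
    """Distance between two clusters: the closest pair of points (assumed linkage)."""
    return min(abs(x - y) for x in a for y in b)

def agglomerate(points):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [(p,) for p in points]  # bottom up: every observation starts alone
    history = []
    while len(clusters) > 1:
        # find the pair of clusters with the smallest linkage distance
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: single_linkage(*pair))
        history.append((a, b, single_linkage(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))  # the merged cluster replaces the pair
    return history

# Hypothetical sample data
history = agglomerate([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, dist in history:
    print(left, "+", right, "merged at distance", dist)
```

Each entry in `history` records one merge and the distance at which it happened, which is the information a tree diagram of the hierarchy is drawn from. A divisive method would run in the opposite direction, starting from one cluster and splitting it.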
Non-hierarchical cluster analysis groups a set of units into a pre-determined number of groups, using an iterative algorithm that optimizes a chosen criterion.
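K-means concept:
· K-means partitions the data into a chosen number (k) of clusters.
· It performs a non-hierarchical (partitioning) cluster analysis on the input data.
· It has several features that distinguish it from the more common hierarchical clustering techniques: the number of clusters is fixed in advance, and an observation can move from one cluster to another as the algorithm iterates.
· It minimizes the within-cluster variance: each observation is assigned to the cluster with the nearest mean.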
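A minimal sketch of the k-means iteration (often called Lloyd's algorithm) may make these points concrete. The sample points, the starting centroids and the function name `kmeans` are assumptions for illustration only; statistical packages typically add better initialisation and convergence checks.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid-update steps for a fixed number of clusters."""
    groups = []
    for _ in range(iterations):
        # assignment step: each point joins the cluster with the nearest centroid
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            groups[nearest].append(p)
        # update step: move each centroid to the mean of its group
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:  # nothing moved, so the result is stable
            break
        centroids = new_centroids
    return centroids, groups

# Hypothetical 2-D data; the first two points serve as starting centroids (k = 2)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, groups = kmeans(points, centroids=[(1, 1), (1, 2)])
print(centroids)
print(groups)
```

Moving each centroid to the mean of its group is what reduces the within-cluster sum of squared distances, and the point (1, 2) changing cluster between iterations shows how assignments are revised rather than fixed once made.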
Dendrogram:
It is a branching diagram representing a hierarchy of categories based on degree of similarity or number of shared characteristics.
In hierarchical clustering it shows the order in which the clusters are combined.
OLAP (Online Analytical Processing):
An OLAP cube is a multi-dimensional representation of the data that makes it easier to study and analyze.
On-Line Analytical Processing (OLAP) is a category of software technology that
enables analysts, managers and executives to gain insight into data through
fast, consistent, interactive access to a wide variety of possible views of
information that has been transformed from raw data to reflect the real
dimensionality of the enterprise as understood by the user.
OLAP functionality is characterized by dynamic multi-dimensional analysis
of consolidated enterprise data supporting end user analytical and navigational
activities including:
- calculations and modelling applied across dimensions, through hierarchies and/or across members
- trend analysis over sequential time periods
- slicing: subsets for on-screen viewing
- drill-down: to deeper levels of consolidation
- reach-through: to underlying detail data
- rotation: to new dimensional comparisons in the viewing area
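Some of these activities can be imitated on a tiny in-memory data set. The sketch below is not how an OLAP server works internally; the records, the field names and the helper `aggregate` are hypothetical and only illustrate what slicing, drill-down and rotation do to a view.

```python
from collections import defaultdict

# Hypothetical records, made up for illustration (not taken from any real file)
records = [
    {"gender": "F", "provider": "A", "connection": "Prepaid",  "bill": 300.0},
    {"gender": "M", "provider": "A", "connection": "Postpaid", "bill": 500.0},
    {"gender": "F", "provider": "B", "connection": "Prepaid",  "bill": 200.0},
    {"gender": "M", "provider": "B", "connection": "Prepaid",  "bill": 400.0},
]

def aggregate(rows, dims):
    """Sum the 'bill' measure for every combination of the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["bill"]
    return dict(totals)

# Consolidated view: one total per provider
by_provider = aggregate(records, ["provider"])

# Drill-down: break each provider total down by connection type
by_provider_conn = aggregate(records, ["provider", "connection"])

# Slicing: keep one member of a dimension, then aggregate the subset
prepaid = [r for r in records if r["connection"] == "Prepaid"]
prepaid_by_provider = aggregate(prepaid, ["provider"])

# Rotation: the same cells with the dimensions in a different order
by_conn_provider = aggregate(records, ["connection", "provider"])

print(by_provider)
print(by_provider_conn)
print(prepaid_by_provider)
print(by_conn_provider)
```

The same `aggregate` call serves every view; only the list of dimensions and the subset of rows change, which is the sense in which a cube offers many views of one consolidated data set.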
OLAP is implemented in a multi-user client/server mode and offers consistently rapid
response to queries, regardless of database size and complexity. OLAP helps the
user synthesize enterprise information through comparative, personalized
viewing, as well as through analysis of historical and projected data in various
"what-if" data model scenarios. This is achieved through use of an OLAP Server.
File Used: Cell_Inter.sav
OLAP Cubes
Gender of respondent: Total
Name of current service provider: Total
Connection Type: Total

OLAP Cube for Overall Cell bills
Variable                     | Sum      | N   | Mean     | Std. Deviation | % of Total Sum | % of Total N
Usage period In Months       | 2569     | 206 | 12.47    | 9.084          | 100.0%         | 100.0%
Monthly expenditure on phone | 72633.00 | 206 | 352.5874 | 184.64170      | 100.0%         | 100.0%
Fixed component of bill      | 9914.00  | 206 | 48.1262  | 19.59825       | 100.0%         | 100.0%
Voice calls bill             | 9985.00  | 206 | 48.4709  | 28.83031       | 100.0%         | 100.0%
SMS bill                     | 5519.00  | 206 | 26.7913  | 17.64308       | 100.0%         | 100.0%
Other charges                | 1147.00  | 206 | 5.5680   | 11.18940       | 100.0%         | 100.0%
Special Package              | 995      | 206 | 4.83     | 2.049          | 100.0%         | 100.0%
Games                        | 234      | 206 | 1.14     | .344           | 100.0%         | 100.0%
Other                        | 402      | 206 | 1.95     | .215           | 100.0%         | 100.0%
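The Mean column of the cube is simply Sum divided by N, for example 2569 / 206 ≈ 12.47 for the usage period. The short check below repeats that arithmetic for every row, using only the figures reported in the table; the helper name `mean_matches` and the tolerance rule (half a unit in the last printed decimal) are choices made for this sketch.

```python
# (variable, Sum, N, Mean as reported, decimals printed in the Mean column)
rows = [
    ("Usage period In Months",       2569.0,  206, 12.47,    2),
    ("Monthly expenditure on phone", 72633.0, 206, 352.5874, 4),
    ("Fixed component of bill",      9914.0,  206, 48.1262,  4),
    ("Voice calls bill",             9985.0,  206, 48.4709,  4),
    ("SMS bill",                     5519.0,  206, 26.7913,  4),
    ("Other charges",                1147.0,  206, 5.5680,   4),
    ("Special Package",              995.0,   206, 4.83,     2),
    ("Games",                        234.0,   206, 1.14,     2),
    ("Other",                        402.0,   206, 1.95,     2),
]

def mean_matches(total, n, reported, decimals):
    """True if total / n agrees with the reported mean at the printed precision."""
    tolerance = 0.5 * 10 ** -decimals + 1e-9
    return abs(total / n - reported) <= tolerance

for label, total, n, reported, decimals in rows:
    ok = mean_matches(total, n, reported, decimals)
    print(f"{label:30s} {total / n:10.4f}  reported {reported}  ok={ok}")

all_ok = all(mean_matches(t, n, m, d) for _, t, n, m, d in rows)
```

Because every layer (gender, service provider, connection type) is set to Total, each row covers all 206 respondents, which is also why both percentage columns read 100.0%.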