Non-Hierarchical Clustering
K-Means- It is a
clustering technique in which the expected number of clusters is specified
up front to obtain the desired level of clustering. The required number of clusters
can be obtained through hierarchical clustering.
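To make the idea concrete, here is a minimal sketch of K-Means in plain Python. It assumes the common assign-then-update (Lloyd-style) procedure; the function name kmeans, the sample points and the choice of starting centers are illustrative assumptions, not taken from any particular tool.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Lloyd-style k-means sketch; k is the number of starting centers given."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of the points assigned to it.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(old)  # an empty cluster keeps its old center
        if new_centers == centers:
            break  # nothing moved, so the assignment is stable
        centers = new_centers
    return labels, centers

# Hypothetical data: two visibly separate groups, so we ask for 2 clusters.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
```

The number of clusters is fixed by how many starting centers are passed in, which is the "specified initially" part of the technique. Different starting centers can lead to different final clusters.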
How to use a Dendrogram-
* * * * * * H I E R A R C H I C A L   C L U S T E R   A N A L Y S I S * * * * * *

Dendrogram using Average Linkage (Between Groups)

                      Rescaled Distance Cluster Combine

   C A S E       0         5        10        15        20        25
  Label    Num   +---------+---------+---------+---------+---------+

  billsms    4   -+-+
  billothr   5   -+ +---------------------------------------------+
  billfix    2   -+-+                                             |
  billtalk   3   -+                                               |
  mntspend   1   -------------------------------------------------+
To prepare a dendrogram we need the following inputs-
A) Names of the variables/cases
B) Measure-
   a. Interval- Euclidean
   b. Binary- Euclidean/Jaccard
   c. Count
After giving the above inputs, a dendrogram similar to the one
shown in the above figure is generated, and the number of clusters is
read off by drawing a cut-off line across it at the desired distance level. Generally,
the line is drawn at a point where the next items to combine are a relatively long distance apart.
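The cut-off rule can be sketched in a few lines of Python. This is only one possible reading of "a relatively long distance": it places the cut in the widest gap between successive merge distances. The function name largest_gap_cut is hypothetical, and the merge distances below are illustrative values only roughly modelled on the figure, not exact output.

```python
def largest_gap_cut(merge_heights, n_items):
    """Return (n_clusters, cut_height) for the widest gap between successive merges."""
    heights = sorted(merge_heights)
    if n_items < 3 or len(heights) != n_items - 1:
        raise ValueError("need at least 3 items and exactly n_items - 1 merge heights")
    best_i, best_gap = 0, -1.0
    for i in range(len(heights) - 1):
        gap = heights[i + 1] - heights[i]
        if gap > best_gap:
            best_i, best_gap = i, gap
    # Merges at or below heights[best_i] are kept; every later merge is cut away.
    merges_kept = best_i + 1
    cut_height = (heights[best_i] + heights[best_i + 1]) / 2
    return n_items - merges_kept, cut_height

# Illustrative rescaled merge distances for 5 variables (assumed, approximate).
heights = [1, 1, 2, 25]
n_clusters, cut_height = largest_gap_cut(heights, n_items=5)
print(n_clusters, cut_height)  # prints: 2 13.5
```

With these values the cut falls between 2 and 25, leaving two clusters: the four bill variables together and mntspend on its own.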
How to calculate Jaccard distance-
Jaccard distance depends on the number of matching values the
two variables to be analyzed have. For example, if A and B are two variables and
the possible binary values they can take are YES and NO, then the Jaccard similarity
between A and B can be calculated using the formula:
Number of YES matches / (total number of responses - number of NO matches)
and the Jaccard distance is 1 minus this similarity.
So, the more similar the two variables are,
the smaller the Jaccard distance.
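Here is a small Python sketch of this calculation. It uses the standard definition, in which the Jaccard similarity is the number of YES matches divided by (total responses minus NO matches) and the distance is 1 minus that similarity. The function name, the sample answers and the handling of the all-NO case are assumptions made for illustration.

```python
def jaccard_distance(a, b, yes="YES"):
    """Jaccard distance between two equal-length lists of YES/NO responses."""
    if len(a) != len(b):
        raise ValueError("both variables need the same number of responses")
    yes_matches = sum(1 for x, y in zip(a, b) if x == yes and y == yes)
    no_matches = sum(1 for x, y in zip(a, b) if x != yes and y != yes)
    denominator = len(a) - no_matches  # NO-NO pairs are left out of the comparison
    if denominator == 0:
        return 0.0  # assumption: two all-NO variables are treated as identical
    return 1.0 - yes_matches / denominator

# Hypothetical responses for two binary variables A and B.
A = ["YES", "YES", "NO", "NO", "YES"]
B = ["YES", "NO", "NO", "NO", "YES"]
print(jaccard_distance(A, B))  # 2 YES matches, 2 NO matches: 1 - 2/3, about 0.33
```

Identical variables give a distance of 0, and variables that never agree on a YES give a distance of 1.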
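How to calculate Euclidean distance-
The Euclidean distance between two points is the square root of the sum of the
squared differences of their coordinates. There are a number of ways to turn it into
a distance between two clusters; one of them is the average method. In the average
method we calculate the Euclidean distance between every pair of points, one from each
cluster, then all the distances are averaged, and that average is taken as the
distance between the two clusters.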
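The average method can be sketched in Python as follows: compute the straight-line (Euclidean) distance for every pair of points, one from each cluster, and take the mean. The function names and the sample clusters are illustrative assumptions.

```python
import math
from itertools import product

def euclidean(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def average_linkage(cluster_a, cluster_b):
    """Mean Euclidean distance over all pairs (one point from each cluster)."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)

# Hypothetical clusters with two points each, so four pairwise distances.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]
print(euclidean((0.0, 0.0), (3.0, 4.0)))      # 3-4-5 triangle: 5.0
print(average_linkage(cluster_a, cluster_b))  # mean of 3, sqrt(13), sqrt(13), 3
```

This is the same "Average Linkage (Between Groups)" idea named in the dendrogram title: the clusters with the smallest average distance are combined first.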
Author-
Kuldeep R.