Team:D
Cluster Analysis
Cluster analysis is an
exploratory data analysis tool for solving classification problems. Its
aim is to sort cases (people, things, events, etc.) into groups, or clusters,
so that the degree of association is strong between members of the same cluster
and weak between members of different clusters. Each cluster thus
describes, in terms of the data collected, the class to which its members
belong; and this description may be abstracted through use from the particular
to the general class or type.
Cluster analysis is thus a tool
of discovery. It may reveal associations and structure in data which,
though not previously evident, nevertheless are sensible and useful once
found. The results of cluster analysis may contribute to the definition
of a formal classification scheme, such as a taxonomy for related animals,
insects or plants; or suggest statistical models with which to describe
populations; or indicate rules for assigning new cases to classes for
identification and diagnostic purposes; or provide measures of definition,
size and change in what previously were only broad concepts; or find exemplars
to represent classes.
Cluster analysis is a statistical method of partitioning a
sample into homogeneous classes to produce an operational classification. Such a classification may help:
- Formulate hypotheses concerning the origin of the sample, e.g. in evolution studies.
- Describe a sample in terms of a typology, e.g. for market analysis or administrative purposes.
- Predict the future behavior of population types, e.g. in modeling economic prospects for different industry sectors.
- Optimize functional processes, e.g. business site locations or product design.
- Assist in identification, e.g. in diagnosing diseases.
- Measure the different effects of treatments on classes within the population, e.g. with analysis of variance.
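To make the idea of "strong association within a cluster, weak association between clusters" concrete, here is a minimal sketch of one common clustering technique, k-means (Lloyd's algorithm). The text above does not name a specific algorithm, so the choice of k-means, the function name `kmeans`, the sample points and the starting centroids are all assumptions made for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Group points around centroids with plain k-means (Lloyd's algorithm).

    points    -- list of coordinate tuples (illustrative data)
    centroids -- starting centroids, one per desired cluster (an assumption;
                 real tools usually pick these for you)
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = new_centroids
    return labels, centroids


# Made-up cases: three near (1, 1.5) and three near (8.5, 8.5).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Each case ends up labelled with the cluster whose centre it is closest to, and each final centroid can serve as an exemplar for its class. Other methods, such as hierarchical clustering, approach the same goal differently.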
Chi-Square Test
Chi-square is a statistical test commonly
used to compare observed data with data we would expect to obtain according to
a specific hypothesis. For example, if, according to Mendel's laws, you
expected 10 of 20 offspring from a cross to be male and the actual observed
number was 8 males, then you might want to know about the "goodness of
fit" between the observed and expected. Were the deviations (differences
between observed and expected) the result of chance, or were they due to other
factors? How much deviation can occur before you, the investigator, must
conclude that something other than chance is at work, causing the observed to
differ from the expected? The chi-square test always tests what scientists
call the null hypothesis, which states that there is no significant
difference between the expected and observed result.
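Chi-Square Formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Here O is the observed count and E is the expected count in each category, and the sum runs over all categories.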
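As a worked illustration of the offspring example, the sketch below computes the usual Pearson statistic, the sum over categories of (observed - expected) squared divided by expected. The text gives only the male counts, so the female counts (12 observed, 10 expected) are inferred from the total of 20. The function names and the 0.05 significance level are assumptions for illustration, and the p-value shortcut holds only for one degree of freedom.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_one_df(chi2):
    """Upper-tail probability of chi2 for exactly 1 degree of freedom.

    With 1 degree of freedom the statistic is a squared standard normal,
    so the tail probability is erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


observed = [8, 12]   # males, females (females inferred from 20 offspring)
expected = [10, 10]  # 1:1 ratio expected under the null hypothesis

chi2 = chi_square_statistic(observed, expected)
p = p_value_one_df(chi2)

CRITICAL_0_05_DF1 = 3.841  # standard table value for alpha = 0.05, 1 df
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
print("reject null" if chi2 > CRITICAL_0_05_DF1 else "fail to reject null")
```

With two categories there is one degree of freedom. The statistic here is well below the critical value, so a deviation of this size is consistent with chance and the null hypothesis is not rejected.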
References:
en.wikipedia.org/wiki/Cluster_analysis
http://www.clustan.com/what_is_cluster_analysis.html
www.ndsu.edu/pubweb/~mcclean/plsc431/mendel/mendel4.htm
BY:
Abhisek Machama