Sunday, September 16, 2012

Day 10 - Team D

Application of Descriptive Statistics and Eigenvalues

Descriptive statistics include the numbers, tables, charts, and graphs used to organize, summarize, and present raw data. Descriptive statistics are most often used to examine:

  1. Central tendency (location) of data, i.e. where data tend to fall, as measured by the mean, median, and mode.
  2. Dispersion (variability) of data, i.e. how spread out data are, as measured by the variance and its square root, the standard deviation.
  3. Skew (symmetry) of data, i.e. how concentrated data are at the low or high end of the scale, as measured by the skew index.
  4. Kurtosis (peakedness) of data, i.e. how concentrated data are around a single value, as measured by the kurtosis index.
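
The four measures above can be sketched in a few lines of Python. This is only an illustration: the `describe` helper and the `scores` sample are made up for this post, and the skew and kurtosis indices are assumed here to be the standardized third and fourth central moments (population form, dividing by n), which is one common convention among several.

```python
import statistics


def describe(data):
    """Return basic descriptive measures for a list of numbers."""
    n = len(data)
    mean = statistics.fmean(data)
    # Population (divide-by-n) variance keeps the moment formulas simple.
    variance = statistics.pvariance(data, mu=mean)
    std_dev = variance ** 0.5
    # Third and fourth central moments.
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "variance": variance,
        "std_dev": std_dev,
        "skew": m3 / std_dev ** 3,      # 0 for a symmetric distribution
        "kurtosis": m4 / std_dev ** 4,  # 3 for a normal distribution
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
summary = describe(scores)
for name, value in summary.items():
    print(f"{name:>8}: {value:.3f}")
```

A positive skew value indicates a longer tail toward high values, and a kurtosis value below 3 indicates a flatter peak than a normal curve would have.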
Descriptive statistics can (advantages):
  • be essential for arranging and displaying data
  • form the basis of rigorous data analysis
  • be much easier to work with, interpret, and discuss than raw data
  • help examine the tendencies, spread, normality, and reliability of a data set
  • be rendered both graphically and numerically
  • include useful techniques for summarizing data in visual form
  • form the basis for more advanced statistical methods
Descriptive statistics can (disadvantages):
  • be misused, misinterpreted, and incomplete
  • be of limited use when samples and populations are small
  • demand a fair amount of calculation and explanation
  • fail to fully specify the extent to which non-normal data are a problem
  • offer little information about causes and effects
  • be dangerous if not analysed completely
Any description of a data set should examine all four of the measures listed earlier: central tendency, dispersion, skew, and kurtosis. As a rule, looking only at central tendency (via the mean, median, and mode) and dispersion (via the variance or standard deviation) is not sufficient. Descriptive statistics are recommended when the objective is to describe and discuss a data set more generally and conveniently than would be possible using raw data alone. They are routinely used in reports that contain a significant amount of qualitative or quantitative data.

Descriptive statistics help summarize and support assertions of fact. Note that a thorough understanding of descriptive statistics is essential for the appropriate and effective use of all normative and cause-and-effect statistical techniques, including hypothesis testing, correlation, and regression analysis.

Unless descriptive statistics are fully grasped, data can easily be misunderstood and, thereby, misrepresented. All four moments should be explored whenever possible. Skew and kurtosis should be examined any time you deal with interval data, since they jointly help determine whether the variable underlying a frequency distribution is normally distributed. Since a normal distribution is a key assumption behind most statistical techniques, the skew and kurtosis of any interval data set must be analysed. Data that show significant variation, skew, or kurtosis should not be used in making inferences, drawing conclusions, or espousing recommendations.
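
One way to see how skew and kurtosis can "jointly" speak to normality is the Jarque-Bera statistic, which combines the two into a single number. It is not mentioned in the material above and is brought in here only as an illustration. The sketch below assumes population (divide-by-n) moments; the `jarque_bera` helper and both samples are hypothetical, and the chi-square approximation behind the p-value is rough for samples this small.

```python
import math


def jarque_bera(data):
    """Return (skew, excess_kurtosis, jb_statistic, approx_p_value)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is approximately chi-square with 2 degrees of
    # freedom, whose upper-tail probability is exp(-jb / 2).
    p_value = math.exp(-jb / 2.0)
    return skew, excess_kurtosis, jb, p_value


symmetric = [-2, -1, 0, 1, 2]          # hypothetical, balanced sample
lopsided = [1, 1, 1, 1, 2, 2, 3, 10]   # hypothetical, one large value

for label, sample in (("symmetric", symmetric), ("lopsided", lopsided)):
    skew, kurt, jb, p = jarque_bera(sample)
    print(f"{label}: skew={skew:.3f} excess_kurtosis={kurt:.3f} "
          f"JB={jb:.3f} p~{p:.3f}")
```

A large statistic (small p-value) suggests the sample departs from normality; a small one only means that skew and kurtosis give no evidence against it.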

Eigenvalues (latent values): In multivariate statistics, eigenvalues give the variance of a linear function of the variables. Eigenvalues measure the amount of variation explained by each principal component (PC); they are largest for the first PC and smaller for each subsequent PC. In standardized data every original variable has a variance of 1, so an eigenvalue greater than 1 indicates that the PC accounts for more variance than a single original variable does. This is commonly used as a cut-off point for deciding which PCs are retained (the Kaiser criterion).
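
As a small illustration of the cut-off, the sketch below works with just two standardized variables, so the correlation matrix is 2x2 and its eigenvalues have a closed form. The `heights` and `weights` samples and both helper functions are hypothetical; with more variables one would normally hand the matrix to a linear algebra library instead.

```python
import math
import statistics


def correlation(xs, ys):
    """Pearson correlation of two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of the matrix [[a, b], [b, c]], largest first."""
    centre = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return centre + radius, centre - radius


heights = [150, 160, 165, 170, 180]  # hypothetical data
weights = [52, 58, 63, 70, 77]       # hypothetical data

r = correlation(heights, weights)
# Correlation matrix of two standardized variables: [[1, r], [r, 1]].
eigenvalues = symmetric_2x2_eigenvalues(1.0, r, 1.0)
total = sum(eigenvalues)  # equals the number of variables

for i, ev in enumerate(eigenvalues, start=1):
    decision = "retain" if ev > 1.0 else "drop"
    print(f"PC{i}: eigenvalue={ev:.3f} "
          f"share={ev / total:.1%} -> {decision}")
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing each by that total gives the share of variance explained by each PC.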

One of the most important statistical applications in which the eigenvalues of the covariance matrix play a key role is Principal Component Analysis (PCA). It is a linear dimensionality reduction procedure, which can also be thought of as a model selection technique.
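
To make the dimensionality reduction concrete, here is a minimal sketch that collapses two standardized variables into a single PC score per observation. It assumes the two-variable case, where the leading eigenvector of the correlation matrix [[1, r], [r, 1]] is (1, sign(r)) / sqrt(2); the data and the `standardize` helper are hypothetical.

```python
import math
import statistics


def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


heights = [150, 160, 165, 170, 180]  # hypothetical data
weights = [52, 58, 63, 70, 77]       # hypothetical data

z1, z2 = standardize(heights), standardize(weights)
n = len(z1)

# Correlation of the two standardized variables.
r = sum(a * b for a, b in zip(z1, z2)) / n

# Project each observation onto the leading eigenvector (1, sign(r)) / sqrt(2).
sign = math.copysign(1.0, r)
pc1_scores = [(a + sign * b) / math.sqrt(2.0) for a, b in zip(z1, z2)]

# The variance of the PC1 scores equals the first eigenvalue, 1 + |r|.
pc1_variance = statistics.pvariance(pc1_scores)
print(f"r = {r:.3f}")
print(f"variance of PC1 scores = {pc1_variance:.3f}")
print(f"share of total variance kept = {pc1_variance / 2.0:.1%}")
```

Two columns are reduced to one, and the printed share shows how much of the original variation that single column still carries.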


