Monday, September 3, 2012

Day 1 - Team H


SPSS stands for Statistical Package for the Social Sciences. SPSS is among the most widely used programs for statistical analysis in the social sciences. It is used by market researchers, health researchers, survey companies, government agencies, education researchers, marketing organizations and others.
The features of the software include statistical analysis, data management and data documentation.
Statistics included in the SPSS software:
·         Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
·         Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
·         Prediction for numerical outcomes: Linear regression
·         Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant analysis
The above statistical tools provide various options for examining the relationships between variables. We started our Business Analytics learning by using some of these tools.
To perform any analysis in SPSS, the data must either be entered directly or imported into the SPSS Data Editor. The editor has two sheets: Data View and Variable View. The Data View sheet contains all the data captured against the corresponding variables (each column represents a distinct variable and each row a case), while the Variable View sheet summarizes the characteristics of the variables present in the data sheet.
The column labels of the variable view sheet are fixed, and the labels are:
·         Name: Represents name of the variable. E.g. - Age, Gender, etc.
·         Type: Contains the information about the type of variables. The various types include Numeric, Comma, Dot, Scientific Notation, Date, Dollar, Custom Currency and String (Alphanumeric).
·         Width: Represents the maximum number of characters or digits permitted for a value of the variable. The default width is the width of the first value entered in that column of the Data View sheet.
·         Decimals: Represents the number of decimal places to which values are displayed.
·         Label: Contains the detailed description of the variable.
·         Values: Variables are of two types, namely categorical and continuous. Numeric variables are further classified as continuous and discrete. For a categorical variable we assign a numeric code to each category (for example, 1 = Male, 2 = Female). This column contains all such assignments.
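·         Missing: Specifies which values are to be treated as missing for a particular variable, for example a code such as 99 entered when a respondent gave no answer. Values declared here are excluded from most calculations.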
·         Columns: Represents the visible width of the column in the Data View sheet. For example, if the width of a name variable is 5, the name entered against that variable is ABCDE, and Columns has the value 3, then the data visible to us would be ABC.
·         Align: Sets the alignment (left, right or center) of the data in the Data View sheet.
·         Measure: Shows the level of measurement of the data, i.e. scale, ordinal or nominal.
Nominal data are data whose values carry no order information, e.g. name, location, etc. Ordinal data contain information about the order of the values, i.e. how a, b, c, d, e should be ranked based on magnitude or some other characteristic, but they do not reveal the magnitude of the difference between two data points. Scale data contain both order and magnitude information, i.e. if a and b are two data points on a scale, then we know whether a > b or b > a, and by how much.
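To make the Values and Missing columns concrete, here is a minimal sketch in plain Python (not SPSS syntax). The variable `gender`, its codes, and the missing code 9 are hypothetical and chosen only for illustration; the helper `frequencies` is our own name, not an SPSS function.

```python
from collections import Counter

# Hypothetical coding for a nominal variable "gender":
# 1 = Male, 2 = Female, and 9 is declared as a missing code (no response).
VALUE_LABELS = {1: "Male", 2: "Female"}
MISSING_CODES = {9}

def frequencies(codes, value_labels, missing_codes):
    """Count valid responses per label and the number of missing responses."""
    valid = [c for c in codes if c not in missing_codes]
    counts = Counter(value_labels[c] for c in valid)
    n_missing = len(codes) - len(valid)
    return dict(counts), n_missing

# Eight made-up responses, two of which are missing.
gender = [1, 2, 2, 9, 1, 2, 9, 2]
counts, n_missing = frequencies(gender, VALUE_LABELS, MISSING_CODES)
print(counts)
print("missing:", n_missing)
```

The numeric codes are only labels here: since gender is nominal, the fact that 2 > 1 carries no meaning, which is why the sketch counts categories rather than averaging the codes.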
To start any research project we need to develop hypotheses to be verified. Hypotheses are of two types, namely:
a) Null Hypothesis: Asserts that no relationship exists between the variables.
b) Alternate Hypothesis: Asserts that a relationship exists between the variables.
The second step is to test the established hypotheses against the available data. We tested some of the hypotheses developed in class using SPSS tools.
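As an illustration of testing a null hypothesis of "no relationship" between two categorical variables, the sketch below computes a Pearson chi-square statistic for a 2x2 table in plain Python. The counts are invented, the 0.05 significance level is an assumption, and no continuity correction is applied, so the numbers are not meant to match any particular SPSS output.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Made-up counts: rows are two groups, columns are two response categories.
table = [[30, 10], [20, 40]]
statistic, p_value = chi_square_2x2(table)
ALPHA = 0.05  # assumed significance level
reject_null = p_value < ALPHA
print(f"chi-square = {statistic:.3f}, p = {p_value:.5f}, reject null: {reject_null}")
```

A small p-value leads us to reject the null hypothesis in favour of the alternate hypothesis; a large one means the data give no evidence of a relationship.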
The various tools used in the last class were:
·         Descriptive statistics: Frequencies, Crosstabs (Chi-square, percentage representation of data)
·         Transformation: Recodes a continuous variable into a categorical variable, for example by grouping ages into age bands.
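The two tools above can be sketched together in plain Python: a continuous variable is first recoded into categories, and the result is then cross-tabulated against another categorical variable with row percentages. The age bands, the sample data and the function names are all hypothetical, chosen only to show the idea.

```python
from collections import Counter

def recode_age(age):
    """Map an age in years to a hypothetical age band."""
    if age < 30:
        return "Under 30"
    if age < 50:
        return "30-49"
    return "50 and over"

def crosstab(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(rows, cols))

# Made-up sample of eight respondents.
ages = [23, 35, 41, 58, 29, 62, 47, 19]
gender = ["M", "F", "F", "M", "F", "M", "M", "F"]

age_band = [recode_age(a) for a in ages]   # transformation step
table = crosstab(age_band, gender)         # crosstab step
band_totals = Counter(age_band)

for (band, g), count in sorted(table.items()):
    row_pct = 100 * count / band_totals[band]
    print(f"{band:12} {g}  n={count}  row%={row_pct:.1f}")
```

Recoding loses detail (a 23-year-old and a 29-year-old become indistinguishable), but it makes a crosstab possible where the raw ages would give one row per distinct value.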
We shall continue exploring SPSS in further classes and keep on enriching this blog space.

Authors:
Nishant Lal
Manish Kumar Lath
