The lecture began with an introduction to SPSS. SPSS, the Statistical
Package for the Social Sciences, was developed by Norman H. Nie and C. Hadlai
Hull and is a popular analytics tool used by market researchers, health
researchers, survey companies, government and education researchers. For
classroom practice, we use the Windows evaluation version 15.0 of SPSS.
The software supports descriptive statistics, bivariate statistics, prediction
of numerical outcomes and prediction for identifying groups.
SPSS contains a data view (where we enter the data) and a variable view
(where we name and define the variables). In the data view, each column is a
variable and each row of entries is called a ‘case’. The ‘Type’ column in the
variable view supports many types. Popular among them are ‘comma’
(e.g. Rs. 1,00,00.00) and ‘dot’ (e.g. Rs. 1.00.00,00), the latter being widely
used in European countries. A string is a group of alphanumeric characters. We
cannot change a string variable to numeric just by clicking; we have to recode
it instead.
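The idea of recoding a string variable into numeric codes can be sketched
outside SPSS in Python. This is only an illustration, not SPSS syntax; the
responses and the coding scheme below are made up.

```python
# Hypothetical string responses, as they might be typed into a string variable.
responses = ["male", "female", "female", "male", "female"]

# Assumed coding scheme: each distinct string gets a numeric code.
codes = {"male": 1, "female": 2}

# Recode: replace every string with its numeric code.
recoded = [codes[r] for r in responses]

# Value labels map the numeric codes back to readable text for the output.
value_labels = {code: text for text, code in codes.items()}

print(recoded)
print([value_labels[c] for c in recoded])
```

Once the values are numeric codes, the variable can be treated as a category
variable in the analysis while the labels keep the output readable.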
Labels describe the variable names. Labels are used in the output, so by
reading the labels we can tell what each variable means.
Variables with defined values are called categorical variables and are used
for first-level analysis. Continuous and discrete variables are used for
second-level analysis. Most of the methods require continuous variables.
If a respondent does not provide the data, it is useful to analyse afterwards
why he/she did not provide it. Knowing the reason tells us whether we need to
rephrase the question or whether it should not have been asked in the first
place.
Three types of measures are used: nominal, ordinal and scale. Nominal numbers
serve only as names and carry no intrinsic information. Ordinal and scale
numbers do carry information. Ordinal numbers are ordered according to a
certain pattern, and scale numbers (which are further classified into interval
and ratio) also tell you by how much the values differ.
Analysis of the data is classified into univariate, bivariate and
multivariate analysis, and each has its corresponding graphs. For example,
bubble graphs and radial graphs are used for multivariate analysis.
Frequency and cumulative frequency analysis is used to categorize the values
of a variable for further analysis. For example, if the cumulative frequency
of ‘age at first marriage’ shows that 50% of the respondents married at 21
years or younger, then for analysis purposes we can define ‘early first
marriage’ as age less than or equal to 21 and ‘late first marriage’ as age
greater than 21.
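The cut-off logic in this example can be sketched in Python. The ages below
are invented for illustration, and the 50% threshold is the one used in the
example above; SPSS produces the same cumulative percent column in its
frequency table.

```python
from collections import Counter

# Made-up ages at first marriage for ten respondents (illustration only).
ages = [18, 19, 20, 21, 21, 23, 24, 25, 27, 30]

counts = Counter(ages)
total = len(ages)

# Build the cumulative percentage table, one entry per distinct age.
cumulative = 0
cum_percent = {}
for age in sorted(counts):
    cumulative += counts[age]
    cum_percent[age] = 100 * cumulative / total

# The cut-off is the smallest age at which at least 50% of cases are covered.
cutoff = min(age for age, pct in cum_percent.items() if pct >= 50)

# Recode each age into one of the two categories.
groups = ["early" if age <= cutoff else "late" for age in ages]

print(cutoff, groups)
```

With this data the cut-off falls at 21, so the continuous age variable becomes
a two-category variable that can be used in a crosstab.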
To find whether two variables are related, we first set the null hypothesis
that there is no relation between them. We then examine the null hypothesis
with a crosstab, placing the variable being compared in the ‘row’ section.
Then we select the row and column percentages to see whether a relation
exists.
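As a rough sketch of what the crosstab with row percentages computes, here is
a small Python version. The cases and the variable names (gender, marriage
group) are made up for illustration.

```python
from collections import Counter

# Made-up cases: (gender, marriage group) pairs, for illustration only.
cases = [
    ("male", "early"), ("male", "late"), ("male", "late"), ("male", "late"),
    ("female", "early"), ("female", "early"), ("female", "early"), ("female", "late"),
]

# Count each (row, column) combination to form the crosstab cells.
cells = Counter(cases)
rows = sorted({r for r, _ in cases})
cols = sorted({c for _, c in cases})

# Row percentages: each cell as a share of its row total.
row_pct = {}
for r in rows:
    row_total = sum(cells[(r, c)] for c in cols)
    for c in cols:
        row_pct[(r, c)] = 100 * cells[(r, c)] / row_total

for r in rows:
    print(r, {c: row_pct[(r, c)] for c in cols})
```

If the row percentages differ noticeably between rows, as they do in this
invented data, that suggests a relation worth testing formally.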
For the chi-square test on the crosstab, we always test at some confidence
level. The confidence level depends on how critical the scenario is. For
routine business decisions, we take the confidence level to be 95% (for more
critical scenarios, a 99% confidence level is a must). That is, at the 95%
level, if the significance value is less than 0.05, there is a significant
relation between the two variables under examination; in this case, we reject
the null hypothesis and accept the alternate hypothesis.
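The decision rule can be illustrated with a minimal Python sketch of the
Pearson chi-square statistic for a 2x2 crosstab. The counts are made up. SPSS
reports the significance value directly; this sketch instead compares the
statistic with the standard table critical values for one degree of freedom,
which leads to the same decision.

```python
# Observed 2x2 crosstab counts (made-up numbers, illustration only).
# Rows: two groups of one variable; columns: two groups of the other.
observed = [[30, 20],
            [15, 35]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total under the null.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical values of chi-square for df = 1 (standard table values).
CRITICAL_95 = 3.841  # significance level 0.05
CRITICAL_99 = 6.635  # significance level 0.01

reject_at_95 = chi_square > CRITICAL_95
reject_at_99 = chi_square > CRITICAL_99

print(round(chi_square, 3), df, reject_at_95, reject_at_99)
```

A statistic above the critical value corresponds to a significance value
below 0.05 (or 0.01), so the null hypothesis of no relation is rejected at
that level.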
By
Unmesh Ramesh Kulkarni
Parveen Rathee