Monday, September 3, 2012

Day 1 - Team D




Business Analytics (BA) is the practice of iterative, methodical exploration of an organization’s data with emphasis on statistical analysis.  Business analytics is used by companies committed to data-driven decision making. 

Here are a few examples of how BA is used:
·         Exploring data to find new patterns and relationships (data mining)
·         Explaining why a certain result occurred (statistical analysis, quantitative analysis)
·         Experimenting to test previous decisions (A/B testing, multivariate testing)
·         Forecasting future results (predictive modelling, predictive analytics)

Now we need to know some basics and techniques to get a better grip on the subject. We use a software package called SPSS (Statistical Package for the Social Sciences) to simplify statistical and quantitative analysis. It also supports analysis of qualitative text (such as survey responses to open-ended questions): it converts unstructured data into structured data and finds hidden patterns, sentiments and so on.
To use the software efficiently, we will first cover some basics.

Variable:
We use variables to hold the values of our collected data. There are primarily two kinds of variables.
1.      Categorical variables:
They generally take a finite number of values.
Ex: If we take gender as a variable and code 1 for male and 2 for female, the variable can take only two values. So it is a categorical variable.

2.      Continuous variables:
They can generally take an unlimited number of values.
Ex: If we take age as a variable, it can in principle take any value within a range. So it is a continuous variable.

Continuous variables are further divided into two types.
                               I.            Continuous: values can be fractions.
                            II.            Discrete: values are only integers.
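
To make the distinction concrete, here is a small Python sketch (Python is used here only for illustration; it is not SPSS syntax). The responses are invented, and the names `GENDER_LABELS` and `frequency_table` are hypothetical helpers, not part of any package. Only the coding 1 = male, 2 = female comes from the example above.

```python
from collections import Counter

# Value labels for the coded categorical variable (1 = male, 2 = female).
GENDER_LABELS = {1: "male", 2: "female"}

# Hypothetical responses, invented for this sketch.
gender_codes = [1, 2, 2, 1, 2]            # categorical: only two possible values
ages = [23.5, 31.0, 27.25, 45.0, 38.5]    # continuous: fractions are allowed
children = [0, 2, 1, 3, 0]                # discrete: integers only

def frequency_table(codes, labels):
    """Count how often each category occurs, reported with readable labels."""
    counts = Counter(codes)
    return {labels[code]: counts.get(code, 0) for code in labels}

# A categorical variable is summarised by counting its categories.
gender_counts = frequency_table(gender_codes, GENDER_LABELS)

# A continuous variable can be summarised by arithmetic, such as a mean.
mean_age = sum(ages) / len(ages)

print(gender_counts)
print(mean_age)
```

Averaging the gender codes would give a number, but it would not mean anything, which is why the kind of variable matters before any analysis starts.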

Scale:
There are mainly three kinds of scales for measurement.
1.      Nominal:
A variable can be treated as nominal when its values represent categories with no intrinsic ranking; for example, the department of the company in which an employee works. Examples of nominal variables include region, zip code, or religious affiliation.
2.      Ordinal:
A variable can be treated as ordinal when its values represent categories with some intrinsic ranking; for example, levels of service satisfaction from highly dissatisfied to highly satisfied. Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores.
3.      Scale:
A variable can be treated as scale when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.
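
One practical consequence of the three scales is which summary statistic makes sense. The sketch below uses Python's standard `statistics` module on invented data; the pairing of mode with nominal, median with ordinal and mean with scale is a common rule of thumb rather than something SPSS enforces.

```python
import statistics

# Hypothetical data, invented for this sketch.
department = ["sales", "hr", "sales", "it", "sales"]   # nominal: no ranking
satisfaction = [1, 3, 4, 4, 5]   # ordinal: 1 = highly dissatisfied ... 5 = highly satisfied
age_years = [25, 32, 41, 29, 38]                       # scale: distances are meaningful

# Nominal: only counting is meaningful, so report the most frequent category.
most_common_department = statistics.mode(department)

# Ordinal: the order is meaningful, so the middle value (median) can be reported.
median_satisfaction = statistics.median(satisfaction)

# Scale: differences are meaningful, so the arithmetic mean can be reported.
mean_age = statistics.mean(age_years)

print(most_common_department, median_satisfaction, mean_age)
```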

Next, we move on to cross tabulation of the collected data. A cross tabulation counts the cases for each combination of values of two different variables.
Let us take an example to understand the process clearly.
Suppose we choose two variables from a table: age at first marriage and sex of the respondent. We cross tabulate these two variables and use the result to test our hypothesis.

Hypothesis
H1: Females first marry at an earlier age than males
H0: There is no relation between the respondent’s sex and the age at first marriage
The cross tabulation gives us the data we need to test our hypothesis.
Let us assume some imaginary data for argument’s sake.
33.7% (166 out of 492) of the males first married at the age of 21, whereas 59.3% (421 out of 710) of the females first married at the age of 21.
So the data suggest a relation between the respondent’s sex and the age at first marriage.
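
The percentages above can be reproduced with a few lines of Python. This is only a sketch of what a cross tabulation with row percentages does. The "other" counts are derived here as the row total minus the quoted count (492 - 166 and 710 - 421), and `row_percentages` is a hypothetical helper name.

```python
# Imaginary counts from the example: respondents who first married at age 21
# versus all other respondents, split by sex.
crosstab = {
    "male": {"married_at_21": 166, "other": 492 - 166},
    "female": {"married_at_21": 421, "other": 710 - 421},
}

def row_percentages(table):
    """Return each cell as a percentage of its row total, rounded to 1 decimal."""
    result = {}
    for row_label, cells in table.items():
        row_total = sum(cells.values())
        result[row_label] = {
            col_label: round(100 * count / row_total, 1)
            for col_label, count in cells.items()
        }
    return result

percentages = row_percentages(crosstab)
for sex, cells in percentages.items():
    print(sex, cells)
```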
To check whether this difference is statistically significant, we use a test of independence: the Chi-Square test.

Chi-Square test
The Chi-Square test compares the counts observed in a cross tabulation with the counts we would expect if the two variables were unrelated. The same statistic is also used as a Goodness of Fit test, for example in genetic experiments to decide whether the data fit a Mendelian ratio.

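According to the Chi-Square test, if the significance value (p-value) is lower than .05, the difference between the groups is statistically significant, so we reject the null hypothesis. The test itself only tells us that sex and age at first marriage are related; the direction of the relation comes from the percentages in the cross tabulation. Taken together, they support our hypothesis: females first marry at an earlier age than males.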
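
For readers who want to see what happens behind the significance value, here is a minimal Python sketch of the Pearson Chi-Square test on the imaginary counts from the example. It assumes a 2x2 table (1 degree of freedom) and applies no continuity correction, so the figure may differ slightly from a corrected value reported by a statistics package. The function names are hypothetical.

```python
import math

# Observed counts from the imaginary example: rows are male, female;
# columns are "first married at 21" and "other" (row total minus first column).
observed = [
    [166, 492 - 166],
    [421, 710 - 421],
]

def chi_square_independence(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic

def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

chi2 = chi_square_independence(observed)
p_value = p_value_df1(chi2)
reject_null = p_value < 0.05

print(f"chi-square = {chi2:.2f}, p-value = {p_value:.3g}, reject H0: {reject_null}")
```

With these counts the statistic is far above 3.84, the cut-off for the .05 level with 1 degree of freedom, so the null hypothesis is rejected.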
                                                                                                           By:
                                                                                                         -Sreeparna Mondal
                                                                                                           -Vinod Joshi
