Monday, September 3, 2012

Day 1 - Team G

Business analytics (BA) is the practice of iterative, methodical exploration of an organization’s data with emphasis on statistical analysis.  Business analytics is used by companies committed to data-driven decision making. 


Examples of BA uses include: 
  • Exploring data to find new patterns and relationships (data mining)
  • Explaining why a certain result occurred (statistical analysis, quantitative analysis)
  • Experimenting to test previous decisions (A/B testing, multivariate testing)
  • Forecasting future results (predictive modeling, predictive analytics)

SPSS is one of the primary tools used for data analysis. It is proprietary software, now developed by IBM, and its name originally stood for Statistical Package for the Social Sciences.

Some of the basic views in SPSS are:

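1. Data View - Here the data is shown in its raw format, just the way it was gathered: each row is a case (for example, one respondent) and each column is a variable. Data can be typed in directly or imported from various sources, including Excel. Browsing the Data View gives a first impression of the range of values a variable takes; the number of responses in each category is obtained from a frequency table.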

2. Variable View - The Variable View is the second tab in the Data Editor window, next to the Data View tab described above. You can switch between the two tabs at the bottom left corner of the Data Editor window. This tab does not show raw data but rather shows information about the variables included in the data set. After examining the Data View, it may seem a little counter-intuitive to look at the Variable View, because the rows now show variables, not cases.

There are 10 columns in total. Each column and what it means for a variable is listed below:

Column - What it Means
Name - The name of the variable. Older versions of SPSS were limited to 8-character names. Newer versions are not, but lengthy descriptions should still not go in the Name. They go in the Label column.
Type - The type of the variable in this row. There are 8 options to choose from: Numeric, Comma, Dot, Scientific notation, Date, Dollar, Custom currency, and String. Most variables that beginning users encounter are either Numeric or String. String variables are text and can only be treated as such, so very few manipulations can be performed on them in SPSS.
Width - The number of characters available for the variable's values.
Decimals - The number of digits displayed after the decimal point.
Label - A more extensive description of the variable.
Values - A key for what the numbers of a numeric variable represent (e.g., 1=Catholic, 2=Protestant).
Missing - The values that should be treated as missing for this variable (for example, a code used for "no answer"). Values marked as missing are excluded from analyses in SPSS.
Columns - The display width of the variable's column in the Data View.
Align - The alignment of the variable's values in the Data View.
Measure - The level of measurement of the variable. There are three to choose from: Nominal, Ordinal, and Scale.
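
To make the Variable View columns more concrete, here is a minimal Python sketch that models one row of the Variable View as a plain record. This is only an illustration and not how SPSS stores anything: the class `VariableInfo`, its `display` method, the default values, the variable `relig` and the missing code 9 are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class VariableInfo:
    """One row of the Variable View, modelled as a plain record (illustrative only)."""
    name: str                                    # short variable name
    type: str = "Numeric"                        # e.g. "Numeric" or "String"
    width: int = 8                               # characters available for values
    decimals: int = 2                            # digits shown after the decimal point
    label: str = ""                              # longer description
    values: dict = field(default_factory=dict)   # code -> value label
    missing: set = field(default_factory=set)    # codes treated as missing
    columns: int = 8                             # display width in the Data View
    align: str = "Right"                         # alignment in the Data View
    measure: str = "Scale"                       # "Nominal", "Ordinal" or "Scale"

    def display(self, code):
        """Return the value label for a code, or None if the code is marked missing."""
        if code in self.missing:
            return None
        return self.values.get(code, str(code))


# Hypothetical variable: religion coded 1/2, with 9 meaning "no answer".
religion = VariableInfo(
    name="relig",
    decimals=0,
    label="Respondent's religion",
    values={1: "Catholic", 2: "Protestant"},
    missing={9},
    measure="Nominal",
)

responses = [1, 2, 2, 9, 1]
labelled = [religion.display(code) for code in responses]
print(labelled)  # ['Catholic', 'Protestant', 'Protestant', None, 'Catholic']
```

The point of the sketch is that the Values and Missing columns change how a raw code is read, without changing the raw data itself.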

We also learnt about the types of scales used in the measurement of these variables. There are primarily 3 types of scales used:

1. Nominal - The values are only designations (labels) for categories. There is no order or magnitude in a nominal scale. Data collected at the nominal level are sometimes called qualitative data and are sometimes treated as having nothing in common with quantitative data.

2. Ordinal - Rank-ordering data simply puts the data on an ordinal scale. Ordinal measurements describe order, but not relative size or degree of difference between the items measured. In this scale type, the numbers assigned to objects or events represent the rank order (1st, 2nd, 3rd, etc.) of the entities assessed.

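3. Scale - In SPSS, Scale covers interval and ratio data: numeric values for which the differences between values are meaningful. For ratio data, measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind, so ratios between values are meaningful as well.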
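
One practical consequence of the three levels is which summary statistic makes sense. The Python sketch below picks a measure of central tendency by level: mode for Nominal, median for Ordinal, mean for Scale. The function `central_tendency` and all the sample data are made up for illustration, and the mapping is a common rule of thumb rather than something SPSS enforces.

```python
import statistics

# Hypothetical data, one list per measurement level.
blood_group = ["A", "O", "B", "O", "AB", "O"]   # Nominal: labels only
satisfaction = [1, 2, 2, 3, 5, 4, 2]            # Ordinal: 1 = lowest rank, 5 = highest
height_cm = [150.0, 160.0, 170.0, 180.0]        # Scale: differences are meaningful


def central_tendency(values, measure):
    """Return a summary that is meaningful for the given level of measurement."""
    if measure == "Nominal":
        return statistics.mode(values)          # most frequent category
    if measure == "Ordinal":
        return statistics.median_low(values)    # middle rank, always an observed value
    if measure == "Scale":
        return statistics.mean(values)          # arithmetic mean
    raise ValueError(f"unknown level of measurement: {measure}")


print(central_tendency(blood_group, "Nominal"))   # O
print(central_tendency(satisfaction, "Ordinal"))  # 2
print(central_tendency(height_cm, "Scale"))       # 165.0
```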


We also learnt how to produce the output of various types of analysis in SPSS. The first ones we used were:

1. Frequency tables

To do this, first go to Analyze > Descriptive Statistics > Frequencies. On the following screen we select the variables we want to include. The output window then shows one table per selected variable, listing each value of that variable with how often it occurs and the corresponding percentage.
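
A frequency table simply counts how often each value of a single variable occurs. The Python sketch below shows the idea outside SPSS. The function `frequency_table` and the `gender` data are hypothetical, and the output is much simpler than what SPSS prints.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows for one variable, sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    for value in sorted(counts):
        percent = round(100 * counts[value] / total, 1)
        rows.append((value, counts[value], percent))
    return rows


# Hypothetical responses for a single variable.
gender = ["F", "M", "F", "F", "M"]
table = frequency_table(gender)

for value, count, percent in table:
    print(value, count, percent)
# F 3 60.0
# M 2 40.0
```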

2. Crosstabs

Crosstabs is an SPSS procedure that cross-tabulates two variables, thus displaying their relationship in tabular form. In contrast to Frequencies, which summarizes information about one variable, Crosstabs generates information about bivariate relationships. Crosstabs are usually presented with the independent variable across the top and the dependent along the side. 
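
The same idea can be sketched in Python: count how often each pair of values occurs, then lay the counts out with one variable along the side and the other across the top. The function `crosstab` and the `gender` and `bought` data are made up for this example; SPSS also reports totals and percentages, which are left out here.

```python
from collections import Counter


def crosstab(row_values, col_values):
    """Count co-occurrences of two variables as a nested dict: table[row][col]."""
    counts = Counter(zip(row_values, col_values))
    row_keys = sorted(set(row_values))
    col_keys = sorted(set(col_values))
    # Counter returns 0 for pairs that never occur, so every cell is filled.
    return {r: {c: counts[(r, c)] for c in col_keys} for r in row_keys}


# Hypothetical data: gender is the independent variable (across the top),
# bought is the dependent variable (along the side).
gender = ["F", "M", "F", "F", "M", "M"]
bought = ["yes", "no", "yes", "no", "no", "yes"]

table = crosstab(bought, gender)
print(table)
# {'no': {'F': 1, 'M': 2}, 'yes': {'F': 2, 'M': 1}}
```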

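We can also use recoding to make the data easier to manipulate. For this we go to Transform and choose Recode into Same Variables or Recode into Different Variables, where we specify the old values and the new values they should be replaced with. This is particularly useful when generating crosstabs, for example to group a variable with many distinct values into a few categories.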
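
Recoding boils down to a mapping from old values to new values. As a rough Python sketch, the example below groups ages into three bands, the kind of recode that makes a crosstab readable. The function `recode`, the band limits and the `age` data are assumptions chosen for illustration.

```python
def recode(values, bands, default=None):
    """Map each old value to a new one using (low, high, new_value) ranges."""
    recoded = []
    for old in values:
        new = default                      # used when no range matches
        for low, high, new_value in bands:
            if low <= old <= high:
                new = new_value
                break
        recoded.append(new)
    return recoded


# Hypothetical recode: 0-29 -> 1, 30-59 -> 2, 60-120 -> 3.
age = [19, 34, 52, 67, 25]
age_bands = [(0, 29, 1), (30, 59, 2), (60, 120, 3)]

age_group = recode(age, age_bands)
print(age_group)  # [1, 2, 2, 3, 1]
```

Writing the result to a new list mirrors Recode into Different Variables: the original `age` values stay untouched.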

Team G,
Vivek Bakshi
Malovika Roy
