Monday, September 10, 2012

Day 4 Team C (Rahat Dhir)


Team C – Day 4
We began our sessions by understanding the importance and power of first- and second-degree analysis using frequencies and crosstabs. Both tools let us mine the available data and test our assumptions. The underlying objective is to create a story and support its claims with the first-degree tools available to us, building it up block by block. This lets us think beyond the obvious: hidden challenges, correlations and so on are seen in an entirely new light.
We personally feel that today’s lecture is highly significant, since the techniques can be applied in unorthodox fields such as astrology, astronomy and defence to forecast what will happen. Even the occurrence of errors and failures can be predicted, traced and hence removed.
The crux of today’s BA sessions is that there are limitless boundaries to explore around us. They showed us how ordinary-looking survey data may hide insights that can change a business and its fortunes. We now describe the sessions in chronological order.
We started today with a brief introduction to hierarchical clustering and then moved on to K-means, which requires interval variables as input. We took the Cell_Inter file and started analysing it. First we needed to set an objective, so we chose revenue and features provided as the two premises to explore over the 2 sessions.
Analyse -> Classify -> K-means cluster analysis
We then took the 5 scale variables to be explored and raised the number of clusters from 3 to 5 until the clusters obtained were significant enough to be treated as distinct clusters. Outliers were then identified via:
Graphs -> Legacy dialogs -> Boxplot


This detected 39 outliers, which were then removed from the clusters. The dark horizontal line inside each box of the graph marks the median of the values.
The sample was then clustered again into 3 clusters, after which a profile of each of the 3 clusters was formed using first-level analysis, i.e. frequencies.
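To make the boxplot step concrete, here is a minimal Python sketch of the usual boxplot rule, which flags values lying more than 1.5 times the interquartile range beyond the quartiles. The revenue figures are hypothetical and are not taken from the Cell_Inter file, and the quartile method used here may differ slightly from the hinges SPSS computes.

```python
import statistics

def boxplot_outliers(values, whisker=1.5):
    """Return the values lying beyond whisker * IQR from the quartiles."""
    # The middle cut point is the median (the dark line inside the box).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical revenue figures, for illustration only.
revenue = [120, 135, 150, 160, 170, 180, 190, 210, 950]
print(boxplot_outliers(revenue))
```

Cases flagged this way would be dropped before re-running the clustering.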
Analyse -> Classify -> K-means -> Save -> Cluster membership
Followed by:
Data -> Split File -> Compare groups
This effectively gives us 3 virtual internal files, one per cluster, to compare and analyse further.
Next we analysed our 2nd objective, features provided. Here we used:
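The same flow (cluster, save the membership, then profile each group with frequencies) can be sketched outside SPSS. The Python below is only an illustration under stated assumptions: the cases and the `plan` variable are hypothetical, the first k rows seed the centres (SPSS chooses its initial centres differently), and the variables are not standardised.

```python
import math
from collections import Counter, defaultdict

def kmeans(rows, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centres, membership)."""
    centres = [list(r) for r in rows[:k]]  # assumption: first k rows seed the centres
    membership = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each case joins its nearest centre.
        membership = [
            min(range(k), key=lambda c: math.dist(row, centres[c]))
            for row in rows
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [row for row, m in zip(rows, membership) if m == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres, membership

# Hypothetical cases: (revenue, minutes, plan); not from the Cell_Inter file.
cases = [
    (100, 20, "prepaid"), (500, 200, "postpaid"), (900, 400, "postpaid"),
    (110, 25, "prepaid"), (520, 210, "prepaid"), (880, 390, "postpaid"),
    (95, 18, "prepaid"), (510, 190, "postpaid"),
]
centres, membership = kmeans([(rev, mins) for rev, mins, _ in cases], k=3)

# "Compare groups": one frequency table of plan per cluster.
profile = defaultdict(Counter)
for (_, _, plan), cluster in zip(cases, membership):
    profile[cluster][plan] += 1

for cluster in sorted(profile):
    print(cluster, dict(profile[cluster]))
```

Each per-cluster `Counter` plays the role of one of the virtual files: the same frequency table, computed separately for every cluster.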
Analyse -> Classify -> K-means cluster analysis
We selected funuse 0-9 and 3 clusters, and all the resulting clusters were significant. We classified them as a normal category (SMS, alarm, scheduler, etc.), an everything category and a nothing category, and then formed a story about people’s preferences and the reasons behind them, backed by first-level analysis. We thus finished our sessions with yet another dimension of how to explore and answer the questions hidden in data sheets.

Rahat S. Dhir
