Thursday, September 13, 2012

Day 6 - Team H


PERMAP
The fundamental purpose of Permap is to uncover hidden structure that might be residing in a complex data set. Compared to other data mining and data analysis techniques, multidimensional scaling (MDS) is growing increasingly popular because its mathematical basis is easier to understand and its results are easier to interpret (Fitzgerald & Hubert, 1987).
Permap is an interactive computer program. It offers both metric (ratio and interval) and nonmetric (ordinal, ratio + bounds, interval + bounds) MDS techniques. It solves problems in up to eight-dimensional space and allows boundary conditions to be imposed on the solution. In the technical jargon, Permap treats "weighted, incomplete, one-mode, two-way" or "weighted, incomplete, two-mode, two-way" data sets. Other jargon would say it handles weighted, symmetric, incomplete, triangular or rectangular data sets. The word “weighted” means each data point can have its own multiplier that reflects in some way the importance or reliability of the point. The word “symmetric” means that Permap assumes that the (i, j) proximity value equals the (j, i) proximity value, and “incomplete” means that it can handle missing data. The one-mode, two-way and triangular (symmetric square) references indicate that Permap can analyze a matrix of proximity information between several objects, and the two-mode, two-way and rectangular references mean that it can analyze objects each of which is specified by an array of attributes. Permap can treat up to 1000 objects at a time (but see cautions in Section 11) and each object can have up to 100 attributes. It is easy to use, Windows PC-based, visually oriented, and allows real-time interaction with the analysis. It has been designed to have an intuitive interface, and it avoids many of the arcane alternatives that are seen in the research literature but are never used in practice.
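
To make the terms “weighted”, “symmetric”, and “incomplete” concrete, here is a minimal Python sketch of a generic weighted raw stress calculation. This is a textbook-style badness-of-fit measure and is not claimed to be Permap's own objective function. The function name and the use of None for a missing value are assumptions made for this illustration only.

```python
import math

def weighted_raw_stress(coords, dissim, weights=None):
    """Sum of w_ij * (d_ij - delta_ij)^2 over all pairs with j < i.

    coords  : list of points (tuples), one per object
    dissim  : lower-triangular list of rows; dissim[i][j] is the target
              dissimilarity for j < i, or None if the value is missing
    weights : optional lower-triangular multipliers (default 1.0 each)
    """
    total = 0.0
    for i in range(len(coords)):
        for j in range(i):          # "symmetric": each pair is visited once
            delta = dissim[i][j]
            if delta is None:       # "incomplete": skip missing data
                continue
            w = 1.0 if weights is None else weights[i][j]  # "weighted"
            d = math.dist(coords[i], coords[j])  # distance on the map
            total += w * (d - delta) ** 2
    return total

# Three objects on a line in 2-D; one dissimilarity is missing.
coords = [(0, 0), (3, 4), (6, 8)]
dissim = [[0], [4, 0], [None, 7, 0]]
print(weighted_raw_stress(coords, dissim))
```

An MDS program moves the points in coords around until a measure of this kind is as small as it can be. A pair with a small weight contributes little to the total, and a missing pair contributes nothing.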
The following is an example of a working data file. To be readable by Permap, this file must be stored as plain, unformatted text. Therefore, if you want to actually run this data, copy and paste it into a word processor and then save the file in ASCII or ANSI format. Be sure that you do not introduce any "strange" characters into the text file (sometimes WordPerfect will add an invisible termination character at the end of a file, and this can cause trouble). All lines that do not start with a keyword or a number are comment lines. Comments are disregarded by Permap. This example uses data from Kaufman and Rousseeuw's book "Finding Groups in Data". The data give the subjective dissimilarities between eleven sciences as seen by fourteen postgraduate economics students from several different countries.

MESSAGE=Differences Between the Sciences
NOBJECTS=11
DISSIMILARITYLIST
Astr,  0
Biol,  7.87,  0
Chem,  6.50,  2.93,  0
CSci,  5.00,  6.86,  6.50,  0
Econ,  8.00,  8.14,  8.21,  4.79,  0
Geog,  4.29,  7.00,  7.64,  7.71,  5.93,  0
Hist,  8.07,  8.14,  8.71,  8.57,  5.86,  3.86,  0
Math,  3.64,  7.14,  4.43,  1.43,  3.57,  7.07,  9.07,  0
Medi,  8.21,  2.50,  2.93,  6.36,  8.43,  7.86,  8.43,  6.29,  0
Phys,  2.71,  5.21,  4.57,  4.21,  8.36,  7.29,  8.64,  2.21,  5.07,  0
Psyc,  9.36,  5.57,  7.29,  7.21,  6.86,  8.29,  7.64,  8.71,  3.79,  8.64,  0

The data can be separated with space(s), a comma, or both.
If there are missing data, they should be entered as "NA" or "na".
The leading names shown above are optional, but if used they must start with a letter, must not start with NA or na, and must not contain non-alphanumeric characters.
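
The format rules above can be checked outside Permap with a short script. The following Python sketch reads a DISSIMILARITYLIST block and expands the lower triangle into a full symmetric matrix. It is an independent illustration, not Permap's own reader. The function name, the "Obj1"-style default names, and the assumption that the NOBJECTS rows directly follow the DISSIMILARITYLIST keyword are all hypothetical choices made for this example.

```python
import re

MISSING = {"NA", "na"}

def parse_dissimilarity_list(text):
    """Return (names, matrix) from a Permap-style DISSIMILARITYLIST block.

    matrix is a full symmetric list of lists; missing entries are None.
    """
    names, rows = [], []
    n_objects = None
    in_list = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.upper().startswith("NOBJECTS="):
            n_objects = int(line.split("=", 1)[1])
            continue
        if line.upper() == "DISSIMILARITYLIST":
            in_list = True
            continue
        if not in_list or (n_objects is not None and len(rows) >= n_objects):
            continue  # other keywords, comments, or text after the data
        # Values may be separated by space(s), a comma, or both.
        tokens = [t for t in re.split(r"[,\s]+", line) if t]
        # Optional leading name: starts with a letter, not with NA or na.
        if tokens[0][0].isalpha() and not tokens[0].startswith(("NA", "na")):
            name = tokens.pop(0)
        else:
            name = f"Obj{len(rows) + 1}"  # hypothetical default label
        values = [None if t in MISSING else float(t) for t in tokens]
        if len(values) != len(rows) + 1:
            raise ValueError(f"row {len(rows) + 1} should have {len(rows) + 1} values")
        names.append(name)
        rows.append(values)
    if n_objects is not None and len(rows) != n_objects:
        raise ValueError("number of rows does not match NOBJECTS")
    n = len(rows)
    full = [[None] * n for _ in range(n)]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            full[i][j] = value
            full[j][i] = value  # symmetry: (i, j) equals (j, i)
    return names, full
```

Calling parse_dissimilarity_list on the text of the example file would give the eleven names and an 11 x 11 matrix in which, for instance, the Biol-Astr entry and the Astr-Biol entry are both 7.87.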

Author
Kuldeep Chordia
