Different Types of Data
Generally speaking, data can be classified as qualitative or quantitative, though the distinction is somewhat illusory (qualitative data can be represented numerically, and vice versa). Qualitative data contains categorical variables and quantitative data contains numerical variables. Categorical variables come in nominal or ordinal flavours, whereas numerical variables can be discrete or continuous. The type of data largely determines which statistical tests can be applied to it, and therefore the level of sophistication one can achieve with the analysis.
This chapter addresses part of Section A(c) of the Primary Syllabus: "Describe the different types of data". It has no corresponding topic among the Fellowship exam revision chapters. Among the Primary papers, it is represented only by Viva 1 from the second paper of 2007 and Question 17 from the second paper of 2015. "Any reasonable classification was awarded marks", according to the examiners. Examples are usually required in such questions, and the author has made some effort to offer some.
Qualitative vs. quantitative data
Qualitative data: defined by some characteristic. An example might be blood group or gender.
Quantitative data: measured on some numerical scale. An example might be heart rate or blood pressure.
Categorical vs. numerical variables
Categorical variable: a variable that can take only one value from a limited set of possible values. For example, blood group and gender are forms of categorical data. The values belong to some sort of category, on the basis of a qualitative property. Essentially, "categorical" is a synonym for "qualitative".
Numerical variable: when the variable takes some numerical value. An example might be heart rate or blood pressure.
Nominal vs. ordinal data
Nominal data: the range of values is not ordered in any sense, but simply named (hence the "nom"). Blood group and gender are again the examples. This is a form of categorical data.
Ordinal data: the range of values is ordered along a scale, e.g. disease staging (advanced, moderate, mild) or degree of pain (severe, moderate, mild, none).
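The difference between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration: the class names `BloodGroup` and `Pain`, the helper `can_be_ranked`, and the sample values are all invented for this example. The integers attached to the pain grades encode rank only; they do not imply that the steps between grades are equal.

```python
from enum import Enum, IntEnum
from statistics import median_low, mode


class BloodGroup(Enum):
    """Nominal: the values are names only and have no order."""
    A = "A"
    B = "B"
    AB = "AB"
    O = "O"


class Pain(IntEnum):
    """Ordinal: the integers encode rank, not equal steps."""
    NONE = 0
    MILD = 1
    MODERATE = 2
    SEVERE = 3


def can_be_ranked(a, b):
    """Return True if the two values support an ordering comparison."""
    try:
        a < b
        return True
    except TypeError:
        return False


# Invented sample data, for illustration only.
groups = [BloodGroup.O, BloodGroup.A, BloodGroup.O, BloodGroup.B]
pains = [Pain.MILD, Pain.SEVERE, Pain.NONE, Pain.MODERATE, Pain.MILD]

most_common_group = mode(groups)        # counting works for nominal data
median_pain = Pain(median_low(pains))   # a median needs an order: ordinal only
worst_pain = max(pains)                 # so does a maximum
```

Asking for the median or the maximum of `groups` would fail, because blood groups cannot be ranked. That is the practical meaning of "nominal".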
Discrete vs. continuous data
Discrete data: the variable is restricted to specific defined values. For example, "male" and "female" are categorical discrete data values. Mortality (e.g. 20 patients dead at 6 months) is an example of numerical discrete data: there can be no 20.5 dead patients.
Continuous data: the variable is unrestricted and can take any value from a potentially infinite range. For example, "blue" and "red" might be the ends of a categorical data range, but the true value can be any subtle shade of purple. An example of numerical continuous data is weight: one does not have to be exactly 65 or 70 kg; one may easily be 67.5567 kg.
Scales of measurement
Nominal scale: only an identity; values assigned to variables are merely descriptive. An example is gender.
Ordinal scale: values have both an identity and a magnitude. A familiar ordinal scale is an exam which ranks you first, second or third. You know the rank, but you don't know by how much you were beaten.
Interval scale: values have identity, magnitude, and equal intervals. An example is temperature in degrees Celsius (every degree is the same interval, but the zero point is arbitrary).
Ratio scale: values have identity, magnitude, equal intervals, and a true zero (a minimum value of zero, which represents the absence of the quantity). An example of this is weight.
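The practical difference between the interval and ratio scales is whether ratios mean anything. A minimal sketch, assuming two made-up temperature readings and two made-up body weights; the Kelvin conversion is brought in only because Kelvin is a ratio scale for the same quantity:

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius (interval scale) to kelvin (ratio scale)."""
    return celsius + 273.15


# Invented readings, for illustration only.
t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on an interval scale and survive the conversion.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios are not: 20 degrees C is not "twice as hot" as 10 degrees C.
naive_ratio = t2_c / t1_c
kelvin_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Weight has a true zero, so its ratios can be taken at face value.
weight_ratio = 140.0 / 70.0   # a 140 kg patient is twice as heavy as a 70 kg one
```

The Celsius ratio comes out as 2, but the same two temperatures expressed in kelvin differ by only a few percent, whereas the difference of 10 degrees is the same on both scales.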
Importance of data types
Why do we need to know this? Well: data types determine the sort of statistical tests which are applicable.
- If your measurement scale is nominal or ordinal, then you use non-parametric statistics.
- If you are using interval or ratio scales, you can use parametric statistics.
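The rule of thumb above can be written down as a small lookup table. This is a hedged sketch rather than a decision tool: `SCALES` and `suggest_test_family` are hypothetical names, and the "typical_average" entries are a common textbook convention added here for illustration, not something stated in this chapter. In practice the choice of test also depends on the distribution of the data and on the study design.

```python
# Scale of measurement -> family of tests, following the rule of thumb above.
# "typical_average" is an added assumption (common textbook convention).
SCALES = {
    "nominal":  {"family": "non-parametric", "typical_average": "mode"},
    "ordinal":  {"family": "non-parametric", "typical_average": "median"},
    "interval": {"family": "parametric",     "typical_average": "mean"},
    "ratio":    {"family": "parametric",     "typical_average": "mean"},
}


def suggest_test_family(scale):
    """Return the test family suggested for a scale of measurement."""
    key = scale.strip().lower()
    if key not in SCALES:
        raise ValueError(f"unknown scale of measurement: {scale!r}")
    return SCALES[key]["family"]


# Example: degree of pain (severe, moderate, mild, none) is ordinal.
pain_family = suggest_test_family("ordinal")
# Example: weight is measured on a ratio scale.
weight_family = suggest_test_family("ratio")
```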