9.2 Describing data
Since it is not possible to collect data from every member of a population, a subset, or sample, of the population must be taken. Generally, the larger the sample, the more representative it will be of the population. A sample can never match the population perfectly, as no two samples will be exactly the same. However, researchers must take care to minimise bias when sampling from the population.
One of the common methods for organising population data is to construct a histogram or frequency distribution. A frequency distribution is an organised tabulation or graphical representation of the number of individuals in each category on the scale of measurement.
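As a rough illustration of the tabulation described above, the following Python sketch groups a handful of measurements into equal-width classes and prints a simple text histogram. The values, the class width and the function name `frequency_distribution` are all hypothetical, chosen only for the example.

```python
from collections import Counter

# Hypothetical measurements, invented purely for illustration.
values = [4.1, 4.7, 5.2, 5.3, 5.8, 6.0, 6.1, 6.4, 7.2, 7.9]


def frequency_distribution(data, bin_width, start):
    """Count how many values fall in each class [lower, lower + bin_width)."""
    counts = Counter()
    for x in data:
        index = int((x - start) // bin_width)  # which class the value belongs to
        counts[index] += 1
    # Map each class's lower limit to its frequency (empty classes are omitted).
    return {start + i * bin_width: counts[i] for i in sorted(counts)}


table = frequency_distribution(values, bin_width=1.0, start=4.0)

# Print the tabulation with a bar of '#' characters as a crude histogram.
for lower, n in table.items():
    print(f"{lower:.1f} to <{lower + 1.0:.1f}: {'#' * n} ({n})")
```

The class width is a judgment call: classes that are too wide hide the shape of the distribution, while classes that are too narrow leave many of them nearly empty.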
There are four important ways to describe frequency distributions:
- measures of central tendency (mean, median, mode)
- measures of dispersion (range, variance, standard deviation)
- the extent of symmetry/asymmetry (skewness)
- the flatness or ‘peakedness’ of the distribution (kurtosis).
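The four kinds of description listed above can be computed with Python's standard `statistics` module, as in the minimal sketch below. The data set is hypothetical. The sketch assumes the population forms of variance and standard deviation (dividing by n), and it computes skewness and kurtosis as the third and fourth standardised moments; other definitions exist, so these choices are assumptions rather than the only option.

```python
import statistics

# Hypothetical data set, invented purely for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Measures of dispersion (population forms, dividing by n; an assumption here)
value_range = max(values) - min(values)
variance = statistics.pvariance(values)
std_dev = statistics.pstdev(values)


def standardised_moment(data, k):
    """Mean of ((x - mean) / sd) ** k, using the population standard deviation."""
    m = statistics.mean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** k for x in data) / len(data)


skewness = standardised_moment(values, 3)  # symmetry/asymmetry
kurtosis = standardised_moment(values, 4)  # flatness/peakedness

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance}, sd={std_dev}")
print(f"skewness={skewness:.3f}, kurtosis={kurtosis:.3f}")
```

Under this definition a positive skewness indicates a longer tail towards high values, and a normal distribution has a kurtosis of 3.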
A reference range for the data can be determined statistically, using the frequency distribution, or non-statistically. In the non-statistical approach, the reference interval for an analyte is set by a consensus of medical experts based on the results of clinical outcome studies (see Determining what is normal).
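The text does not say which statistical method is used, so the sketch below shows just one possible approach: taking the mean plus or minus 1.96 standard deviations, which covers roughly the central 95% of values when the distribution is approximately normal. The results, the function name `reference_range` and the choice of 1.96 are assumptions made for illustration only.

```python
import statistics

# Hypothetical analyte results from a reference group, invented for illustration.
results = [4.2, 4.5, 4.8, 4.9, 5.0, 5.0, 5.1, 5.2, 5.5, 5.8]


def reference_range(data, z=1.96):
    """Return (low, high) as mean -/+ z sample standard deviations.

    Assumes the data are roughly normally distributed; z=1.96 then
    corresponds to about the central 95% of values.
    """
    m = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return m - z * s, m + z * s


low, high = reference_range(results)
print(f"Reference range: {low:.2f} to {high:.2f}")
```

If the distribution is clearly skewed, a range built this way can be misleading, and an approach based on percentiles of the observed data may be more appropriate.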
Sometimes other factors, such as age or sex, influence the frequency distribution of an analyte. The reference range used for a patient must then take those factors into account.
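One way to take such a factor into account is to split the reference data by that factor and derive a separate range for each group. The sketch below does this for hypothetical results labelled by sex, using mean plus or minus 1.96 sample standard deviations as an assumed method. The records and the group labels are invented for illustration.

```python
import statistics
from collections import defaultdict

# Hypothetical (sex, result) pairs, invented purely for illustration.
records = [
    ("F", 12.1), ("F", 12.8), ("F", 13.0), ("F", 13.5), ("F", 13.6),
    ("M", 13.9), ("M", 14.4), ("M", 15.0), ("M", 15.3), ("M", 15.9),
]

# Group the results by the influencing factor.
groups = defaultdict(list)
for factor, value in records:
    groups[factor].append(value)

# Derive one range per group (assumes roughly normal data within each group).
ranges = {}
for factor, values in groups.items():
    m = statistics.mean(values)
    s = statistics.stdev(values)
    ranges[factor] = (m - 1.96 * s, m + 1.96 * s)

for factor, (low, high) in sorted(ranges.items()):
    print(f"{factor}: {low:.2f} to {high:.2f}")
```

In practice each group would need far more than five results for its range to be reliable.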