Descriptive Statistics in Psychology


Statistics Introduction

Descriptive statistics simply describe what you have found when carrying out an investigation. They make no attempt to go beyond the data contained, and they do not explain what caused the result. They are simply a means of presenting the data in the clearest way possible.

"Statistics: The only science that enables different experts using the same figures to draw different conclusions".

— Evan Esar (1899-1995), American Humorist


Measures of Central Tendency (Averages)

You will be familiar with the idea of an average. There are, in fact, three different types of average, known as 'measures of central tendency'.

The Mean

To calculate the mean, all values in a set of data are added together and the total is divided by the number of values (N). The mean is used with interval level data that are normally distributed; it is not suitable where extreme values can distort it. The mean is valuable to the psychologist because it takes all the data into account (i.e. it is very sensitive) and it can be used in further statistical analyses. The disadvantage is that it can be distorted by extreme values.
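As a rough illustration of the calculation and of the distortion caused by an extreme value, here is a minimal Python sketch. The scores are hypothetical and chosen only to keep the arithmetic simple; the function name `mean` is likewise just an illustrative choice.

```python
def mean(values):
    """Add all values together and divide by the number of values (N)."""
    return sum(values) / len(values)

# Hypothetical scores, invented for illustration only.
scores = [4, 5, 5, 6, 5]
with_extreme_value = scores + [35]

print(mean(scores))              # 5.0
print(mean(with_extreme_value))  # 10.0
```

A single extreme score doubles the mean here, even though five of the six scores are close to 5.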

The Median

The median is the halfway point that separates the lower 50% of scores from the higher 50%. All values are arranged in order and the middle value is the median; with an even number of values, the median is the mean of the two middle values. Used with interval or ordinal level data, the median is not affected much by extreme values.

The median is especially useful when there are a few extremely high or extremely low scores which can give a misleading average score. E.g. six scores on a test out of 100 are: 70, 74, 75, 77, 78, 100. The mean, or average score, is 79, but this is misleading in the sense that only one of the six participants scored this high. The median score of 76 is a better description of the data. The advantage of the median is that it is not distorted by extreme values (e.g. 2, 3, 4, 4, 5, 6, 6, 6, 6, 6, 45 and 65). However, it can be distorted by small samples and is less sensitive than the mean.
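The six test scores above can be checked with a short Python sketch. The function below is one straightforward way to find the median, assuming that for an even number of scores the two middle values are averaged; the function name is illustrative.

```python
def median(values):
    """Arrange the values in order and return the middle one.

    With an even number of values, return the mean of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

scores = [70, 74, 75, 77, 78, 100]

print(sum(scores) / len(scores))  # mean: 79.0
print(median(scores))             # median: 76.0
```

The single score of 100 pulls the mean up to 79, while the median stays between the two middle scores of 75 and 77.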

The Mode

The mode is the score that occurs the most often, i.e. the most popular score. E.g. with scores of 30, 30, 30, 50, 96, 100 the mean is 56, which is misleading in the sense that no-one scored anywhere near this; the median is 40, which again does not approximate to anyone's score; and the mode is 30, which at least lets us know that more people obtained this score than any other score.

The mode is useful in certain instances where other measures of central tendency are rather meaningless. For instance, if you are a buyer for a shop whose target population consists of 50% of people who wear size 12 clothes and the remaining 50% are size 16 then it is no use you ordering size 14 clothes just because this is the average size! The advantage of using the mode is that it is not affected by extreme scores, and is useful to show the most popular value. However, it can be a crude measurement and is not useful if there are many modal scores.
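The following Python sketch shows one possible way to find the mode, including the case where there is more than one modal score. The clothes sizes list is a hypothetical stand-in for the shop example (half size 12, half size 16), and the function name `modes` is an illustrative choice.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs the greatest number of times."""
    counts = Counter(values)
    highest_count = max(counts.values())
    return [value for value, count in counts.items() if count == highest_count]

scores = [30, 30, 30, 50, 96, 100]
print(modes(scores))  # [30]

# Hypothetical customers: half wear size 12 and half wear size 16.
sizes = [12, 16, 12, 16]
print(modes(sizes))             # [12, 16]
print(sum(sizes) / len(sizes))  # mean: 14.0, a size nobody wears
```

Returning a list rather than a single value makes it obvious when a data set has several modal scores, which is the situation where the mode becomes less useful.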


Measures of Dispersion

These measures indicate whether the scores in a given condition are similar to each other or whether they are spread out. There are two measures you need to know:

Standard Deviation

Definition: The standard deviation is a measure that tells us how scores are spread around the mean; in other words, how much the scores “deviate” from the mean.

There are 2 formulae for calculating the standard deviation:

[Image: standard deviation formulae]

They may look complicated, but they are in fact fairly easy to use, especially the second one, which is the more widely used.

Example:

To find the standard deviation of this set of numbers:

3, 5, 5, 7, 9

We will need the mean to calculate the standard deviation where:

[Image: standard deviation formulae]

The smaller the standard deviation (e.g. 2.1), the closer together the values in the set are. A larger standard deviation (e.g. 5.1) shows that the values are more spread out from the mean.
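As a hedged sketch of the calculation for the set 3, 5, 5, 7, 9, the Python code below finds the mean, squares each score's deviation from the mean, averages the squared deviations and takes the square root. It is an assumption here which divisor applies: dividing by N gives the standard deviation of the scores themselves, while dividing by N - 1 is the usual choice when the scores are a sample used to estimate a population value, so both are shown. The function name and the `sample` parameter are illustrative.

```python
import math

def standard_deviation(values, sample=False):
    """Square root of the average squared deviation from the mean.

    Divides by N by default, or by N - 1 when sample=True.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    divisor = n - 1 if sample else n
    return math.sqrt(sum(squared_deviations) / divisor)

scores = [3, 5, 5, 7, 9]

print(sum(scores) / len(scores))                          # mean: 5.8
print(round(standard_deviation(scores), 2))               # dividing by N: 2.04
print(round(standard_deviation(scores, sample=True), 2))  # dividing by N - 1: 2.28
```

The squared deviations sum to 20.8, so the two results are the square roots of 20.8 / 5 and 20.8 / 4 respectively.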

The advantage of the standard deviation is that all scores in the set are taken into account, so it is more accurate than the range, and it can also be used in further analysis and calculations. The disadvantage is that it is not as quick or easy to calculate as other measures of dispersion, such as the range (i.e. it is time consuming).

The Range

The range is a basic measure of the spread of a set of scores, and can be defined as the difference between the highest and the lowest score in any condition. In order to calculate the range you need to put your data into numerical order (e.g. lowest to the highest score).

The advantage of the range is that it is easy to calculate and that it shows the extreme values. The disadvantages are that it is distorted by extreme scores and that it gives no information as to whether scores are clustered around the mean or evenly spread out. For example, the ranges of 1, 7, 7, 8, 9, 9, 17 and 1, 3, 5, 7, 9, 11, 13, 15, 17 are exactly the same.
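The two sets of scores above can be compared with a short Python sketch; the function name `value_range` is an illustrative choice.

```python
def value_range(values):
    """Difference between the highest and the lowest score."""
    return max(values) - min(values)

clustered = [1, 7, 7, 8, 9, 9, 17]
evenly_spread = [1, 3, 5, 7, 9, 11, 13, 15, 17]

print(value_range(clustered))      # 16
print(value_range(evenly_spread))  # 16
```

Both sets give a range of 16, even though most scores in the first set sit between 7 and 9 while the second set is spread evenly from 1 to 17.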
