Chi-squared test: investigating fingerprint types | A Big Picture film


You’ve seen the scene in every TV police drama: the forensics team dusting a crime scene for fingerprints. Everyone’s fingerprints are slightly different, which makes them incredibly useful
for determining who has touched what. Even though everyone’s fingerprints are slightly different,
they still fall into three main types of pattern – loops, arches and whorls. However, arches are relatively rare,
so we’re going to group them with whorls for this experiment. We’re here to find out whether men and women
have different types of fingerprint pattern. We’re going to take as large a sample
of men and women as we can and see what pattern type they have on their right index finger. Then we’ll perform a chi-squared test
to see whether sex and fingerprint type are independent – or whether knowing one thing tells you anything about the other. Both sex and fingerprint type are categorical variables – you place things into categories,
like your eye colour or where you were born. Things like your height and weight
are continuous variables – things that you measure. Discrete numerical variables, such as the number of
siblings you have, are somewhere in between. Sometimes they can work like categorical variables,
or sometimes they can work like continuous variables. You can have two siblings, but not two and a half! One of the tests we can use to see whether
two categorical variables are related to one another is called a chi-squared test of association. It can’t tell us the answer for certain,
but it can tell us how unlikely it would be to get similar observations by chance alone. Now, the first thing we do with any
statistical test is set up the hypotheses – the conclusions we can draw from the experiment. There are two for the chi-squared test. The first – the null hypothesis (and our default answer)
– is that the variables are independent. In this case, the null hypothesis
is that knowing someone’s sex doesn’t tell us anything about how likely they are
to have a particular fingerprint type, and vice-versa. This is the conclusion we’ll draw unless we have strong evidence to the contrary. The second hypothesis is the alternate hypothesis. This is the other possibility – that men and women do tend
to have different fingerprint types. In this case, we would say that there is an association
between fingerprint type and a person’s sex. The next question is: how strong should the evidence be before we reject the idea that the variables are independent? The significance level is a value we choose:
it’s normally 0.05. That means that if the probability of a result at least as
extreme as the one we end up with is less than 5 per cent, we reject the
null hypothesis and accept the alternate. Otherwise, we say we don’t have enough evidence to reject the null hypothesis. OK. Now let’s get some results! Index finger on the pad! Yeah, I’m a loop. It’s a loop. Second arch of the day. It’s a loop. You have a loop. OK, we’ve collected all our data on fingerprint type. Next, to do a chi-squared test,
we set up a table like this one – a contingency table. We’ve got columns for our two main categories
of fingerprint type, and one for the total. And we’ve got rows for each category of sex
and another one for the total again. Once we’ve recorded our observations,
we put them into the obvious places. In our experiment, 10 of the women
had whorls and arches, and 9 had loops. Among the men, 11 had whorls and arches,
and 8 had loops. So, in the total boxes, guess what? We add up the numbers in the corresponding row or column. From our data, it seems like women
might be more likely to have loops than men. But is this due to random chance in our experiment? To find out, we also need to work out the expected value in each cell, which we get from multiplying the number at the end of each row by the number at the bottom of the column and dividing by the total. So for the women with whorls, that would
be 19 times 21 divided by 38, making 10.5. These are the values that would be expected if the relative numbers of fingerprint type were exactly the same for both sexes. They are theoretical values and need not be whole numbers. Then, we work out the chi-squared statistic,
which will tell us what conclusion we can come to. Today, we’ll work it out using the basic formula here although later on you might meet a Yates modification,
which is often used for this two-by-two table. Oh come on, it’s not that bad! You don’t have to remember it
– you can always look it up if you need to. OK, so for each cell, we work out the difference
between the observed value and the expected value. Square it and divide by the expected value. For men with whorls and arches,
the difference is 0.5. We square it to get 0.25, then divide by 10.5 to get 0.024. Once we’ve done that for all the cells,
we add it all up to get our statistic, which is 0.11. We’re not quite done, though. We still need to see how the statistic stacks up,
and for that, we need a look-up table. This example has one degree of freedom
– it’s one less than the number of columns, multiplied by one less than the number of rows. So, we look up the chi-squared table with one degree of freedom, and compare our statistic to the 0.95 value. This is one, minus our significance level of 0.05. If we’d picked a different significance level, we’d use a different value. The number we get for our critical value is 3.841. Our chi-squared statistic is smaller than this, so we’d say, “we do not have significant evidence against
the null hypothesis at the 0.05 level”. We’d conclude that there is not enough evidence to
support an association between fingerprint type and sex. Don’t worry if you can’t reject the null hypothesis; this often happens and is all part of the statistical process. So, to recap, here are the steps required for a chi-squared test. Write down your hypotheses and choose a significance level of 0.05, unless you have a good reason not to. Collect your data and collate them in a table. Work out the expected value for each cell,
and use the formula to figure out the chi-squared statistic. Find the right table for your number of degrees of freedom, and compare your statistic to the appropriate value in the table. Finally, decide whether or not to reject the null hypothesis, and write down your conclusion! So that’s it. Try it yourself and
let us know how you get on. Don’t forget you can learn more or download our free online resources at wellcome.ac.uk/bigpicture. Bye for now.
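
The steps in the recap can be sketched in a few lines of Python. This is a minimal illustration rather than anything shown in the film: the function name `chi_squared` and the variable names are made up for this example. The four counts are the ones read out in the transcript, and 3.841 is the critical value quoted for one degree of freedom at the 0.05 level. The sketch uses the basic formula, without the Yates modification.

```python
# Observed counts quoted in the transcript.
# Rows: women, men. Columns: whorls and arches, loops.
observed = [
    [10, 9],  # women
    [11, 8],  # men
]


def chi_squared(table):
    """Return (statistic, degrees_of_freedom, expected) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count for a cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Basic formula: sum over all cells of (observed - expected)^2 / expected.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1) * (columns - 1).
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Critical value for 1 degree of freedom at the 0.05 level, as quoted in the film.
CRITICAL_VALUE = 3.841

statistic, dof, expected = chi_squared(observed)
reject_null = statistic > CRITICAL_VALUE

print(f"expected counts: {expected}")
print(f"chi-squared = {statistic:.3f}, degrees of freedom = {dof}")
if reject_null:
    print("Reject the null hypothesis at the 0.05 level.")
else:
    print("Not enough evidence to reject the null hypothesis at the 0.05 level.")
```

With these four counts the expected values are 10.5 and 8.5 in each row, and the statistic comes to about 0.106 with one degree of freedom, well below 3.841, so the null hypothesis is not rejected. For a table with more rows or columns the critical value would need to be looked up again for the new number of degrees of freedom.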
