transformation and comparisons Flashcards
the shape of things
- If we measured the height of 1000 women and plotted the values then we might get something like Figure 1.
- Most heights are in the 155-175 centimetre range.
- The distribution is roughly symmetrical around its mean (165 cm) and it has a shape characteristic of a normal distribution.
- Of course the plot in Figure 1 doesn’t look exactly like a normal distribution.
- But if we measured more and more people (e.g., 100,000 people) then we might get something like Figure 2.
- Figure 2 also shows the corresponding normal distribution with a mean of 165 and a standard deviation of 10.
- Although the normal distribution is an idealisation, or an abstraction, we can use it to do some very useful things.
the standard normal distribution
- In lecture 8, I said that two parameters (the mean and the standard deviation) changed where the normal distribution was centred and how spread out it was.
- I said that changing these values didn’t change the relative position of points on the plot. The overall shape remains the same.
- All normal distributions have the same overall shape as the standard normal distribution even if they’re centred in a different place and are more or less spread out.
- To see what I mean by this, we’ll take our heights of 1000 people, but instead of displaying them in centimetres we’ll display them in metres.
- Changing the scale on which you’re measured doesn’t actually change your height relative to other people.
- The distribution in Figure 3a has a standard deviation of 10
- The distribution in Figure 3b has a standard deviation of 0.1
- But as you can see, they’re the same distribution - they’re just displayed on different scales (cm vs m)
- Changing the scale changes the standard deviation. This is why the standard deviation is sometimes referred to as the scale parameter for the distribution.
- Apart from changing the scale, we can also change where the distribution is centred.
- In Figure 4a we can see the same distribution as before. In Figure 4b we can see the same distribution, now centred at 0.
transformations
- in Figures 3 and 4 we saw that we could transform a variable so that it had a new location (mean) or scale (standard deviation) without changing the shape
- these two kinds of transformations are known as centring and scaling
centring
- to centre a set of measurements, you subtract a fixed value from each observation in the dataset
- this has the effect of shifting the distribution of the variable along the x-axis
- you can technically centre a variable by subtracting any value from it but the most frequently used method is mean-centring
mean centring
- mean centring a variable shifts it so that the new mean is at the zero point
- the individual values of a mean-centred variable tell us how far that observation is from the mean of the entire set of measures
- it doesn’t alter the shape of the distribution, or change the scale that it’s measured on
- it only changes the interpretation of the values to, for example, differences from the mean (see the sketch below)
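A minimal sketch of mean-centring in Python, using a small made-up dataset (the values are illustrative, not from the lecture):

```python
import numpy as np

x = np.array([3, 5, 7, 4, 9, 6])  # illustrative values, not from the lecture

centred = x - x.mean()  # subtract the mean from every observation

print(centred)         # each value is now a distance from the mean
print(centred.mean())  # the new mean is 0 (up to floating-point error)
```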
scaling
- is performed by dividing each observation by some fixed value
- this has the effect of stretching or compressing the variable along the x-axis
- you can scale a variable by dividing it by any value
- but typically scaling is done by dividing values by the standard deviation of the dataset
- scaling doesn’t change the fundamental shape of the variable’s distribution
- but after scaling the data by the standard deviation, the values are now measured in units of standard deviation (see the sketch below)
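A minimal sketch of scaling by the standard deviation, continuing the same illustrative values:

```python
import numpy as np

x = np.array([3, 5, 7, 4, 9, 6])  # illustrative values, not from the lecture

scaled = x / x.std()  # divide every observation by the standard deviation

print(scaled)        # values are now expressed in units of standard deviation
print(scaled.std())  # the new standard deviation is 1
```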
the z transform
- the combination of first mean-centring a variable and then scaling it by its standard deviation is known as the z-transform
- The 10 values in Table 1 have a mean of 5.7 and a standard deviation of 2.21.
- To z transform the data in Table 1, we would do the following steps:
- We’d subtract 5.7 from each value and put them in the Centred column
- Then we’d divide each value in Centred by 2.21
- We can now interpret the data in terms of distance from the mean in units of standard deviation (see the worked sketch below).
- The z transform will come in handy when it comes to making comparisons.
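A minimal sketch of the two steps in Python. The ten values are made up for illustration; they are not the actual Table 1 data:

```python
import numpy as np

x = np.array([4, 7, 2, 9, 5, 6, 3, 8, 5, 7])  # illustrative values, not Table 1

centred = x - x.mean()       # step 1: mean-centre (the Centred column)
z = centred / x.std(ddof=1)  # step 2: divide by the (sample) standard deviation

# each z value is that observation's distance from the mean,
# measured in units of standard deviation
print(np.round(z, 2))
```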
comparing groups
- in the context of quantitative research we’re often looking at the average difference in a variable between groups
- In the Figure 5 we can see measurements from a reaction time task.
- Amateur sportspeople have a mean reaction time of 500ms and professionals have a mean reaction time of 460ms.
- There is overlap between the two groups, but there is a difference between the averages.
- To quantify the difference, just subtract the mean of one group from the mean of the other.
- The mean difference is just 500ms - 460ms = 40ms.
comparing across groups
- In the previous example the comparisons were easy because the measurements were on the same scale (milliseconds).
- But let’s say that you want to compare two children on a puzzle completion task.
- One child is 8 years old, and the other is 14 years old.
- They do slightly different versions of the task and the tasks are scored differently.
- Because the two tests might have a different number of items, different scoring, and so on, we can’t just compare the raw numbers to see which is bigger.
- Example:
- Let’s take two children:
- Ahorangi is 8 years old and scored 86 on the task
- Benjamin is 14 years old and scored 124 on the task
- We can easily tell that Benjamin’s score is higher than Ahorangi’s score
- But the scores are not directly comparable… so what do we do?
- We have to look at how each performed relative to their age groups.
- Is Ahorangi better performing relative to 8 year olds than Benjamin is relative to 14 year olds?
- To answer this question we can use the z-transformation.
- To do the z-transformation we need to know the mean and standard deviation of task scores for each age group.
- Converting each child’s score to a z-score relative to their own age group (see the sketch below) shows that Ahorangi, despite having a lower raw score, actually scored very high for an 8 year old.
- Benjamin only scored a little higher than the average 14 year old.
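A minimal sketch of the comparison. The age-group means and standard deviations below are made up (the lecture doesn’t give them here); they’re assumed only so the qualitative pattern matches the conclusion above:

```python
# Hypothetical age-group norms -- assumed for illustration only.
norms = {
    "8-year-olds": {"mean": 70.0, "sd": 8.0},
    "14-year-olds": {"mean": 118.0, "sd": 12.0},
}

def z_score(score: float, mean: float, sd: float) -> float:
    """Distance from the group mean in units of standard deviation."""
    return (score - mean) / sd

ahorangi = z_score(86, **norms["8-year-olds"])    # (86 - 70) / 8 = 2.0
benjamin = z_score(124, **norms["14-year-olds"])  # (124 - 118) / 12 = 0.5

# Ahorangi is well above average for her age group;
# Benjamin is only slightly above average for his.
print(ahorangi, benjamin)
```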
making comparisons with the sampling distribution
- From last week we learned that the sampling distribution of the mean will be centred at the population mean and have a standard deviation equal to the standard error of the mean.
- Although we don’t know the value of the population mean, we can generate a hypothesis about what we think the population mean might be.
- We can then generate a hypothetical sampling distribution based on our hypothesised value of the population mean.
- Example:
- Let’s say I get a group of people to perform a task where they have to quickly recognise two sets of faces: either famous faces or faces of their family members.
- I find that the mean difference between these two conditions is 24.87ms.
- But this is just the difference in my sample. The population mean difference might be some other value.
- Although we don’t know the population mean, we could hypothesise that it is 100 ms, 50 ms, 0 ms, or some other value. Let’s just pick 0 ms for now.
- Now we can generate a sampling distribution using our hypothesised population mean and the standard error of the mean we estimate from the sample (let’s say it’s 8.88 ms).
- In Figure 6 we can see what the sampling distribution would look like if the population mean were 0.
- We can compare our particular sample mean of 24.87ms to the sampling distribution.
- Because the sampling distribution is a normal distribution, we know that ~68% of the time sample means will fall within ±1 SEM of the population mean (-8.88ms to 8.88ms).
- And ~95% of the time sample means will fall between -17.76ms and 17.76ms.
- For our particular mean we see that it falls 2.8 SEM from our hypothesised population mean (24.87 / 8.88 ≈ 2.8; see the sketch at the end of this section).
- What can we make of this?
- We can conclude that if the population mean were in fact 0 then we have observed something rare.
- If the population mean were in fact 0, then it would be rare for a sample mean to be that far away from the population mean.
- Observing something rare doesn’t tell us that our hypothesis is wrong - rare things happen all the time!
- But if we were to run our experiment again and again, and we continued to observe rare events, then we would probably have a good reason to update our hypothesis.
- This process of comparing our sample to the sampling distribution is known as null hypothesis significance testing.
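A minimal numeric sketch of the comparison, using the values from the example above (a 24.87ms sample mean difference, a hypothesised population mean of 0ms, and an estimated SEM of 8.88ms):

```python
# Values from the example above.
sample_mean_diff = 24.87  # observed sample mean difference (ms)
hypothesised_mean = 0.0   # hypothesised population mean difference (ms)
sem = 8.88                # estimated standard error of the mean (ms)

# Distance of the observed sample mean from the hypothesised population mean,
# in units of SEM (a z-score on the hypothetical sampling distribution).
z = (sample_mean_diff - hypothesised_mean) / sem
print(round(z, 2))  # ~2.8 SEM

# If the population mean really were 0, roughly 68% of sample means would fall
# within +/- 1 SEM of it, and roughly 95% within +/- 2 SEM.
print(hypothesised_mean - sem, hypothesised_mean + sem)          # -8.88 to 8.88
print(hypothesised_mean - 2 * sem, hypothesised_mean + 2 * sem)  # -17.76 to 17.76
```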