00:00
This is the final lesson we will do on
testing.
00:04
The last case we'll examine here is the one
with independent samples and unknown
variances, which are assumed to be equal.
00:12
I'll quickly refresh your memory of the
data set we used in the confidence interval
section. You were trying to see if apples in
New York are more expensive than the
ones in LA.
00:23
You went to ten grocery shops in New York,
and your friend Paul, who lives in
LA, went to eight grocery shops there.
00:31
You got all the prices and put them in a
table.
00:34
You don't know what the population variance
of apple prices is.
00:37
But you assume it should be the same for New
York and LA.
00:42
Let's state the null and alternative
hypotheses.
00:46
H0: mu in New York is equal to mu in LA, or
mu in New York minus mu in LA is equal to
zero.
00:57
H1: mu in New York is different from mu in
LA, or mu in New York minus mu in LA differs
from zero.
01:07
All right. That's our data set.
01:10
We have also calculated the sample means,
standard deviations, and sample
sizes. What can we do when the variances are
unknown but assumed to be equal?
Earlier, we used the pooled variance formula.
01:24
Well, here it is again, remember?
All right. It's all about plugging in
numbers, so I'll save you the trouble.
01:36
The pooled variance is 0.05.
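For reference, the pooled variance formula mentioned here is usually written as below. The symbols are assumptions for illustration, since the transcript only shows the formula on screen: x stands for the New York sample, y for the LA sample, n for the sample sizes (10 and 8), and s squared for the sample variances.

```latex
s_p^2 = \frac{(n_x - 1)\,s_x^2 + (n_y - 1)\,s_y^2}{n_x + n_y - 2}
```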
01:40
One last thing we need is the standard error
of the difference of means.
01:44
It is given by the following formula.
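As a rough sketch of that step, the standard error for the equal-variance case is commonly computed as the square root of the pooled variance divided by each sample size, summed. The snippet below assumes this standard form and uses the rounded pooled variance of 0.05 quoted in the lesson; the function and variable names are made up for illustration.

```python
import math

# Values quoted in the lesson (the pooled variance is rounded).
pooled_variance = 0.05
n_ny = 10  # grocery shops visited in New York
n_la = 8   # grocery shops visited in LA


def standard_error(pooled_var, n_x, n_y):
    """Standard error of the difference of two means, assuming equal variances."""
    return math.sqrt(pooled_var / n_x + pooled_var / n_y)


se = standard_error(pooled_variance, n_ny, n_la)
print(round(se, 3))  # about 0.106 with these rounded inputs
```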
01:50
I'm going faster than usual, as we've seen
all of this before.
01:53
Moreover, testing is about understanding.
01:56
Computation is routine.
01:58
So let's start testing, shall we?
Small samples, variance unknown.
02:04
Which statistic do we need?
Exactly. It's the t-statistic again.
02:10
How many degrees of freedom?
You may recall it from earlier: it is the
combined sample size minus the number of
variables. So ten plus eight minus two,
which gives us 16.
Let's see the t-statistic formula. Once
again, it is the difference between the
sample means minus the hypothesized
difference between the true means, all
divided by the standard error.
02:38
After plugging in everything, we get a test
statistic of 6.53.
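Here is a minimal sketch of that plug-in step, using only the rounded figures quoted in the lesson (means of 3.94 and 3.25, sample sizes of 10 and 8, pooled variance of 0.05). The function name is hypothetical. Because the inputs are rounded, the result comes out near 6.5 rather than exactly 6.53, which presumably reflects unrounded values in the original calculation.

```python
import math

# Figures quoted in the lesson (rounded).
mean_ny, mean_la = 3.94, 3.25
n_ny, n_la = 10, 8
pooled_variance = 0.05
hypothesized_diff = 0.0  # under H0 the two means are equal


def pooled_t_statistic(mean_x, mean_y, n_x, n_y, pooled_var, diff0=0.0):
    """t-statistic for two independent samples with equal, unknown variances."""
    se = math.sqrt(pooled_var / n_x + pooled_var / n_y)
    return (mean_x - mean_y - diff0) / se


df = n_ny + n_la - 2  # combined sample size minus two
t_stat = pooled_t_statistic(mean_ny, mean_la, n_ny, n_la,
                            pooled_variance, hypothesized_diff)
print(df, round(t_stat, 2))
```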
02:45
Do we need to compare it?
This is by far the most extreme test
statistic we have seen.
02:51
You will have a hard time finding it in the
t-table.
02:55
For common tests, a rule of thumb is to
reject the null hypothesis when the t-score
is bigger than two.
03:02
Generally, for a z-score or a t-score, a
value higher than four is extremely
significant. Let's see the two-sided
p-value.
03:13
The p-value of this test is lower than
0.001.
03:17
Somewhere around 0.000001.
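To check the order of magnitude without a t-table, the two-sided p-value can be approximated by numerically integrating the Student's t density. The sketch below uses only the Python standard library and Simpson's rule; the function names and the number of integration steps are arbitrary choices for illustration, not part of the lesson. With t = 6.53 and 16 degrees of freedom it gives a value far below 0.001.

```python
import math


def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=20000):
    """Two-sided p-value: 1 minus the area between -|t| and +|t|.

    The area on [0, |t|] is computed with Simpson's rule (steps must be even)
    and doubled, because the density is symmetric around zero.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    half_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * half_area)


p = two_sided_p_value(6.53, 16)
print(f"{p:.1e}")  # a tiny number, several zeros after the decimal point
```

As a sanity check, calling `two_sided_p_value(2.12, 16)` returns roughly 0.05, which matches the usual 5% critical value for 16 degrees of freedom in a t-table.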
03:22
In our lesson about p-values, we said that
researchers are always looking for those
three zeroes after the dot.
03:28
It means that the test is extremely
significant and the probability of making a
type one error is virtually zero.
03:36
Therefore, we reject the null hypothesis at
all common and uncommon levels of
significance. There is strong statistical
evidence that the price of apples in New
York differs from that in LA.
03:50
But such an extreme result may also mean
that the hypothesis is pointless or poorly
designed. From the mean values of 3.94 and
3.25,
and with such small and close standard
deviations of around 0.2, we could
easily say that the prices are different.
04:06
No testing needed.
04:08
A much more interesting question would be
whether the price of apples in New York is
20% higher than that in LA.
04:16
I will leave this exercise to you as homework.
04:19
All right. We are done with hypothesis
testing.
04:23
Cheers. And thanks for watching.