00:01
Hi again. In the last few lessons, we've been
focusing on confidence intervals.
00:06
We'll do that here too.
00:08
This lesson is about independent samples
with variances unknown but assumed to be
equal. All right.
00:16
Think about this example.
00:18
You are trying to estimate the difference
between the price of apples in New York and LA.
00:24
You go to ten grocery shops in New York, and
your friend Paul, who lives in L.A.,
visits eight grocery shops in order to help
you with the research.
00:33
Once you've organized the data in a table,
you start reflecting on how you can create a
confidence interval that shows the
difference between the price of apples in New
York and LA. You don't know what the
population variance of apple prices in New
York or LA is, but you assume it should be
the same for NY and LA.
00:53
So you calculate the mean price in NY and LA
and obtain
$3.94 and $3.25 respectively.
01:03
Moreover, their sample standard deviations
are $0.18 and
$0.27. What should we do now?
Well, we assume that the population
variances are equal, so we have to estimate
that common variance. The unbiased estimator
in this case is called the pooled sample
variance, and we can use the following
formula to calculate it.
01:29
As you can see, it is based solely on the
sample sizes and the sample standard
deviations of the two data sets.
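
Written out, the standard pooled sample variance looks like this. The subscripts x (New York) and y (LA) are labels assumed here for illustration, not taken from the lesson:

```latex
s_p^2 = \frac{(n_x - 1)\,s_x^2 + (n_y - 1)\,s_y^2}{n_x + n_y - 2}
```

Each sample variance is weighted by its own degrees of freedom, so the larger sample counts for a little more.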
01:36
We quickly plug in the numbers and get a
pooled sample variance of
0.05 and a pooled standard deviation of
$0.22.
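
To check the arithmetic, here is a minimal Python sketch of that calculation. The function name `pooled_variance` and its argument names are hypothetical, chosen for this example; the inputs are the sample sizes and standard deviations quoted above.

```python
import math


def pooled_variance(n_x, s_x, n_y, s_y):
    """Pooled sample variance from two sample sizes and two sample standard deviations."""
    # Weight each sample variance by its degrees of freedom (n - 1).
    return ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)


# New York: 10 shops, s = 0.18; LA: 8 shops, s = 0.27
s2_p = pooled_variance(10, 0.18, 8, 0.27)
s_p = math.sqrt(s2_p)

print(f"pooled variance: {s2_p:.2f}")            # 0.05
print(f"pooled standard deviation: {s_p:.2f}")   # 0.22
```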
01:48
How about the statistic needed to form a
confidence interval?
Well, we have an unknown variance, so you
guessed it.
01:57
It's the t-statistic.
02:00
Here's the formula for the confidence
interval.
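
In standard notation, the interval for the difference of two means with a pooled variance can be written as follows. The symbols are assumed for illustration: x is New York, y is LA, s_p^2 is the pooled sample variance, and alpha is 0.05 for a 95% interval.

```latex
(\bar{x} - \bar{y}) \;\pm\; t_{\,n_x + n_y - 2,\;\alpha/2}\,
\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}
```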
02:05
Let's compare it to the formula for
independent samples with known variances.
02:12
We have the same difference of sample means,
but instead of the population
variances, we have the pooled sample
variance.
02:21
And then instead of the z-statistic, we have
the t-statistic.
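
For comparison, the known-variance interval in standard notation is sketched here, with the same assumed labels (x for New York, y for LA). The population variances sit under the square root and the critical value comes from the z-table:

```latex
(\bar{x} - \bar{y}) \;\pm\; z_{\alpha/2}\,
\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}
```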
02:27
Although the formulas are different, they
follow the same structure.
02:31
Right. You must be wondering about the
t-statistic,
though. It is a bit stranger this time, so
let's quickly clarify it.
02:41
The degrees of freedom are equal to the
total sample size minus the number of
variables. So far we have seen t's with n
minus
one degrees of freedom because we had a
sample size of n and only one variable.
02:56
This time we have two samples, one of
ten and one of eight
observations, and two variables.
03:03
Apple prices in New York and apple prices in
LA.
03:07
Therefore, we have ten plus eight minus two,
or 16 degrees of freedom.
03:12
All right. Checking the statistic from the
table.
03:16
For a 95% confidence interval with 16
degrees of freedom, we get
2.12.
03:23
Let's plug everything in and get our answer.
03:27
The 95% confidence interval is between 0.47
and
0.92. What's the interpretation?
Well, we are 95% confident that the actual
difference between the two
population mean prices of apples in New York
and in L.A.
03:43
is somewhere between 0.47 and 0.92.
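
The whole calculation can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own code: the function name `pooled_ci` and its parameters are hypothetical, and the critical value 2.12 is taken from the t-table lookup above rather than computed.

```python
import math


def pooled_ci(mean_x, mean_y, s_x, s_y, n_x, n_y, t_crit):
    """Confidence interval for the difference of two means, assuming equal variances."""
    # Pooled sample variance, weighted by degrees of freedom.
    s2_p = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)
    # Margin of error: critical value times the standard error of the difference.
    margin = t_crit * math.sqrt(s2_p / n_x + s2_p / n_y)
    diff = mean_x - mean_y
    return diff - margin, diff + margin


# New York: mean 3.94, s = 0.18, n = 10; LA: mean 3.25, s = 0.27, n = 8
# t_crit = 2.12 for 95% confidence with 16 degrees of freedom (from the table)
low, high = pooled_ci(3.94, 3.25, 0.18, 0.27, 10, 8, t_crit=2.12)
print(f"{low:.3f} to {high:.3f}")
```

With these rounded summary statistics the bounds come out at roughly 0.46 and 0.92. The small gap from the 0.47 quoted in the lesson is presumably due to rounding of the means and standard deviations.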
03:49
Therefore, since the whole interval is above
zero, we can say with 95% confidence that
apples in New York are more expensive than in
L.A. Good job and thanks for watching.