00:01
After introducing confidence intervals and
showing how to calculate them, there were
several exercises.
00:07
You were asked to use the same data to find
other confidence intervals.
00:13
Let's take a step back and try to understand
confidence intervals a bit better.
00:18
Here is a graph of a normal distribution.
00:22
As you know, the sample mean is in the
middle of the graph.
00:26
Now, if we know that a variable is normally
distributed, we are basically making the
statement that the majority of observations
will be around the mean and the rest far
away from it. Let's draw a confidence
interval.
00:41
There is the lower limit and the upper
limit.
00:45
A 95% confidence interval would imply that
we are 95% confident that the true
population mean falls within this interval.
00:55
There is a 2.5% chance that it will be on the
left of the lower limit and a
2.5% chance it will be on the right.
01:04
Overall, there is a 5% chance that our
confidence interval does not contain
the true population mean.
01:11
So when alpha is 0.05, or 5%, we have
alpha divided by two, or a 2.5% chance, that the
true mean is to the
left of the interval and 2.5% that it is to the right.
01:27
Okay, great.
01:29
Using the Z score in the formula, we are
implicitly starting from a standard normal
distribution. Therefore, the mean is zero.
01:38
The lower limit is minus z, while the upper
one is z.
01:43
For a 95% confidence interval using the Z
table, we can find that these limits
are -1.96 and 1.96.
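If you would rather not read these limits off the Z table, here is a minimal sketch of the same lookup in Python. It assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the variable names are just illustrative.

```python
from statistics import NormalDist

confidence = 0.95
alpha = 1 - confidence        # 5% in total, split evenly between the two tails

std_normal = NormalDist()     # standard normal: mean 0, standard deviation 1

# The upper limit leaves alpha / 2 (2.5%) of the area to its right.
upper = std_normal.inv_cdf(1 - alpha / 2)
lower = -upper                # the distribution is symmetric around zero

print(round(lower, 2), round(upper, 2))  # -1.96 1.96
```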
01:53
That's exactly what we did in the previous
lecture.
01:58
Finally, the formula makes sure we go back
to the original range of values, and we get
the interval for our particular data set.
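Here is a rough sketch of that last step. It assumes the formula in question is the usual one for a known population standard deviation, sample mean plus or minus z times sigma over the square root of n. The function name and the numbers (a mean of 100, sigma of 15, 25 observations) are made up for illustration and do not come from the lecture's data set.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for the mean, assuming sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # limit on the standard normal
    margin = z * sigma / sqrt(n)              # rescale to the data's units
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 100, known sigma 15, 25 observations.
low, high = z_confidence_interval(sample_mean=100.0, sigma=15.0, n=25)
print(f"{low:.2f} to {high:.2f}")  # 94.12 to 105.88
```

The limits of -1.96 and 1.96 live on the standard normal scale; multiplying by sigma over the square root of n and adding the sample mean is what brings them back to the scale of the data.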
02:06
Okay. What if we are looking at a 90%
confidence interval?
In that case, the interval looks like this
and there is a 10% chance that the true mean
is outside the interval.
02:20
Actually 5% on each side.
02:22
This causes the confidence interval to
shrink.
02:25
So when our confidence is lower, the
confidence interval itself is
smaller. Similarly, for a 99% confidence
interval, we
would have a higher confidence, but a much
larger confidence interval.
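To see the shrinking and growing in numbers, here is a small sketch that compares the three confidence levels on the standard normal scale. It assumes the same two-sided setup as before, with alpha split evenly between the tails; the `widths` dictionary is just a convenience for this example.

```python
from statistics import NormalDist

widths = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper limit; lower limit is -z
    widths[confidence] = 2 * z                # distance from -z to z
    print(f"{confidence:.0%}: z = {z:.3f}, width = {2 * z:.3f}")

# 90%: z = 1.645, width = 3.290
# 95%: z = 1.960, width = 3.920
# 99%: z = 2.576, width = 5.152
```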
02:40
Let's see an example, just to make sure we
have solidified this knowledge.
02:45
I don't know your age, dear student, but I
am 95% confident that you are between 18
and 55 years old.
02:52
Based on the fact that you are taking an
online statistics course.
02:56
That's not much information to begin with.
02:58
Plus, I don't have any information about the
age of any of the students.
03:02
Hence, the wide interval.
03:05
Okay. So, I am 95% confident you are between
18 and 55
years old. Also, I am 99% confident
that you are between 10 and 70 years old.
03:20
I am 100% confident that you are between
zero and 118
years old, which is the age of the oldest
person alive at the time of recording.
03:31
Finally, I am 5% confident that you are 25
years old.
03:37
Obviously, this is a completely arbitrary
number.
03:40
As you can see, there is a trade-off between
the level of confidence and the range of the
interval. A 100% confidence interval is
completely
useless, as I must include all ages possible
in order to gain
100% confidence.
03:58
99% confidence gives me a much narrower
range, but it's still not insightful enough
for this particular problem.
04:05
25 years old, on the other hand, is a pretty
useful estimate as we have an
exact number.
04:10
But the level of confidence of 5% is too
small for us to make use of in any
meaningful analysis.
04:17
There is always a trade-off, which depends
on the problem at hand.
04:22
95% is the accepted norm, as we don't
compromise on accuracy too much
but still get a relatively narrow interval.
04:31
Great. After this short clarification, let's
carry on with more
statistical problems.
04:37
Thanks for watching.