00:00
All right, great.
00:02
Now that we know what the p-value is and how
to use it, we will get back to hypothesis
testing. We saw only one of two possible
cases.
00:11
Remember, we haven't covered the more
commonly observed case when the population
variance is unknown.
00:18
All right. Imagine you are the marketing
analyst of a company and that you've been
asked to estimate if the email open rate of
one of the firm's competitors is above your
company's. Your company has an open rate of
40%.
00:33
An email open rate is a measure of how many
people on the email list actually open the
emails they receive.
00:40
At first, you struggle to figure out how to
get such specific information about a
competitor company.
00:46
But then you see that an employee of that
competitor company posted a selfie on
Facebook saying, "LOL.
00:52
The email management software we are using
drives me nuts."
00:57
In the background, you can see her screen,
and it shows clearly the summaries of the
last ten email campaigns that were sent and
their corresponding open
rates. Bingo.
01:09
With your statistical skills, that's all you
need.
01:12
A little help from Facebook.
01:16
Let's state the hypotheses.
01:18
Null hypothesis.
01:20
The mean open rate is lower than or equal to 40%.
01:24
Alternative hypothesis: the mean open rate is
higher than 40%.
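Written compactly, the two hypotheses just stated look like this. The symbol \mu is an assumed label for the competitor's true mean open rate; the lesson itself does not name it.

```latex
H_0:\ \mu \le 40\% \qquad\qquad H_1:\ \mu > 40\%
```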
01:30
Note that in hypothesis testing, we are
aiming to reject the null hypothesis.
When we want to test if the open rate is
higher than 40%,
01:38
the null hypothesis actually states the
opposite.
01:42
Also pay attention that this time, we are
dealing with a one-sided test.
01:49
All right. Your boss told you that 0.05 is
an adequate significance
level for this test, so that's what you'll
use.
01:59
Here's the data set.
02:00
You calculate the sample mean and get 37.7%.
02:05
The sample standard deviation is 13.74%.
02:09
Thus, the standard error is 4.34%.
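As a quick check on that last number, here is a minimal sketch of the standard error calculation. It uses only the summary figures quoted in the lesson, since the raw data set is not reproduced in this transcript. The sample size of 10 comes from the ten campaigns on the screen, and the variable names are illustrative.

```python
import math

# Summary statistics quoted in the lesson (raw data not reproduced here).
sample_std = 13.74  # sample standard deviation, in percentage points
n = 10              # ten email campaigns were visible on the screen

# Standard error of the mean: s / sqrt(n)
standard_error = sample_std / math.sqrt(n)
print(f"standard error = {standard_error:.2f}%")
```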
02:14
You assume that the population of open rates
of sent emails is normally distributed.
As with confidence intervals, with variance
unknown and a small sample,
02:23
the correct statistic to use is the t
statistic.
02:27
Remember, you do not know the variance and
the sample is not big enough.
02:32
This means that the variable follows the
Student's t-distribution, and you must employ
the t-statistic.
02:40
Let's calculate it. We calculate the t-score
the same way as the z-score.
02:46
The score is equal to the sample mean minus
the hypothesized mean
value divided by the standard error.
02:55
The result that we get is -0.53.
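The same arithmetic can be sketched in a few lines. This assumes only the summary figures from the lesson (sample mean 37.7, sample standard deviation 13.74, ten campaigns, hypothesized mean 40); the variable names are illustrative.

```python
import math

sample_mean = 37.7        # percent
hypothesized_mean = 40.0  # percent, your own company's open rate
sample_std = 13.74        # percentage points
n = 10                    # number of campaigns in the sample

# t-score: (sample mean - hypothesized mean) / standard error
standard_error = sample_std / math.sqrt(n)
t_score = (sample_mean - hypothesized_mean) / standard_error
print(f"t = {t_score:.2f}")
```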
03:00
As we said earlier, it is easier to work
with positive numbers.
03:03
So we should compare the absolute value of
-0.53 with the
appropriate t with n minus one degrees of
freedom (here, 10 - 1 = 9) at
0.05 one-sided significance.
03:17
We quickly navigate through the table and get
1.83 as the critical value at the 5%
significance level.
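If no table is at hand, the critical value can be approximated numerically. The sketch below is one assumed approach, not the lesson's method: it integrates the t density with Simpson's rule and finds the cutoff by bisection, using only the standard library. The helper names t_pdf, t_cdf and t_critical are hypothetical. Statistical libraries such as SciPy offer the same quantity directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry around zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df):
    """Upper-tail cutoff c with P(T > c) = alpha, for alpha below 0.5."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the upper-tail probability
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

critical = t_critical(alpha=0.05, df=9)
print(f"critical t = {critical:.2f}")
```

The result should agree with the table value of 1.83 to two decimals.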
03:25
OK, 0.53 is lower than 1.83.
03:30
Remember the decision rule?
If the absolute value of the T score is
lower than the statistic from the table,
we cannot reject the null hypothesis.
03:40
Therefore, we retain it.
03:44
What you do next is you go and tell your
boss that at this level of significance,
statistically, we cannot say that the email
open rate of our competitors is higher than
40%. OK. What
about the second measurement we saw?
What was that?
Ah yes.
04:02
The p-value.
04:04
The p-value of this statistic is 0.304.
04:09
As the p-value is greater than the
significance level of 0.05, we
come to the same conclusion.
04:16
We cannot reject the null hypothesis.
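To see where a p-value like this comes from, here is a minimal sketch that follows the lesson's convention of working with the absolute value of the t-score, so it computes the one-tailed probability P(T > |t|) with 9 degrees of freedom. The numerical integration and the helper names t_pdf and t_cdf are assumptions for illustration; a statistics library would normally do this step.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry around zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

t_score = -0.53  # t-score from the lesson
df = 9           # n - 1 with n = 10 campaigns
alpha = 0.05     # significance level chosen by the boss

# One-tailed p-value on the absolute t-score, as in the lesson.
p_value = 1 - t_cdf(abs(t_score), df)
print(f"p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "cannot reject H0")
```

The printed value should land close to the 0.304 quoted above. Because the t-score is negative while the alternative says "higher", the strict right-tail probability P(T > -0.53) is one minus this value, roughly 0.70, which is even further from rejection, so the conclusion is the same either way.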
04:20
Let's do a quick check.
04:23
If the significance level were 0.01, the p-value
would still be higher, and we
wouldn't reject the null hypothesis.
04:30
This is an important observation that we
haven't noted before.
04:34
If we cannot reject the null hypothesis at 0.05
significance, we cannot reject it at
smaller levels either.
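This observation follows directly from the p-value rule: we reject only when the p-value is below the significance level, so a p-value that is too large for 0.05 is also too large for anything smaller. A short sketch, using the p-value quoted in the lesson and an illustrative set of levels:

```python
p_value = 0.304  # p-value quoted in the lesson

# Rule: reject the null hypothesis only when p_value < alpha.
decisions = {}
for alpha in (0.05, 0.025, 0.01):
    decisions[alpha] = "reject H0" if p_value < alpha else "cannot reject H0"
    print(f"alpha = {alpha}: {decisions[alpha]}")
```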
04:42
All right. That's all for now.
04:43
Make sure you learn the material by doing
the exercises after this lesson.
04:48
Thanks for watching.