00:01
It's time for a short break from all these
numbers and calculations.
00:05
I would like to tell you a story.
00:09
William Gosset was an English statistician
who worked for the brewery of Guinness.
00:15
He developed different methods for the
selection of the best yielding varieties of
barley, an important ingredient when making
beer.
00:23
Gosset found big samples tedious, so he
was trying to develop a way to
work with small samples, but still come up
with meaningful predictions.
00:32
He was a curious and productive researcher
and published a number of papers that are
still relevant today.
00:39
However, due to the Guinness company policy,
he was not allowed to sign the papers with
his own name. Therefore, all of his work was
published under the pen name
Student. Later on, a friend of his and a
famous
statistician, Ronald Fisher, building on the
findings of Gosset, introduced
the T statistic, and the name that stuck with
the corresponding distribution,
01:02
even today, is Student's T.
01:05
The Student's T distribution is one of the
biggest breakthroughs in statistics as it
allowed inference through small samples with
an unknown population variance.
01:15
This setting can be applied to a big part of
the statistical problems we face today and
is an important part of this course.
01:24
Visually, the Student's T distribution looks
much like a normal distribution, but
generally has fatter tails.
01:32
Fatter tails, as you may remember, allow
for a higher dispersion of
variables, which means there is more uncertainty.
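To make the "fatter tails" remark concrete, here is a minimal sketch that compares the density of the Student's T distribution with the standard normal density at a point far from the center. The function name `t_pdf`, the evaluation point 3.0, and the chosen degrees of freedom are illustrative assumptions, not something taken from the lecture.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the Student's T distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


x = 3.0  # a point out in the tail (illustrative choice)
print(f"standard normal density at {x}: {NormalDist().pdf(x):.5f}")
for df in (2, 5, 30):  # illustrative degrees of freedom
    print(f"T density at {x} with df={df}: {t_pdf(x, df):.5f}")
```

With few degrees of freedom the T density in the tail is noticeably higher than the normal one, and the gap narrows as the degrees of freedom grow.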
01:40
In the same way that the Z statistic is
related to the standard normal distribution,
the T statistic is related to the Student's
T distribution.
01:49
The formula that allows us to calculate it is:
T with n minus 1
degrees of freedom and a significance level
of alpha equals the sample
mean minus the population mean, divided by
the standard error of
the sample mean. As you can see, it is very
similar to the Z
statistic. After all, the Student's T distribution
is an approximation of the normal distribution.
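The spoken formula can be sketched in code as follows: the T statistic is the sample mean minus the population mean, divided by the standard error, where the standard error is the sample standard deviation over the square root of n. The function name `t_statistic` and the sample values are made up for illustration only.

```python
import math
import statistics


def t_statistic(sample, population_mean):
    """Return the T statistic and its degrees of freedom (n - 1)."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    sample_std = statistics.stdev(sample)
    standard_error = sample_std / math.sqrt(n)
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1


# Hypothetical data: four observations tested against a population mean of 3.
sample = [2.0, 4.0, 6.0, 8.0]
t, df = t_statistic(sample, 3.0)
print(f"t = {t:.4f} with {df} degrees of freedom")
```

The only difference from the Z statistic is that the population standard deviation is replaced by the one estimated from the sample.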
02:14
The last characteristic of the Student's T
statistic is that there are degrees of
freedom. Usually for a sample of n, we have
n minus
1 degrees of freedom.
02:26
So for a sample of 20 observations, the
degrees of freedom are
19. Much like the standard normal
distribution table,
02:35
we also have a Student's T table.
02:38
Here it is. The rows indicate different
degrees of freedom,
02:43
abbreviated as DF, while the columns indicate common
alphas.
02:48
Please note that after the 30th row, the
numbers don't vary that much.
02:53
Actually, after 30 degrees of freedom, the
T statistic table becomes almost the
same as the Z statistic table.
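The convergence of the T table toward the Z table can be checked numerically. The sketch below uses only the Python standard library: it integrates the T density with Simpson's rule and finds critical values by bisection. The helper names (`t_pdf`, `t_cdf`, `t_critical`), the upper-tail probability of 0.025, and the search bracket of 0 to 100 are all assumptions made for this illustration; a statistics library would normally be used instead.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the Student's T distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    n = 2 * max(100, math.ceil(x * 50))  # even number of intervals
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3  # the density is symmetric around zero


def t_critical(df, tail_prob):
    """Value with upper-tail probability tail_prob, found by bisection."""
    lo, hi = 0.0, 100.0  # assumed bracket, wide enough for common table entries
    target = 1.0 - tail_prob
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


tail_prob = 0.025  # illustrative upper-tail alpha
z_value = NormalDist().inv_cdf(1.0 - tail_prob)
for df in (5, 19, 30, 50, 100):
    t_value = t_critical(df, tail_prob)
    print(f"df={df:>3}: t={t_value:.3f}  z={z_value:.3f}  gap={t_value - z_value:.3f}")
```

The gap between the T and Z critical values shrinks steadily as the degrees of freedom increase, which is why the rows of the table change so little past the 30th.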
03:01
As the degrees of freedom depend on the
sample size,
03:03
in essence, the bigger the sample, the
closer we get to the actual numbers.
03:08
A common rule of thumb is that for a sample
containing more than 50 observations, we
use the Z table instead of the T table.
03:16
All right, great.
03:18
In our next lecture, we will apply our new
knowledge in practice.