Central Limit Theorem

by 365 Careers

    Transcript

    00:00 Before we continue, let's introduce a concept, a sampling distribution.

    00:06 Say you have the population of used cars in a car shop.

    00:09 We want to analyze the car prices and be able to make some predictions on them.

    00:14 Population parameters which may be of interest are mean car price, standard deviation of prices, covariance and so on.

    00:23 Normally in statistics we would not have data on the whole population, but rather just a sample. Let's draw a sample out of that data.

    00:32 The mean is $2,617.23.

    00:38 Now a problem arises from the fact that if I take another sample, I may get a completely different mean: $3,201.34.

    00:49 Then a third with a mean of $2,844.33.

    00:56 As you can see, the sample mean depends on which observations end up in the sample itself.

    01:00 So taking a single value, as we did in descriptive statistics, is definitely suboptimal. What we can do is draw many, many samples and create a new data set comprised of sample means.

    01:14 These values are distributed in some way.

    01:16 So we have a distribution.

    01:19 When we are referring to a distribution formed by samples, we use the term sampling distribution. For our case, we can be even more precise.

    01:28 We are dealing with a sampling distribution of the mean.
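The procedure described so far can be sketched in Python. This is a minimal illustration and does not use the lecture's data: the `population` list is a synthetic, right-skewed set of prices drawn from a log-normal distribution, and its parameters, the sample size of 50, and the helper `sample_mean` are all assumptions chosen for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical right-skewed "population" of used-car prices (not the lecture's data).
population = [random.lognormvariate(7.8, 0.5) for _ in range(10_000)]


def sample_mean(data, n):
    """Mean of one random sample of size n, drawn without replacement."""
    return statistics.fmean(random.sample(data, n))


# Three separate samples give three different means.
three_means = [sample_mean(population, 50) for _ in range(3)]
print("three sample means:", [round(m, 2) for m in three_means])

# Many samples give a new data set of sample means:
# an empirical sampling distribution of the mean.
sample_means = [sample_mean(population, 50) for _ in range(2_000)]
print("number of sample means:", len(sample_means))
print("smallest and largest:", round(min(sample_means), 2), round(max(sample_means), 2))
```

Each run of `sample_mean` plays the role of "taking another sample" in the lecture, and `sample_means` is the new data set whose distribution is being discussed.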

    01:33 So far, so good.

    01:35 Now, if we inspect these values closely, we will realize that they are different but are concentrated around a certain value.

    01:43 Right? For our case, somewhere around $2,800.

    01:48 Since each of these sample means is nothing but an approximation of the population mean, the value they revolve around is actually the population mean itself.

    01:58 Most probably none of them is the population mean, but taken together, they give a really good idea.

    02:05 In fact, if we take the average of those sample means, we expect to get a very precise approximation of the population mean.
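A quick simulation can make this claim concrete. The sketch below again assumes a synthetic log-normal population (not the lecture's car prices), where the true population mean is known, and compares it with the average of 2,000 sample means. The sample size of 50 and the number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Synthetic right-skewed "population" of prices (an assumption, not the lecture's data).
population = [random.lognormvariate(7.8, 0.5) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Draw 2,000 samples of 50 cars each and record every sample mean.
sample_means = [
    statistics.fmean(random.sample(population, 50)) for _ in range(2_000)
]
mean_of_means = statistics.fmean(sample_means)

print(f"population mean:      {population_mean:,.2f}")
print(f"one sample mean:      {sample_means[0]:,.2f}")
print(f"mean of sample means: {mean_of_means:,.2f}")
```

A single sample mean can miss the population mean by a noticeable amount, while the average of many sample means should land very close to it.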

    02:12 Great. Let me give you some more information.

    02:17 Here's a plot of the distribution of the car prices.

    02:20 We haven't seen many distributions, but we know that this is not a normal distribution.

    02:26 It has a right skew, and that's about all we can see.

    02:30 Here's the big revelation.

    02:33 It turns out that if we visualize the distribution of the sample means, we get something else. Something familiar, something useful.

    02:42 A normal distribution.

    02:45 And that's what the central limit theorem states.

    02:49 No matter the distribution of the population (binomial, uniform, exponential, or another one), the sampling distribution of the mean will approximate a normal distribution. Not only that, but its mean is the same as the population mean.
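One rough way to check this numerically, without plotting, is to compare the skewness of the population with the skewness of the sample means. The sketch below uses a synthetic log-normal population as a stand-in for the right-skewed price data. The helpers `skewness` and `share_within_one_sd` are hypothetical names written for this example, and neither check is a formal normality test.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mu) / sd) ** 3 for v in values])


def share_within_one_sd(values):
    """Fraction of values within one standard deviation of the mean."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mu) <= sd for v in values) / len(values)


# Synthetic right-skewed population (an assumption, not the lecture's data).
population = [random.lognormvariate(7.8, 0.5) for _ in range(10_000)]

# Sampling distribution of the mean, built from 2,000 samples of size 50.
sample_means = [
    statistics.fmean(random.sample(population, 50)) for _ in range(2_000)
]

pop_skew = skewness(population)
means_skew = skewness(sample_means)
means_share = share_within_one_sd(sample_means)

print(f"skewness of population:   {pop_skew:.2f}")
print(f"skewness of sample means: {means_skew:.2f}")
print(f"share of sample means within one sd: {means_share:.2%}")
```

For a normal distribution the skewness is 0 and about 68% of values lie within one standard deviation of the mean, so the sample means should come out much closer to those figures than the raw prices do.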

    03:05 That's something we already noticed.

    03:07 What about the variance? Well, it depends on the size of the samples we draw, but it is quite elegant.

    03:14 It is the population variance divided by the sample size.
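In symbols, the statements above can be written compactly. The notation is an assumption, since the lecture gives the result only in words: $\mu$ and $\sigma^2$ are the population mean and variance, $n$ is the sample size, and $\bar{X}$ is the sample mean.

```latex
\bar{X} \;\overset{\text{approx.}}{\sim}\; N\!\left(\mu,\ \frac{\sigma^2}{n}\right),
\qquad
\operatorname{E}[\bar{X}] = \mu,
\qquad
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}.
```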

    03:19 Since the sample size is in the denominator, the bigger the sample size, the lower the variance. Or, in other words, the closer the approximation we get.

    03:29 So if you are able to draw bigger samples, your statistical results will be more accurate. As a rule of thumb, for the CLT to apply we need a sample size of at least 30 observations.
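The effect of the sample size on the variance can be checked with a small simulation. The sketch below assumes a synthetic log-normal population and a hypothetical helper `variance_of_sample_means`. It draws samples with replacement so that the observations are independent, then compares the observed variance of the sample means with the population variance divided by the sample size.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Synthetic right-skewed population (an assumption, not the lecture's data).
population = [random.lognormvariate(7.8, 0.5) for _ in range(10_000)]
population_variance = statistics.pvariance(population)


def variance_of_sample_means(data, n, repetitions=3_000):
    """Empirical variance of the mean over many samples of size n.

    Samples are drawn with replacement so the draws are independent.
    """
    means = [
        statistics.fmean(random.choices(data, k=n)) for _ in range(repetitions)
    ]
    return statistics.pvariance(means)


results = {}
for n in (5, 30, 100):
    observed = variance_of_sample_means(population, n)
    predicted = population_variance / n  # population variance / sample size
    results[n] = (observed, predicted)
    print(f"n={n:>3}  observed={observed:>12,.0f}  predicted={predicted:>12,.0f}")
```

The observed and predicted columns should roughly agree for each `n`, and both shrink as the samples get larger. The simulation illustrates the variance formula only; it says nothing about where exactly the 30-observation rule of thumb comes from.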

    03:42 Great. Finally, let's finish off with why the central limit theorem is so important.

    03:49 As we already know, the normal distribution has elegant statistics and an unmatched applicability in calculating confidence intervals and performing tests.

    03:58 The central limit theorem allows us to perform tests, solve problems and make inferences using the normal distribution, even when the population is not normally distributed. The discovery and proof of the theorem revolutionized statistics as a field, and we will be relying on it a lot in the subsequent lectures.

    04:17 That's all for now.

    04:19 Thanks for watching.


    About the Lecture

    The lecture Central Limit Theorem by 365 Careers is from the course Statistics for Data Science and Business Analysis (EN).

