Hypothesis Tests: Determination of Rejection region and Significance Level

by 365 Careers

    Transcript

    00:01 Hi again. So you know what a hypothesis is and you have an idea of how to form the null and alternative hypothesis.

    00:09 By the end of this lesson, we will understand the reason why hypothesis testing works. First, we must define the term significance level.

    00:19 Normally we aim to reject the null if it is false, right.

    00:23 However, as with any test, there is a small chance that we could get it wrong and reject a null hypothesis that is true.

    00:29 The significance level is denoted by alpha and is the probability of rejecting the null hypothesis if it is true.

    00:37 So it is the probability of making this error.

    00:41 Typical values for alpha are 0.01, 0.05, and 0.1.

    00:49 It is a value that you select based on the certainty you need.

    00:53 In most cases, the choice of alpha is determined by the context you are operating in. But 0.05 is the most commonly used value.
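As a rough illustration of what alpha means, the sketch below simulates many samples from a population where the null hypothesis really is true and counts how often a two-sided Z test (the test described later in this lesson) rejects it anyway. The population mean of 70, the standard deviation of 10, the sample size of 30, and the function name `false_rejection_rate` are all assumptions made for the example, not values from the lecture.

```python
import random
from statistics import NormalDist, mean

def false_rejection_rate(alpha=0.05, trials=4000, n=30, seed=0):
    """Estimate how often a two-sided Z test rejects a null that is true."""
    rng = random.Random(seed)
    mu0, sigma = 70.0, 10.0                        # hypothetical population, null is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # cutoff for the chosen alpha
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / std_error
        if abs(z) > z_crit:                        # wrongly rejecting a true null
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"share of true nulls rejected: {rate:.3f}")
```

The printed share should land close to the chosen alpha, which is the sense in which alpha is the probability of this error.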

    01:02 Let's explore an example.

    01:04 Say you need to test if a machine is working properly.

    01:08 You would expect the test to make few or no mistakes.

    01:11 As you want to be very precise, you should pick a low significance level, such as 0.01. The famous Coca-Cola glass bottle is 12 ounces. If the machine pours 12.1 ounces, some of the liquid will be spilled and the label will be damaged as well.

    01:29 So in certain situations, we need to be as accurate as possible.

    01:34 However, if we are analyzing humans or companies, we would expect more random or at least uncertain behavior and hence a higher degree of error.

    01:43 For instance, if we want to predict how much Coca-Cola its consumers drink on average, the difference between 12 ounces and 12.1 ounces will not be that crucial.

    01:52 So we can choose a higher significance level like 0.05 or 0.1. Now that we have an idea about the significance level, let's get to the mechanics of hypothesis testing. Imagine you are consulting a university and want to carry out an analysis on how students are performing on average.

    02:15 The university dean believes that, on average, students have a GPA of 70%.

    02:20 Being the data driven researcher that you are, you can't simply agree with his opinion.

    02:25 So you start testing.

    02:27 The null hypothesis is the population mean grade is 70%.

    02:33 This is a hypothesized value, and we denote it with mu zero.

    02:38 The alternative hypothesis is the population mean grade is not 70%.

    02:43 So mu zero differs from 70%.

    02:48 All right. Assuming that the population of grades is normally distributed, all grades received by students should look this way.

    02:56 That is the true population mean.

    02:58 Now a test we would normally perform is the Z test.

    03:02 The formula is Z equals the sample mean minus the hypothesized mean divided by the standard error.

    03:11 The idea is the following.

    03:14 We are standardizing or scaling the sample mean we got.

    03:17 If the sample mean is close enough to the hypothesized mean, then Z will be close to zero. Otherwise, it will be far away from it.

    03:26 Naturally, if the sample mean is exactly equal to the hypothesized mean, Z will be zero. In all these cases, we would accept the null hypothesis.
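The Z formula above can be sketched in a few lines of Python. This is a minimal sketch that assumes the standard error is the population standard deviation divided by the square root of the sample size (the known-variance case); the function name `z_statistic` and the sample numbers are hypothetical.

```python
from math import sqrt

def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """Z = (sample mean - hypothesized mean) / standard error."""
    standard_error = sigma / sqrt(n)   # assumes the population std dev is known
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical numbers: 36 sampled students, population std dev of 9 points.
print(z_statistic(70.0, 70.0, sigma=9.0, n=36))  # 0.0, sample mean equals the hypothesized mean
print(z_statistic(73.0, 70.0, sigma=9.0, n=36))  # 2.0, sample mean is two standard errors above
```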

    03:37 Okay. The question here is the following.

    03:40 How big should Z be for us to reject the null hypothesis?

    03:45 Well, there is a cutoff line.

    03:47 Since we are conducting a two-sided or two-tailed test, there are two cutoff lines, one on each side. When we calculate Z, we will get a value.

    03:57 If this value falls into the middle part, then we cannot reject the null.

    04:01 If it falls outside in the shaded region, then we reject the null hypothesis.

    04:06 That is why the shaded part is called the rejection region.

    04:12 The area that is cut off actually depends on the significance level.

    04:17 Say the level of significance alpha is 0.05.

    04:20 Then we have alpha divided by two, or 0.025, on the left side and 0.025 on the right side.

    04:30 Now these are values we can check from the Z table.

    04:33 When alpha divided by two is 0.025, Z is 1.96.

    04:38 So 1.96 on the right side and -1.96 on the left side.

    04:44 Therefore, if the value we get for Z from the test is lower than -1.96 or higher than 1.96, we will reject the null hypothesis, otherwise we will accept it.
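Instead of reading the cutoff from a printed Z table, it can be computed from the standard normal distribution. The sketch below uses Python's `statistics.NormalDist`; the helper names `two_tailed_critical_value` and `reject_two_tailed` are made up for this example.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """Positive cutoff z such that each tail has area alpha / 2."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_two_tailed(z, alpha=0.05):
    """True if z falls in either tail of the rejection region."""
    return abs(z) > two_tailed_critical_value(alpha)

print(round(two_tailed_critical_value(0.05), 2))  # 1.96
print(reject_two_tailed(2.0))    # True: beyond the right cutoff
print(reject_two_tailed(-1.2))   # False: inside the middle part
```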

    04:57 That's more or less how hypothesis testing works.

    05:01 We scale the sample mean with respect to the hypothesized value.

    05:06 If Z is close to zero, then we cannot reject the null.

    05:09 If it is far away from zero, then we reject the null hypothesis.

    05:15 OK. What about one-sided tests? We have those too.

    05:20 Let's take the example from last lecture.

    05:23 Paul says data scientists earn more than 125,000.

    05:28 So H zero is: mu zero is bigger than 125,000. The alternative is that mu zero is lower than or equal to 125,000.

    05:42 Using the same level of significance, this time the whole rejection region is on the left.

    05:48 So the rejection region has an area of alpha.

    05:52 Looking at the Z table, that corresponds to a Z score of 1.645, and since it is on the left, it is with a minus sign.

    06:01 Now when calculating our test statistic Z, if we get a value lower than -1.645, we would reject the null hypothesis as we have statistical evidence that the data scientist salary is less than 125,000.

    06:16 Otherwise, we would accept it.
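For the left-tailed case, the whole area alpha sits in one tail, so the cutoff comes straight from the inverse CDF at alpha. The sketch below checks this and applies it to made-up salary figures; the sample mean of 120,000, the standard deviation of 15,000, and the sample size of 36 are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
left_cutoff = NormalDist().inv_cdf(alpha)   # whole area alpha in the left tail
print(round(left_cutoff, 3))                # -1.645

# Hypothetical sample: mean salary 120,000 from 36 data scientists,
# with an assumed known population std dev of 15,000.
sample_mean, hypothesized_mean, sigma, n = 120_000, 125_000, 15_000, 36
z = (sample_mean - hypothesized_mean) / (sigma / sqrt(n))
reject = z < left_cutoff
print(z, reject)                            # -2.0 True
```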

    06:20 All right. To exhaust all possibilities, let's explore another one-tailed test. Say the university dean told you that the average GPA students get is lower than 70%.

    06:33 In that case, the null hypothesis is that mu zero is lower than 70%, while the alternative is that mu zero is bigger than or equal to 70%. In this situation, the rejection region is on the right side. So if the test statistic is bigger than the cutoff Z score, we would reject the null.

    06:55 Otherwise, we wouldn't.
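The three cases from this lesson (two-tailed, left-tailed, right-tailed) differ only in where the rejection region sits. One way to summarize them is a single helper, sketched below; the name `rejects_null` and its `tail` argument are hypothetical and not part of any library.

```python
from statistics import NormalDist

def rejects_null(z, alpha=0.05, tail="two"):
    """Return True if the test statistic z falls in the rejection region."""
    std_normal = NormalDist()
    if tail == "two":     # area alpha / 2 in each tail
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "left":    # whole area alpha on the left
        return z < std_normal.inv_cdf(alpha)
    if tail == "right":   # whole area alpha on the right
        return z > std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")

print(rejects_null(1.8, tail="right"))  # True: 1.8 is above the right cutoff of about 1.645
print(rejects_null(1.8, tail="two"))    # False: 1.8 is inside the cutoffs of about -1.96 and 1.96
```

The same Z value can lead to different decisions depending on the tail, which is why the hypotheses have to be fixed before looking at the test statistic.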

    06:58 Cool. That's all for now.

    07:00 In a lesson or two, we'll start testing.

    07:03 Just hold on a bit, and thanks for watching.


    About the Lecture

    The lecture Hypothesis Tests: Determination of Rejection region and Significance Level by 365 Careers is from the course Statistics for Data Science and Business Analysis (EN).


    Author of lecture Hypothesis Tests: Determination of Rejection region and Significance Level

     365 Careers

