00:01
Correlation adjusts covariance so that the
relationship between the two variables
becomes easy and intuitive to interpret.
00:08
The formula for the correlation coefficient
is the covariance divided by the product of
the standard deviations of the two
variables.
00:16
There is a sample version and a population
version of the formula, depending on the data
you are working with.
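Written out, the two versions could look like the following. The symbols are an assumed, conventional notation and are not taken from the lesson itself: r and s denote the sample correlation and sample standard deviations, rho and sigma their population counterparts.

```latex
% Sample correlation coefficient (assumed notation)
r_{xy} = \frac{s_{xy}}{s_x \, s_y}

% Population correlation coefficient (assumed notation)
\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \, \sigma_y}
```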
00:20
We already have the standard deviations of
the two data sets.
00:24
Now we'll use the formula in order to find
the sample correlation coefficient.
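As a minimal sketch of that calculation in Python: the function name `sample_correlation` and the size and price figures below are hypothetical, made up for illustration, and are not the data set used in the lesson.

```python
from statistics import mean, stdev


def sample_correlation(x, y):
    """Sample covariance divided by the product of the sample standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least two values")
    mean_x, mean_y = mean(x), mean(y)
    # Sample covariance uses n - 1 in the denominator.
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    # stdev() is the sample standard deviation (also n - 1).
    return covariance / (stdev(x) * stdev(y))


# Hypothetical data: house sizes and prices (illustrative units only).
sizes = [50, 70, 90, 110, 130]
prices = [150, 200, 210, 300, 310]

r = sample_correlation(sizes, prices)
print(round(r, 2))
```

Whatever data you feed in, the result should fall between minus one and one, which is the property discussed next.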
00:30
Mathematically, there is no way to obtain a
correlation value greater than one or less
than minus one. Remember the coefficient of
variation we talked about a couple of lessons
ago? Well, this concept is similar.
00:43
We manipulated the strange covariance value
in order to get something intuitive.
00:48
Let's examine it for a bit.
00:50
We got a sample correlation coefficient of
0.87.
00:53
So there is a strong positive relationship
between the two variables.
00:57
A correlation of one, also known as perfect
positive correlation, means that the entire
variability of one variable is explained by
the other variable.
01:06
However, logically, we know that size
determines the price.
01:10
On average, the bigger the house you build,
the more expensive it will be.
01:15
This relationship goes only one way.
01:18
Once a house is built, if for some reason it
becomes more expensive, its size
doesn't increase, although there is a
positive correlation.
01:27
Okay. A correlation of zero between two
variables means that there is no linear
relationship between them. Variables that are
completely independent of each other have a
correlation of zero.
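One caveat is that the correlation coefficient only measures linear relationships, so a value of zero does not by itself prove that two variables are unrelated. The sketch below uses made-up numbers where y is fully determined by x (y is x squared), yet the sample correlation comes out to zero. The helper `sample_correlation` is a hypothetical name.

```python
from statistics import mean, stdev


def sample_correlation(x, y):
    """Sample covariance divided by the product of the sample standard deviations."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    return covariance / (stdev(x) * stdev(y))


# Hypothetical data: y depends entirely on x, but not in a linear way.
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]

r = sample_correlation(x, y)
print(r)  # zero: no linear relationship, even though y is a function of x
```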
01:33
We would expect a correlation of zero
between the price of coffee in Brazil and
the price of houses in London.
01:39
Right. The two variables don't have anything
in common.
01:44
Finally, we can have a negative correlation
coefficient.
01:48
It can be a perfect negative correlation of
minus one or, much more likely, an imperfect
negative correlation with a value between
minus one and zero.
01:57
Think of the following businesses: a company
producing ice cream and a company selling
umbrellas. Ice cream tends to sell more
when the weather is very good, and people buy
umbrellas when it's rainy.
02:08
Obviously, there is a negative correlation
between the two: when one of the companies
makes more money, the other tends to make
less.
02:17
All right. Before we continue, we must note
that the correlation between two
variables X and Y is the same as the
correlation between Y and X.
02:26
The formula is completely symmetrical with
respect to both variables.
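A quick way to convince yourself of the symmetry is to swap the arguments and compare. This is a sketch with hypothetical size and price figures and a hypothetical helper name, not the lesson's own data.

```python
import math
from statistics import mean, stdev


def sample_correlation(x, y):
    """Sample covariance divided by the product of the sample standard deviations."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    return covariance / (stdev(x) * stdev(y))


# Hypothetical data, for illustration only.
size = [50, 70, 90, 110, 130]
price = [150, 200, 210, 300, 310]

r_size_price = sample_correlation(size, price)
r_price_size = sample_correlation(price, size)

# Swapping the two variables leaves the coefficient unchanged.
print(math.isclose(r_size_price, r_price_size))
```

The coefficient itself therefore says nothing about which variable drives the other.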
02:30
Therefore, the correlation of price and size
is the same as the one of size and
price. This leads us to causality.
02:38
It is very important for any analyst or
researcher to understand the direction of
causal relationships.
02:45
In the housing business, size causes the
price and not vice versa.
02:49
We will explore this topic in much more
detail in the regression analysis section
later on. For now, you only need to realize
that correlation does not imply causation.
03:01
Okay. Very good.
03:03
With this example, we conclude the section
on descriptive statistics.
03:08
In the next lesson, you will see a real-life
database example that applies all the
knowledge you acquired in this section.
03:15
You definitely don't want to miss it.