A normal distribution is a continuous, bell-shaped distribution in which the mean, mode and median are all the same. We represent a normal distribution with mean $\mu$ and standard deviation $\sigma$ (and therefore variance $\sigma^2$) as: $$ \begin{equation}\begin{aligned} X\sim N(\mu,\sigma^2)\\ \end{aligned}\end{equation} $$
The graph is symmetric about the mean and the total area under the graph is $1$.
For example, if the masses of cows are normally distributed with a mean of $55kg$ and a standard deviation of $3kg$, the distribution of the masses is represented as: $$ \begin{equation}\begin{aligned} X\sim N(55kg,9kg^2)\\ \end{aligned}\end{equation} $$
Notice that the variance is stated here with a unit ($kg^2$) which is the square of the unit for the quantity being modelled (mass of the cows). This is because variance ($\sigma^2$) is the square of standard deviation ($\sigma$) which has units $kg$.
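As a minimal sketch, the cow example can be reproduced with `NormalDist` from Python's standard `statistics` module. The variable name `cow_mass` is only illustrative. Note that `NormalDist` takes the standard deviation, not the variance, as its second argument.

```python
from statistics import NormalDist

# Masses of cows: mean 55 kg, standard deviation 3 kg (values from the example).
cow_mass = NormalDist(mu=55, sigma=3)

# The variance is the square of the standard deviation: 3**2 = 9 (in kg^2).
print(cow_mass.stdev)
print(cow_mass.variance)

# For a normal distribution the mean, median and mode coincide.
print(cow_mass.mean, cow_mass.median, cow_mass.mode)
```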
The standard normal distribution
This is an important variation of the normal distribution whereby the mean is $0$ and the standard deviation is $1$: $$ \begin{equation}\begin{aligned} Z\sim N(0,1)\\ \end{aligned}\end{equation} $$
Any normal distribution can be related to the standard normal distribution: each value $X$ in the original distribution corresponds to a value $Z$ given by: $$ \begin{equation}\begin{aligned} Z=\frac{X-\mu}{\sigma}\\ \end{aligned}\end{equation} $$
The $Z$ score corresponding to an $X$ value tells us how many standard deviations that $X$ value is from the mean ($\mu$). For example, a $Z$ value of $1$ means that the $X$ value is exactly $1$ standard deviation to the right of the mean.
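The conversion can be sketched in a few lines of Python. The function name `z_score` is a hypothetical helper, and the inputs reuse the cow example (mean $55kg$, standard deviation $3kg$).

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    return (x - mu) / sigma

# A 58 kg cow is one standard deviation to the right of the mean.
print(z_score(58, 55, 3))  # 1.0

# A 49 kg cow is two standard deviations to the left of the mean.
print(z_score(49, 55, 3))  # -2.0
```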
Q1: What does a $Z$ value of $-2$ mean?
- $2$ s.d. away from the left tail of the graph
- $2$ s.d. to the right of the mean
- $1$ s.d. to the right of the mean
- $2$ s.d. to the left of the mean
A negative $Z$ score implies that the value is to the left of the mean, so a $Z$ value of $-2$ is $2$ s.d. to the left of the mean.
Phi tables
We can find the probability that an $X$ value chosen at random is less than a given value $x_1$ by finding the corresponding value $a=\frac{x_1-\mu}{\sigma}$ on the standard normal distribution: $$ \begin{equation}\begin{aligned} P(X<x_1)&=P(Z<a)\\ \end{aligned}\end{equation} $$
The probability that $Z<a$ can be read from a table of standard values: $$ \begin{equation}\begin{aligned} P(Z<a)=\phi(a)\\ \end{aligned}\end{equation} $$
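When no printed table is at hand, the same value can be computed. The sketch below assumes Python's standard `statistics.NormalDist` as a stand-in for the table. The helper name `phi` and the threshold of $58kg$ are illustrative choices, not part of the original example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def phi(a):
    """P(Z < a) for the standard normal distribution."""
    return standard_normal.cdf(a)

# Cow example: P(X < 58) with mean 55 and standard deviation 3.
mu, sigma = 55, 3
x1 = 58
a = (x1 - mu) / sigma  # standardise x1 first

print(round(phi(a), 4))  # 0.8413

# Working directly on the original distribution gives the same probability.
print(round(NormalDist(mu, sigma).cdf(x1), 4))
```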
Several properties follow from this idea when combined with what we know about normal distributions (symmetry about the mean and total area being $1$):
- $P(Z>a)=1-P(Z<a)=1-\phi(a)$
- $P(Z<-a)=P(Z>a)$
- $P(a<Z<b)=P(Z<b)-P(Z<a)=\phi(b)-\phi(a)$
- $P(-b<Z<-a)=P(a<Z<b)$
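These properties can be checked numerically. The sketch below uses `statistics.NormalDist` in place of a phi table, and the values $a=0.5$ and $b=1.5$ are arbitrary choices for illustration.

```python
from math import isclose
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal: mean 0, standard deviation 1

a, b = 0.5, 1.5

# P(Z > a) = 1 - phi(a), and by symmetry it equals P(Z < -a).
upper_tail = 1 - phi(a)
print(isclose(phi(-a), upper_tail))

# P(a < Z < b) = phi(b) - phi(a), and it equals P(-b < Z < -a).
between = phi(b) - phi(a)
mirrored = phi(-a) - phi(-b)
print(isclose(between, mirrored))
```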
Sometimes we are given phi tables with only positive or only negative values. With a positive-only phi table, we cannot read the phi of a negative value directly, but we can use the properties of the graph to conclude that: $$ \begin{equation}\begin{aligned} \phi(-a)&=P(Z<-a)\\ &=P(Z>a)\\ &=1-P(Z<a)\\ &=1-\phi(a)\\ \end{aligned}\end{equation} $$
Therefore: $$ \begin{equation}\begin{aligned} \phi(-a)&=1-\phi(a)\\ \phi(a)&=1-\phi(-a)\\ \end{aligned}\end{equation} $$
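As an illustration, a lookup on a positive-only table might be sketched as follows. The dictionary `POSITIVE_TABLE` is a hypothetical five-entry excerpt holding standard four-decimal table values, and `phi_from_positive_table` is an illustrative helper name.

```python
# Hypothetical excerpt of a positive-only phi table (values to 4 d.p.).
POSITIVE_TABLE = {0.0: 0.5000, 0.5: 0.6915, 1.0: 0.8413, 1.5: 0.9332, 2.0: 0.9772}

def phi_from_positive_table(a):
    """Look up phi(a), using phi(-a) = 1 - phi(a) for negative a."""
    if a >= 0:
        return POSITIVE_TABLE[a]
    return 1 - POSITIVE_TABLE[-a]

print(phi_from_positive_table(1.0))             # 0.8413
print(round(phi_from_positive_table(-2.0), 4))  # 0.0228
```

A real table has many more rows, and values that fall between rows are usually handled by rounding $a$ to the table's precision.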
The Empirical Rule
This rule states that, for a normal distribution, approximately:
- 68% of values fall within $1$ standard deviation of the mean
- 95% of values fall within $2$ standard deviations of the mean
- 99.7% of values fall within $3$ standard deviations of the mean
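The percentages in this rule are rounded. A short sketch, again assuming `statistics.NormalDist` in place of a phi table, computes $P(-k<Z<k)=\phi(k)-\phi(-k)$ for $k=1,2,3$; the helper name `within` is illustrative.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal cumulative probability

def within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return phi(k) - phi(-k)

for k in (1, 2, 3):
    print(k, round(within(k) * 100, 2))
# 1 68.27
# 2 95.45
# 3 99.73
```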