Normal Distribution Essay
The normal distribution is one of the most widely applied concepts in statistics. Abraham de Moivre, an 18th-century statistician and consultant to gamblers, noticed that as the number of events (N) increased, the shape of the binomial distribution approached a very smooth curve.
He reasoned that a mathematical expression for this curve would make it much easier to find probabilities such as that of “60 or more heads out of 100 coin flips.” Following this idea, de Moivre arrived at the curve now called the “normal curve”: a curve drawn through the midpoints of the tops of the bars in a histogram of normally distributed data.
One of the first applications of the normal distribution was to errors of measurement in astronomical observations. In the seventeenth century, Galileo drew several conclusions about the errors in measured distances to stars: small errors are more likely to occur than large errors, errors are symmetric (overestimates and underestimates are equally likely), and observations tend to gather around the true value. These properties turned out to be characteristics of the normal distribution, and the formula for the normal distribution, found independently by Adrain and Gauss, fit the errors well. In 1778, Laplace, a mathematician and astronomer, arrived at the same distribution. His “Central Limit Theorem” showed that even if a distribution is not itself normal, the means of repeated samples from it are approximately normally distributed, and the larger the sample size, the closer the distribution of means is to a normal distribution. Quetelet, an astronomer, mathematician, statistician, and sociologist, was the first to apply the normal distribution to human characteristics such as weight, height, and strength.
The normal distribution, also known as the Gaussian distribution, is a function that represents the distribution of a set of data as a symmetrical, bell-shaped graph, which is why the graph is also known as the “bell curve.” The normal curve is the curve drawn through the midpoints of the tops of the bars in a histogram of normally distributed data, so it shows the shape of that histogram. The graph depends on two parameters: the mean (µ), which determines where the curve is centered, and the standard deviation (σ), which determines the height and width of the curve. The normal distribution is continuous and defined for all real values of x (-∞ < x < ∞), so any single value has a probability of zero, while every interval of real numbers has a probability greater than zero. The probability density function is:
f(x) = (1 / (σ√(2π))) · e^(–(x – μ)^2 / (2σ^2))
where x is a normal random variable, μ is the mean, σ is the standard deviation, π is approximately 3.1416, and e is approximately 2.7183. The value of the density function is greater than zero for every value of x, and the total area under the curve always equals 1. The notation N(µ, σ^2) means normally distributed with mean µ and variance σ^2.
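The density can also be evaluated numerically. The following is a minimal sketch, assuming the usual form of the normal density (a coefficient of 1/(σ√(2π)) multiplied by e raised to –(x – μ)^2/(2σ^2)). The function names `normal_pdf` and `area_under_curve` and the sample values of µ and σ are illustrative choices, not part of the essay.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

def area_under_curve(mu, sigma, half_width=8.0, steps=20_000):
    """Midpoint-rule estimate of the area from mu - 8*sigma to mu + 8*sigma."""
    lo = mu - half_width * sigma
    dx = (2.0 * half_width * sigma) / steps
    total = sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(steps))
    return total * dx

# Illustrative parameters (assumed for this sketch).
print(normal_pdf(0.0))              # peak of the standard normal curve, about 0.3989
print(area_under_curve(1.4, 0.15))  # very close to 1
```

The area is only approximated here, over a range of eight standard deviations on each side of the mean, but the part of the curve outside that range is negligibly small.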
The standard normal distribution is the distribution that occurs when a normal random variable has a mean of zero and a standard deviation of one. The normal random variable of a standard normal distribution is called a standard score (z-score). The z-score represents the number of standard deviations that a data value is away from the mean. To convert from a normal distribution to the standard normal form, we use the z-score formula:
z = (x – μ) / σ
x = the value that is being standardized (normal random variable). μ = the mean of the distribution (mean of x).
σ = the standard deviation of the distribution (standard deviation of x).
The standard deviation (symbol σ) is a measure of how spread out the numbers are. In order to calculate it, we have to find the variance first. The variance is the average of the squared differences from the mean. To calculate the variance, find the mean of the numbers, subtract the mean from each number, square each difference, and then average the squared differences by dividing their sum by N, the number of values. To calculate the standard deviation, take the square root of the variance. The “Population Standard Deviation” is:
σ = √( (1/N) · Σ (x_i – μ)^2 )
To calculate the sample standard deviation, we also have to find the variance first. The only difference is that we divide by N – 1 instead of N (this is called “Bessel’s correction”). After that change, calculate the variance and take the square root to find the sample standard deviation. The mean is now written x̄ (sample mean) instead of μ (population mean), and the answer is s (sample standard deviation) instead of σ (population standard deviation); however, the change of symbols (σ to s and μ to x̄) does not affect the calculation. The “Sample Standard Deviation” is:
s = √( (1/(N – 1)) · Σ (x_i – x̄)^2 )
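The two procedures differ only in the divisor, which a short sketch makes visible. The code below follows the steps described above (mean, squared differences, average, square root). The function names and the data set are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math

def population_std(values):
    """Square root of the average squared difference from the mean (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance)

def sample_std(values):
    """Same steps, but divide by N - 1 (Bessel's correction)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance)

# Hypothetical data: mean 5, squared differences sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))  # 2.0, since 32 / 8 = 4
print(sample_std(data))      # larger, since 32 is divided by 7 instead of 8
```

Because N – 1 is smaller than N, the sample standard deviation is always at least as large as the population standard deviation computed from the same numbers.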
The Empirical Rule states that:
68% of values are within 1 standard deviation of the mean (μ ± σ)
95% of values are within 2 standard deviations of the mean (μ ± 2σ)
99.7% of values are within 3 standard deviations of the mean (μ ± 3σ)
The Empirical Rule indicates what percentage of values lies within a certain range of the mean. It is also known as the 68-95-99.7 Rule. The rule applies only when the data follow a normal distribution, and the percentages are approximations. The Empirical Rule describes a population (and its standard deviation), not a sample, but it can help you decide whether a sample of data came from a normal distribution: if the sample is large enough, check whether its histogram looks bell-shaped and whether roughly 68%, 95%, and 99.7% of the values fall within 1, 2, and 3 standard deviations of the mean.
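The three percentages can be recovered from the standard normal distribution itself. This sketch assumes the standard identity P(|Z| < k) = erf(k/√2) and uses Python's `math.erf`; the helper name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(|Z| < k) for a standard normal Z, computed with the error function."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    # Prints roughly 68.27, 95.45 and 99.73 percent.
    print(k, round(prob_within(k) * 100, 2))
```

The exact values are slightly above 68%, 95%, and 99.7%, which is why the rule is described as an approximation.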
A standard normal distribution table shows the probability associated with a z-score. Table rows show the whole number and tenths place of the z-score, and table columns show the hundredths place. Because the mean is zero and the standard deviation is one, the Z value is the number of standard deviations away from the mean, and the area in the table is the probability of observing a value less than that Z value. To find P(Z > a), the probability that a standard normal random variable Z is greater than a given value a, use: P(Z > a) = 1 – P(Z < a).
To find P(a < Z < b), the probability that a standard normal random variable lies between two values, use: P(a < Z < b) = P(Z < b) – P(Z < a).
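A table lookup can be imitated in code. The following sketch assumes the standard relationship between the standard normal cumulative probability and the error function, P(Z < z) = (1 + erf(z/√2))/2. The names `phi`, `prob_greater`, and `prob_between` are illustrative.

```python
import math

def phi(z):
    """P(Z < z) for a standard normal Z (the value a z-table reports)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_greater(a):
    """P(Z > a) = 1 - P(Z < a)."""
    return 1.0 - phi(a)

def prob_between(a, b):
    """P(a < Z < b) = P(Z < b) - P(Z < a)."""
    return phi(b) - phi(a)

print(round(phi(1.43), 4))                  # 0.9236, matching a z-table entry
print(round(prob_greater(1.43), 4))         # 0.0764
print(round(prob_between(-1.96, 1.96), 4))  # 0.95
```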
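1. In each of 25 votes, the students have a 60% chance of winning. What is the probability that the students will win 19 or more votes?
Here N = 25, p = 0.6, and q = 1 – p = 0.4, so Np = 15 and Npq = 6, and X is approximately N(15, 6).
Because the number of votes won is a whole number, apply a continuity correction and find P(X ≥ 18.5).
Let Z = (X – 15)/√6. When x = 18.5, z = 3.5/√6 ≈ 1.43.
P(X ≥ 18.5) = P(Z ≥ 1.43) = 1 – F(1.43) = 1 – .9236 = .0764, where F is the cumulative probability from the standard normal table.
The students have a little less than an 8% chance of winning 19 or more votes.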
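As a check on the approximation, the same probability can be computed both ways. This is a sketch that uses `statistics.NormalDist` and `math.comb` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 25, 0.6
mu = n * p                          # expected number of wins, 15
sigma = math.sqrt(n * p * (1 - p))  # square root of Npq = 6

# Normal approximation with the continuity correction at 18.5.
approx = 1 - NormalDist(mu, sigma).cdf(18.5)

# Exact binomial probability of 19, 20, ..., 25 wins.
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(19, n + 1))

print(round(approx, 4))  # close to the table-based answer of .0764
print(round(exact, 4))
```

The approximate value differs very slightly from .0764 because the code does not round z to 1.43 before looking up the probability.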
2. The buildings are 600 in, 470 in, 170 in, 430 in and 300 in tall.
Find the mean, variance, and standard deviation:
Mean: (600 + 470 + 170 + 430 + 300)/5 = 1970/5 = 394
Variance: σ^2 = (206^2 + 76^2 + (–224)^2 + 36^2 + (–94)^2)/5 = 108,520/5 = 21,704
Standard Deviation: σ = √21,704 ≈ 147.32 ≈ 147 (to the nearest in)
Find the sample standard deviation: divide by N – 1 = 5 – 1 = 4 instead of N = 5.
Sample variance = 108,520 / 4 = 27,130
Sample Standard Deviation = √27,130 ≈ 164.71 ≈ 165 (to the nearest in)
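The arithmetic for the building heights can be verified with Python's `statistics` module, where `pvariance` and `pstdev` divide by N and `variance` and `stdev` divide by N – 1. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

heights = [600, 470, 170, 430, 300]  # building heights in inches

mean = statistics.mean(heights)                 # 394
population_var = statistics.pvariance(heights)  # 21704 (divides by N = 5)
population_sd = statistics.pstdev(heights)      # about 147.32
sample_var = statistics.variance(heights)       # 27130 (divides by N - 1 = 4)
sample_sd = statistics.stdev(heights)           # about 164.71

print(mean, population_var, round(population_sd, 2))
print(sample_var, round(sample_sd, 2))
```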
3. Find P(X ≤ 4)
4. 95% of trucks weigh between 1.1 ton and 1.7 ton.
The mean is halfway between 1.1 ton and 1.7 ton:
Mean = (1.1 ton + 1.7 ton) / 2 = 1.4 ton
95% of values lie within 2 standard deviations on either side of the mean (a total of 4 standard deviations), so: 1 standard deviation = (1.7 ton – 1.1 ton) / 4 = 0.6 ton / 4 = 0.15 ton
One of their trucks weighs 1.85 ton.
You can see on the bell curve that 1.85 ton is 3 standard deviations from the mean of 1.4 ton, so the truck’s weight has a “z-score” of 3.0.
How far is 1.85 ton from the mean?
It is 1.85 ton – 1.4 ton = 0.45 ton from the mean.
How many standard deviations is that? The standard deviation is 0.15 ton, so: 0.45 ton / 0.15 ton = 3 standard deviations
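The truck calculation follows the z-score definition directly: subtract the mean and divide by the standard deviation. A minimal sketch, assuming the 95% range of 1.1 to 1.7 ton spans four standard deviations as in the problem; the function name `z_score` is illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

low, high = 1.1, 1.7          # range covering 95% of truck weights, in tons
mean = (low + high) / 2       # halfway point, 1.4 ton
std_dev = (high - low) / 4    # 95% spans 4 standard deviations, so 0.15 ton

z = z_score(1.85, mean, std_dev)
print(round(z, 2))  # 3.0
```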