How to Find Percentile With Mean and Standard Deviation
Z-score Calculations Introduction
One of the challenges in preparing for the AP® Statistics exam is that the concepts build upon one another. Some statistical tests involve several steps, combining earlier and simpler concepts into more complex ones. As a result, failing to understand any one of the earlier ideas in the course can mean big trouble when it comes time for the exam. Z-score calculations are a perfect example of this. As one of the core skills in AP® Statistics, z-score calculations require you to combine much of what is covered in the first half of the course. Use this AP® Statistics review to be sure you understand everything you need to beat z-score questions on the exam. We will tell you all of the concepts related to z-scores, show you how to perform z-score calculations using sample questions, and explain percentiles in a normal distribution.
What is a Z-score?
A z-score shows you the distance between an observed score and the mean in units of standard deviations. These terms may sound a bit complicated right now if they are new to you. However, the calculation is very simple once you understand the concepts that lead up to it.
Ingredients for Z-score Calculations
Performing statistical tests, like z-score calculations, is a bit like cooking. All you need to do is follow the recipe, but first, you need to have all the ingredients! Below are all the concepts you should understand in order to fully grasp z-score calculations. We'll review these ingredients first to find out where your gaps in knowledge are. Then, we'll go over the major concepts one at a time.
- Frequency Distributions
- Density Curves & Probability
- Normal Distribution
- Mean and Standard Deviation
- P-values
Frequency Distributions
A frequency distribution is a table showing the number of observations of each outcome along a given dimension. It can be represented graphically in several ways, including histograms and line charts. For example, we may measure the number of students in our class of 50 students who earned each possible letter grade, which could look like this:
Letter grade | A | B | C | D | F |
Number grade | 90-100 | 80-89 | 70-79 | 60-69 | <60 |
Number of students | 5 | 10 | 20 | 10 | 5 |
Plotted as a histogram, it would look like this:
Density Curves & Probability
A density curve looks similar to a frequency distribution, but it represents the probability of observing each of the outcomes rather than the raw counts. Probabilities are numbers between 0 and 1. To understand this, consider the probability of observing the grade A in the example above. There are 50 students, and 5 of them earned A's, so the probability of observing an A in this class is \dfrac{5}{50} = \dfrac{1}{10}=0.1. Below, we've listed the probability of each outcome to illustrate that the total probability adds up to 1.
Letter grade | A | B | C | D | F | Total |
Number grade | 90-100 | 80-89 | 70-79 | 60-69 | <60 | |
Probability | 0.1 | 0.2 | 0.4 | 0.2 | 0.1 | 1.0 |
Plotted as a curve, it looks something like this:
Since the total probability of all possible outcomes is 1, the area under the curve is also 1. With z-scores, we will use the concept of area under the curve to determine the probability of various outcomes.
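As a minimal sketch of the conversion from counts to probabilities, the snippet below divides each count in the grade table by the class size. The variable names are our own and are only illustrative.

```python
# Grade counts taken from the frequency table above.
counts = {"A": 5, "B": 10, "C": 20, "D": 10, "F": 5}

total = sum(counts.values())  # 50 students in the class

# Convert each count into a probability (a relative frequency).
probabilities = {grade: n / total for grade, n in counts.items()}

for grade, p in probabilities.items():
    print(f"{grade}: {p:.1f}")

# The probabilities of all possible outcomes add up to 1.
print(f"Total: {sum(probabilities.values()):.1f}")
```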
Standard Normal Distribution
A normal distribution is one that is symmetrical and bell-shaped, like the examples we've seen here. The standard normal distribution is a special type, having a mean of 0 and a standard deviation of 1, like the one below. In calculating z-scores, we convert a normal distribution into the standard normal distribution—this process is called "standardizing." Since distributions come in various units of measurement, we need a common unit in order to compare them. The standard unit used to compare different distributions is the standard deviation.
Mean and Standard Deviation
The mean and the standard deviation are the two main ingredients that go into calculating the z-score. The mean is a measure of the center of a distribution (see our other blog post for a review of means). The standard deviation is a measure of the spread of a distribution: roughly, it shows the typical distance of an observation from the mean.
In statistics, we represent the mean and standard deviation using letters from the Greek alphabet.
The symbol for mean is \mu
The symbol for standard deviation is \sigma
The standard deviation is important for z-scores because it tells us whether a score is close or far away from the mean. Imagine a class takes a test, and the mean score is 50%, but student S scored 75%. Is this a good or a bad score? Well, if every other student scored between 45-55%, the distribution has a small standard deviation, and suddenly S's score seems a lot more impressive! On the other hand, if the distribution of scores is more spread out (large standard deviation) and falls between 0-100%, S is no longer happy about his score.
As we discussed above, the standard normal distribution has a mean of 0 and a standard deviation of 1. Z-scores are represented in units of standard deviations. A z-score of 1 means that an observation is 1 standard deviation above the mean. So, in the example above, if the standard deviation is 25, S's score of 75 is 1 standard deviation above the mean of 50, so he has a z-score of 1. If the standard deviation is 5, S's score is now 5 standard deviations above the mean, so he would have a z-score of 5.
P-values
The "p" stands for "probability." For our purposes, a p-value is the probability of observing a z-score at least as extreme as a given one, which corresponds to the area under the curve beyond that z-score. Just as the probability of observing an A was lower than the probability of observing a C in our original example, the probability of observing a z-score of 3 or more is lower than the probability of observing a z-score of 1 or more. The larger the z-score (in absolute value), the smaller this probability! This stems from the fact that the further away you get from the mean, the more unlikely the scores become.
Z-scores can be converted into p-values (and vice versa) by using a simple table that is found in the back of any statistics textbook. If you're not familiar with using a z-table, see this short video for a review.
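If you have Python handy, a z-table lookup can be approximated in software. The sketch below assumes Python 3.8 or later and uses `NormalDist` from the standard library's `statistics` module. It is one possible way to check a table value, not a replacement for knowing how to read the table on the exam.

```python
from statistics import NormalDist

# The standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# z-score -> area: the area under the curve below z = 1.
area_below = standard_normal.cdf(1.0)
area_above = 1 - area_below

# area -> z-score: the z-score that has 90% of the area below it.
z_for_90th = standard_normal.inv_cdf(0.90)

print(round(area_below, 4))   # 0.8413
print(round(area_above, 4))   # 0.1587
print(round(z_for_90th, 2))   # 1.28
```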
Percentiles in a Normal Distribution – 68-95-99.7 Rule
Instead of always using a z-table, there is also a convenient rule for estimating the probability of a given outcome in a normal distribution. It is called the "68-95-99.7 Rule." This rule means that about 68% of the observations fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations, and about 99.7% fall within 3 standard deviations. That means the probability of observing an outcome more than 3 standard deviations from the mean, in either direction, is very low: about 0.3%.
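The rule can be checked numerically. The sketch below assumes Python 3.8 or later and uses the standard library's `NormalDist`. The helper name `area_within` is our own.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0, standard deviation 1


def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas are close to 0.68, 0.95, and 0.997, which is where the rule gets its name.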
Performing Z-score Calculations
Now that you have all the ingredients, you're ready for the recipe! The formula for a z-score looks like this:
z=\dfrac{x-\mu}{\sigma}
x represents an observed score, also known as a "raw score."
As previously mentioned, \mu represents the mean and \sigma represents the standard deviation.
To calculate a z-score, we simply subtract the mean from a raw score and then divide by the standard deviation. (On exam questions, the mean and standard deviation may be provided, or you may need to calculate them, so make sure you know how to do that!) Then, we take our z-score and check the z-table to find the p-value of that score.
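The recipe can be sketched in a few lines of Python. The function names `z_score` and `percentile` and the sample numbers (a raw score of 85, a mean of 80, a standard deviation of 5) are illustrative assumptions. The table lookup is approximated with the standard library's `NormalDist` (Python 3.8 or later), and the result only makes sense if the scores are roughly normally distributed.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Subtract the mean from the raw score, then divide by the standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def percentile(x, mean, sd):
    """Percent of a normal distribution that falls below the raw score x."""
    return 100 * NormalDist().cdf(z_score(x, mean, sd))


z = z_score(85, mean=80, sd=5)
print(z)                                      # 1.0
print(round(percentile(85, mean=80, sd=5), 2))  # 84.13
```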
There are several ways you may be asked to use z-scores on the AP® Statistics exam. You may have to compare scores in two distributions, find the probability of a certain observation, or find the probability of an interval between two observations. You can also go in reverse, using p-values to find z-scores and then raw scores. Let's try some examples!
Example 1
Tom is a sprinter, and Alex is a long-jumper. They both compete at the track meet this weekend, along with 4 other athletes in each of their respective events. Tom thinks he is a better athlete than Alex is. Is there evidence for his claim? (Note: we are assuming that sprint times and long-jump distances are normally distributed. Otherwise, we can't use z-score calculations!)
The sprint times in seconds are as follows:
Athlete | Tom | Athlete 2 | Athlete 3 | Athlete 4 | Athlete 5 |
Time (sec) | 15 | 17 | 14 | 19 | 16 |
Calculating the mean and standard deviation (here the sample standard deviation, which divides by n-1), we find:
\mu=16.2
\sigma=1.92
The long-jump distances in feet are as follows:
Athlete | Alex | Athlete 2 | Athlete 3 | Athlete 4 | Athlete 5 |
Distance (ft) | 23 | 24 | 20 | 21 | 19 |
Calculating the mean and standard deviation the same way, we find:
\mu=21.4
\sigma=2.07
Tom's z-score is: z=\dfrac{15-16.2}{1.92}=-0.625
Alex's z-score is: z=\dfrac{23-21.4}{2.07}=0.77
Keep in mind that Tom is racing—he wants to have a smaller score than his competitors, whereas Alex is going for greater distance. Tom is 0.625 standard deviations below the mean, and Alex is 0.77 standard deviations above the mean. That means that Alex is actually the better athlete relative to his competition!
We can also use our z-table to find the probability of earning each score. Based on the z-scores we calculated above, the p-value for Tom (the probability of an athlete running as fast or faster than he did) is about .27. Similarly, the p-value for Alex (the probability of an athlete jumping as far or farther than he did) is about .22.
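Example 1 can be reproduced with a short script. This is a sketch that assumes Python 3.8 or later. It uses `statistics.stdev`, the sample standard deviation (dividing by n-1), because that choice reproduces the 1.92 and 2.07 shown above. The helper name `z_score` is our own. Since the script does not round the z-scores before the lookup, its probabilities can differ slightly in the second decimal place from values read off a printed table.

```python
from statistics import NormalDist, mean, stdev

sprint_times = [15, 17, 14, 19, 16]    # seconds; Tom ran 15
jump_distances = [23, 24, 20, 21, 19]  # feet; Alex jumped 23


def z_score(x, values):
    """z-score of x relative to the mean and sample standard deviation of values."""
    return (x - mean(values)) / stdev(values)


tom_z = z_score(15, sprint_times)
alex_z = z_score(23, jump_distances)

# Tom wants a low time, so we want the area at or below his z-score.
p_tom = NormalDist().cdf(tom_z)
# Alex wants a long jump, so we want the area at or above his z-score.
p_alex = 1 - NormalDist().cdf(alex_z)

print(f"Tom:  z = {tom_z:.3f}, probability = {p_tom:.2f}")
print(f"Alex: z = {alex_z:.3f}, probability = {p_alex:.2f}")
```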
Example 2
What if we wish to find the probability of scoring within a certain range? For example, what is the probability of a student scoring between 85-90% on a test if the mean is 80% and the standard deviation is 5%?
First, we find the z-scores for both sides of our range.
z=\dfrac{85-80}{5}=1
z=\dfrac{90-80}{5}=2
We are looking for the probability of the shaded area under the curve, pictured below.
Z-tables can vary in the information they display, but they generally show the area above a given score, the area below a given score, and sometimes the area between the mean and the score. This information gives us several methods of solving the problem, but all of them involve simple subtraction. For example, we can take the area below our higher score, which the z-table tells us is 0.9772, and subtract the area below our lower score, which the z-table tells us is 0.8413. That leaves us with just the area between the two values we're interested in: 0.9772 - 0.8413 = 0.1359. That means there is a 13.59% probability of a student scoring between 85-90% on this exam.
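The same subtraction can be checked in Python. This sketch assumes Python 3.8 or later and lets `NormalDist` standardize for us by building the distribution directly from the mean of 80 and the standard deviation of 5.

```python
from statistics import NormalDist

# Test scores modeled as normal with mean 80 and standard deviation 5.
scores = NormalDist(mu=80, sigma=5)

# Area below the higher score minus area below the lower score.
p_between = scores.cdf(90) - scores.cdf(85)

print(round(scores.cdf(90), 4))  # 0.9772
print(round(scores.cdf(85), 4))  # 0.8413
print(round(p_between, 4))       # 0.1359
```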
Congratulations! You now have a handle on every concept you need to know when it comes to z-score calculations on the AP® Statistics exam!
Source: https://www.albert.io/blog/z-score-calculations-percentiles/