Topic: More on Confidence Intervals
From SharedExperienceProject
Topic Highlights
(What you will learn)
- More about the concept of confidence intervals
- The basics of the so-called t-distribution
- How to apply confidence intervals to cases where both the population mean, $\mu$, and the population standard deviation, $\sigma$, are unknown
- How to select the necessary sample size
Introduction and Motivation
(Why learn it)
In the topic Confidence Intervals, we were introduced to the basics of a confidence interval. We looked at what the term confidence interval means, at how to compute one for a given confidence level, e.g. 95%, and at how a confidence interval can be used to quantify the precision of an estimate, $\bar{x}$, of a population's mean, $\mu$.
In this topic, we take a deeper look at confidence intervals and we look at how they can be used when the population standard deviation is also estimated.
Learning Activities
(How the levels of understanding will be gained)
| Type | Name | Direction |
| Reading | | Self-directed |
| In-class worksheet | | Self-directed |
| Lecture and discussion | | Instructor-directed |
| Personal activities | | Self-directed |
Learning Objectives
(Levels of understanding to be gained)
| Level of Understanding | Objective(s) |
| Very best | |
| Highly satisfactory | |
| Satisfactory | |
| Maybe just enough to pass | |
Topic Notes: An Overview of Confidence Intervals
These notes are intended to facilitate a review of confidence intervals and to introduce the t-distribution.
The goals are: 1) to review the basic concepts studied in the introductory topic Confidence Intervals, and 2) to use this to segue into the confidence interval in cases where $\sigma$ is not known.
Confidence Intervals for a Normal Population
You should remember from The Basics of Confidence Intervals that if a random variable is normally distributed with known mean, $\mu$, and standard deviation, $\sigma$, then a confidence interval is given by the following equation:
$$CI = [\mu - z\sigma,\ \mu + z\sigma]$$
where z can be found by looking in the standard normal table for a given confidence level, e.g. for 90%, we would look up the right-tail area 0.50 - 0.45 = 0.05, to get z = 1.645.
Here the margin of error is:
$$E = z\sigma$$
This is for the case of a population with known normal parameters, as shown conceptually below:
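The z lookup described above can also be done numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` is used in place of the printed standard normal table; the helper name `z_for_confidence` is hypothetical and is not part of the course material.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Return z such that the central area under the standard normal is `level`."""
    right_tail = (1.0 - level) / 2.0          # e.g. 0.05 for a 90% level
    return NormalDist().inv_cdf(1.0 - right_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_for_confidence(level):.3f}")
```

For a 90% confidence level this gives a z value that rounds to the 1.645 quoted above.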
Confidence Intervals for a Mean - Population Standard Deviation Known
You should remember from Confidence Intervals for the Mean of a Normal Population with Known Standard Deviation that if the population average $\mu$ is unknown, then we can approximate it using the sample mean, $\bar{x}$, taken from a sample of size n. This is shown conceptually below:
If we do this, the natural question is: how precisely does $\bar{x}$ represent $\mu$?
We answer this with a confidence interval for the mean for some confidence level of interest:
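$$CI = [\bar{x} - z\sigma_{\bar{x}},\ \bar{x} + z\sigma_{\bar{x}}]$$
which is the same as:
$$CI = [\bar{x} - z(\sigma/\sqrt{n}),\ \bar{x} + z(\sigma/\sqrt{n})]$$
Here, the margin of error is:
$$E = z(\sigma/\sqrt{n})$$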
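As an illustration of the formula above, here is a small sketch of the interval for the mean when $\sigma$ is known. The function name `mean_ci_known_sigma` and the input numbers are hypothetical and chosen only for illustration; they do not come from the course material.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(x_bar: float, sigma: float, n: int, level: float):
    """Return (lower, upper, margin) for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    margin = z * sigma / sqrt(n)              # E = z * (sigma / sqrt(n))
    return x_bar - margin, x_bar + margin, margin


# Hypothetical inputs: sample mean 50, known sigma 10, sample size 25, 95% level.
low, high, margin = mean_ci_known_sigma(x_bar=50.0, sigma=10.0, n=25, level=0.95)
print(f"E = {margin:.2f}, CI = [{low:.2f}, {high:.2f}]")
```

Note that quadrupling the sample size only halves the margin of error, because n enters through its square root.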
Confidence Intervals for a Mean - Population Standard Deviation Unknown
What do you do if both $\mu$ and $\sigma$ are unknown as shown below? How do you represent the precision with which $\bar{x}$ represents $\mu$ in this case?
Many students reply to this by suggesting that we use the sample standard deviation, s, and just substitute it into the above CI equation in place of $\sigma$.
This almost works, but not quite.
As long as the original population is normally distributed, the actual equation for the confidence interval is as follows:
$$CI = [\bar{x} - t(s/\sqrt{n}),\ \bar{x} + t(s/\sqrt{n})]$$
As you can see, the difference is the "t" instead of "z". This is because the random variable $(\bar{x} - \mu)/(s/\sqrt{n})$ is no longer normally distributed (z). Because we are using the sample standard deviation, s, it is now distributed according to the so-called Student's t distribution. For short, we refer to this as just the t distribution.
We'll take a closer look at this distribution in the next section.
In this case, the margin of error is:
$$E = t(s/\sqrt{n})$$
Topic Notes: The t Distribution
These topic notes are intended to facilitate a discussion of Section 7.2 of Bowerman et al. and Section 7.4 of Kvanli et al.
What is the t distribution?
The t distribution is symmetric about zero and looks a lot like the normal distribution.
In fact, the main difference is that the shape of the t distribution depends on the size of the sample through a parameter called degrees of freedom:
degrees of freedom = df = n - 1
where n is the sample size.
The difference between the t and normal distributions is shown below for 20 degrees of freedom, i.e. for a sample size of 21:
Notice that the t distribution has a lower peak and is a little bit wider at the tails.
As the sample size increases, the t-distribution tends toward the standard normal distribution. In fact, for n > 30, there is little difference between the two, as shown below:
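The comparison between the two curves can also be checked numerically. The sketch below evaluates the standard formula for the t density next to the standard normal density; the helper name `t_pdf` is hypothetical, and the evaluation points were chosen only for illustration.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Lower at the peak (x = 0), higher in the tails (x = 3) for df = 20.
for x in (0.0, 1.0, 2.0, 3.0):
    print(f"x = {x:.0f}: normal = {normal.pdf(x):.4f}, t(df=20) = {t_pdf(x, 20):.4f}")

# The gap at the peak shrinks as the degrees of freedom grow.
for df in (5, 20, 30, 120):
    print(f"df = {df:>3}: peak gap = {normal.pdf(0.0) - t_pdf(0.0, df):.4f}")
```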
When to use the t distribution
If the population standard deviation, $\sigma$, is unknown:
- then you should approximate it by the sample standard deviation, s
- and you should use the t distribution
Note however that if n > 30:
- then you should use the normal distribution
Using the t tables
As there was for the standard normal distribution, there is a table for the t distribution. In Kvanli et al., the t table is given as Table A.5.
The t table is very simple to use and allows you to look up the t value for:
- the known degrees of freedom corresponding to the problem
- a given area, or probability, as shown below
Notice that some t tables label the shaded area $\alpha/2$ and label the values of t $t_{\alpha/2}$. Don't let this throw you: the right-tail area is the same whatever the label. Have a look at this in the next example.
Example 1
What is the t-value for n=12 and $\alpha/2$ = 0.05?
| Solution |
|---|
| The degrees of freedom are df = 12 - 1 = 11. Using this and Table A.5, you should get t = 1.796. |
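If no printed table is at hand, the lookup in Example 1 can be approximated numerically. The sketch below is one possible approach and not the method used by the textbooks: it integrates the t density with Simpson's rule and then searches for the t value by bisection. The function names `t_cdf` and `t_value`, the step count, and the search bounds are all assumptions made for illustration.

```python
from math import gamma, pi, sqrt


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T <= x) for x >= 0 using Simpson's rule on [0, x]."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(u: float) -> float:
        return coeff * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # By symmetry, half of the probability lies below zero.
    return 0.5 + total * h / 3


def t_value(right_tail: float, df: int) -> float:
    """Find t with P(T > t) = right_tail by bisection (needs right_tail < 0.5)."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > right_tail:
            lo = mid   # tail area still too large, so t must be bigger
        else:
            hi = mid
    return (lo + hi) / 2


# Example 1: n = 12, right-tail area 0.05
t = t_value(0.05, df=12 - 1)
print(round(t, 3))
```

The result should agree with the table value of 1.796 to three decimal places.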
Example 2
If n=12 again, what is the t-value corresponding to a confidence level of 80%?
Example 3
What are the right-tail areas for each of the following confidence levels?
a) 80%
b) 90%
c) 95%
d) 98%
e) 99%
| Solution |
|---|
| a) 0.100 b) 0.050 c) 0.025 d) 0.010 e) 0.005 |
Summary
The following table summarizes some of our findings here:
| Confidence level | Associated right-side area ($\alpha/2$) |
| 80% | 0.100 |
| 90% | 0.050 |
| 95% | 0.025 |
| 98% | 0.010 |
| 99% | 0.005 |
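The entries in this table follow from splitting the area outside the confidence level equally between the two tails. A minimal sketch of that arithmetic, with the hypothetical helper name `right_tail_area`:

```python
def right_tail_area(confidence_level: float) -> float:
    """Area in the right tail: half of what lies outside the confidence level."""
    return (1.0 - confidence_level) / 2.0


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: {right_tail_area(level):.3f}")
```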
Tying it all Together
Example 4
What is the 95% confidence interval for the mean in a situation where $\mu$ and $\sigma$ are both unknown, and where n = 21, s = 12, $\bar{x}$ = 96?
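One way to check your answer to Example 4 is sketched below. It assumes the value t = 2.086, read from a standard t table for df = 20 and a right-tail area of 0.025; that value and the variable names are not given in the notes above.

```python
from math import sqrt

n, s, x_bar = 21, 12.0, 96.0

# Assumption: t table value for df = n - 1 = 20 and right-tail area 0.025.
t = 2.086

margin = t * s / sqrt(n)                 # E = t * (s / sqrt(n))
ci = (x_bar - margin, x_bar + margin)
print(f"E = {margin:.4f}, CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Under that assumption the interval comes out at roughly [90.54, 101.46].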
Topic Notes: Selecting the Necessary Sample Size
These notes are intended to facilitate a discussion of Section 7.3 of Bowerman et al. and Section 7.5 of Kvanli et al.
For known population standard deviation
Most of the problems you will encounter will be for the case where $\sigma$ is known.
In this case, the required sample size, n, is given by the following equation:
$$n = (z\sigma/E)^2$$
where z is the z-score value corresponding to a given level of confidence (recall the summary of these), $\sigma$ is the population standard deviation, and E is some desired margin of error.
For unknown population standard deviation
If $\sigma$ is not known, then you can replace it with s as long as the resulting sample size is greater than 30:
$$n = (zs/E)^2$$
Example 5
For a population that is normally distributed with $\sigma$ = 12, what sample size is required to achieve a margin of error, E=3, if the desired confidence level is 95%?
| Solution |
|---|
| For a confidence level of 95%, you should recognize that z = 1.96. Then, you just plug the values into the first of the above equations: $n = (1.96 \times 12/3)^2 = 61.47$, which is rounded up to 62. |
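The sample-size calculation can be sketched in a few lines. The function name `required_sample_size` is hypothetical, and the sketch takes z from Python's `statistics.NormalDist` rather than from the rounded table value, which is an assumption; for Example 5 the difference does not change the result.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma: float, margin: float, level: float) -> int:
    """Smallest n for which z * sigma / sqrt(n) does not exceed `margin`."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    # Always round up: rounding down would give a margin larger than requested.
    return ceil((z * sigma / margin) ** 2)


# Example 5: sigma = 12, E = 3, 95% confidence
n = required_sample_size(sigma=12.0, margin=3.0, level=0.95)
print(n)
```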
Practice Problems
Practice Problem 1 (OE)
Students in a first-year psychology class are required to maintain an average of 55% to pass. To determine whether this average is being maintained, the instructor took a random sample of 15 students and found the average to be 52% and the standard deviation to be 12%.
a) What is the margin of error for the true average grade for a confidence level of 90%?
b) What is the 90% confidence interval for the true average grade in the above case?
| Solution |
|---|
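| The confidence interval for the mean when $\sigma$ is unknown is $CI = [\bar{x} - t(s/\sqrt{n}),\ \bar{x} + t(s/\sqrt{n})]$. With df = 15 - 1 = 14 and a right-tail area of 0.05, the t table gives t = 1.761, so the margin of error is $E = 1.761(12/\sqrt{15}) = 5.4563$ and CI = [ 52 - 5.4563, 52 + 5.4563 ] = [ 46.5437, 57.4563 ] |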
Practice Problem 2 (OE)
If the same instructor wants to redo his survey and ensure a margin of error no larger than 3, what sample size would you recommend he uses if he wants to maintain the confidence level of 90%?
| Solution |
|---|
| Because $\sigma$ is unknown, we use the sample standard deviation in its place: $n = (zs/E)^2$. For a confidence level of 90%, we know that z = 1.645 (see the summary here if you need reminding of why this is so). We were told in the problem that s = 12. We can then substitute into the above for a desired E = 3: $n = (1.645 \times 12/3)^2 = 43.2964$, which rounds up to 44. This tells you that he would need to sample 44 students in order to ensure a margin of error no larger than 3% for the confidence level of 90%. You should also confirm that the resulting sample size is greater than 30, which it is here. |









