Topic:More on Confidence Intervals

From SharedExperienceProject

Topic Highlights

(What you will learn)

  • More about the concept of confidence intervals
  • The basics of the so-called t-distribution
  • How to apply confidence intervals to cases where both the population mean, μ, and the population standard deviation, σ, are unknown
  • How to select the necessary sample size

Introduction and Motivation

(Why learn it)

In the topic Confidence Intervals, we were introduced to the basics of a confidence interval. We looked at what the term confidence interval means, at how to compute one for a given confidence level, e.g. 95%, and at how a confidence interval can be used to quantify the precision of an estimate, x̄, of a population's mean, μ.

In this topic, we take a deeper look at confidence intervals and we look at how they can be used when the population standard deviation is also estimated.

Learning Activities

(How the levels of understanding will be gained)

Learning activities for this topic
Type                      Name    Direction
Reading                           Self-directed
In-class worksheet                Self-directed
Lecture and discussion            Instructor-directed
Personal activities               Self-directed

Learning Objectives

(Levels of understanding to be gained)

Learning objectives for this topic
Level of Understanding Objective(s)
Very best
Highly satisfactory
Satisfactory
Maybe just enough to pass

Topic Notes: An Overview of Confidence Intervals

These notes are intended to facilitate a review of confidence intervals and to introduce the t-distribution.

The goals are: 1) to review the basic concepts studied in the introductory topic Confidence Intervals, and 2) to use this to segue into the confidence interval in cases where σ is not known.

Confidence Intervals for a Normal Population

You should remember from The Basics of Confidence Intervals that if a random variable is normally distributed with known mean, μ, and known standard deviation, σ, then a confidence interval is given by the following equation:

 CI = [ μ - zσ, μ + zσ ]

where z can be found by looking in the standard normal table for a given confidence level. For example, for 90% the area in each tail is 0.50 - 0.45 = 0.05, which gives z = 1.645.
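The table lookup for z can also be checked numerically. The following is a minimal sketch using Python's standard library; the function name z_for_confidence is only an illustrative choice and does not come from the course material.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Return the z value that leaves (1 - confidence) / 2 in each tail."""
    tail = (1.0 - confidence) / 2.0          # e.g. 0.05 for a 90% level
    return NormalDist().inv_cdf(1.0 - tail)  # standard normal quantile

print(round(z_for_confidence(0.90), 3))  # prints 1.645
```

The same call with 0.95 gives the familiar value of about 1.96.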

Here the margin of error is:

E = zσ

This is for the case of a population with known normal parameters, as shown conceptually below:

Image:Confidence_Intervals_6.png

Confidence Intervals for a Mean - Population Standard Deviation Known

You should remember from Confidence Intervals for the Mean of a Normal Population with Known Standard Deviation that if the population average, μ, is unknown, then we can approximate it using the sample mean, x̄, taken from a sample of size n. This is shown conceptually below:

Image:Confidence_Intervals_7.png

If we do this, the natural question is: how precisely does x̄ represent μ?

We answer this with a confidence interval for the mean for some confidence level of interest:

 CI = [ x̄ - z σ_x̄, x̄ + z σ_x̄ ]

which is the same as:

 CI = [ x̄ - z (σ/√n), x̄ + z (σ/√n) ]

Here, the margin of error is:

E = z (σ/√n)
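As a rough illustration of the interval above, the sketch below computes x̄ ± z(σ/√n) in Python. The function name z_interval and the sample numbers (x̄ = 50, σ = 10, n = 25) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # E = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: xbar = 50, sigma = 10, n = 25, 95% confidence
low, high = z_interval(xbar=50.0, sigma=10.0, n=25, confidence=0.95)
print(f"[{low:.2f}, {high:.2f}]")  # prints [46.08, 53.92]
```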

Confidence Intervals for a Mean - Population Standard Deviation Unknown

What do you do if both μ and σ are unknown as shown below? How do you represent the precision with which x̄ represents μ in this case?

Image:Confidence_Intervals_8.png

Many students reply to this by suggesting that we use the sample standard deviation, s, and just substitute it into the above CI equation in place of σ.

This almost works, but not quite.

As long as the original population is normally distributed, the actual equation for the confidence interval is as follows:

 CI = [ x̄ - t (s/√n), x̄ + t (s/√n) ]

As you can see, the difference is the "t" instead of "z". This is because, once σ is replaced by the sample standard deviation, s, the standardized sample mean, (X̄ - μ)/(s/√n), no longer follows the standard normal distribution (z). Instead, it follows the so-called Student's t distribution with n - 1 degrees of freedom. For short, we refer to this as just the t distribution.

We'll take a closer look at this distribution in the next section.

In this case, the margin of error is:

E = t (s/√n)
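The t-based interval can be sketched in the same way. In the minimal example below, the t value is assumed to have been read from a t table and is passed in by hand. The function name t_interval and the sample numbers are hypothetical. The value t = 2.131 is the standard tabulated value for 15 degrees of freedom and a right-tail area of 0.025.

```python
import math

def t_interval(xbar, s, n, t):
    """Interval for the mean when sigma is unknown.

    t must be looked up for df = n - 1 and the chosen right-tail area.
    """
    margin = t * s / math.sqrt(n)  # E = t * s / sqrt(n)
    return xbar - margin, xbar + margin, margin

# Hypothetical sample: n = 16 (df = 15), s = 8, xbar = 50, 95% confidence
low, high, margin = t_interval(xbar=50.0, s=8.0, n=16, t=2.131)
print(f"E = {margin:.3f}, CI = [{low:.3f}, {high:.3f}]")
# prints E = 4.262, CI = [45.738, 54.262]
```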

Topic Notes: The t Distribution

These topic notes are intended to facilitate a discussion of Section 7.2 of Bowerman et al. and Section 7.4 of Kvanli et al.

What is the t distribution?

The t distribution is symmetric about zero and looks a lot like the normal distribution.

In fact, the main difference is that the shape of the t distribution depends on the size of the sample through a parameter called degrees of freedom:

degrees of freedom = df = n - 1

where n is the sample size.

The difference between the t and normal distributions is shown below for 20 degrees of freedom, i.e. for a sample size of 21:

Image:Confidence_Intervals_9.png

Notice that the t distribution has a lower peak and slightly heavier (wider) tails than the normal distribution.

As the sample size increases, the t-distribution tends toward the standard normal distribution. In fact, for n > 30, there is little difference between the two, as shown below:

Image:Confidence_Intervals_10.png
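The convergence toward the standard normal distribution can be checked numerically. The sketch below is only one possible approach: it integrates the t density with Simpson's rule and finds the t value by bisection, so it needs no packages beyond the standard library. The function names, the step count, and the search range of 0 to 50 are assumptions made for this illustration. The search range is adequate for df of 2 or more and right-tail areas of 0.005 or more.

```python
import math
from statistics import NormalDist

def t_right_tail(t, df, steps=2000):
    """Right-tail area P(T > t) for t >= 0, via Simpson's rule on [0, t]."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3  # half the area lies to the right of zero

def t_value(tail_area, df):
    """Find t whose right-tail area equals tail_area, by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > tail_area:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Compare t values for a 0.025 right tail with the corresponding z value
for df in (5, 20, 30, 120):
    print(df, round(t_value(0.025, df), 3))
print("z", round(NormalDist().inv_cdf(0.975), 3))
```

The printed t values shrink toward the z value as the degrees of freedom grow.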

When to use the t distribution

If σ is unknown:

  • then you should approximate it by the sample standard deviation, s
  • and you should use the t distribution

If σ is known:

  • then you should use the normal distribution

Note, however, that if n > 30:

  • the normal distribution can be used to approximate the t distribution
  • even when σ is unknown

Using the t tables

As with the standard normal distribution, there is a table for the t distribution. In Kvanli et al., the t table is given as Table A.5.

The t table is very simple to use and allows you to look up the t value for:

  • the known degrees of freedom corresponding to the problem
  • a given area, or probability, as shown below

Image:Confidence_Intervals_11.png

Notice that some t tables label the shaded area α and label the values t_α. Don't let this throw you: the right-tail area is the same whatever the label. Have a look at this in the next example.
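The table lookup described above amounts to picking a row by degrees of freedom and a column by right-tail area. The sketch below mimics this with a three-row excerpt. The numbers are the values commonly printed in standard t tables, but the excerpt is not copied from Table A.5, so check it against your own textbook. The names T_TABLE and lookup_t are hypothetical.

```python
# Excerpt of a standard t table (assumed values, not copied from Table A.5).
# Rows: degrees of freedom. Columns: right-tail area.
T_TABLE = {
    10: {0.100: 1.372, 0.050: 1.812, 0.025: 2.228, 0.010: 2.764, 0.005: 3.169},
    15: {0.100: 1.341, 0.050: 1.753, 0.025: 2.131, 0.010: 2.602, 0.005: 2.947},
    25: {0.100: 1.316, 0.050: 1.708, 0.025: 2.060, 0.010: 2.485, 0.005: 2.787},
}

def lookup_t(n, right_tail_area):
    """Return the tabulated t value for sample size n and a right-tail area."""
    df = n - 1  # degrees of freedom
    return T_TABLE[df][right_tail_area]

print(lookup_t(n=16, right_tail_area=0.025))  # prints 2.131
```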

Example 1

What is the t-value for n = 12 and α/2 = 0.05?


Example 2

If n=12 again, what is the t-value corresponding to a confidence level of 80%?


Example 3

What are the right-tail areas for each of the following confidence levels?

a) 80%

b) 90%

c) 95%

d) 98%

e) 99%


Summary

The following table summarizes some of our findings here:


Confidence level    Associated right-side area (α/2) for the t table
80%                 0.100
90%                 0.050
95%                 0.025
98%                 0.010
99%                 0.005
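The right-side areas in this summary follow from α/2 = (1 - confidence level)/2. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def right_tail_area(confidence):
    """Right-tail area (alpha / 2) outside a two-sided confidence interval."""
    alpha = 1.0 - confidence
    return alpha / 2.0

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  {right_tail_area(level):.3f}")
```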


Tying it all Together

Example 4

What is the 95% confidence interval for the mean in a situation where μ and σ are both unknown, and where n = 21, s = 12, x̄ = 96?


Topic Notes: Selecting the Necessary Sample Size

These notes are intended to facilitate a discussion of Section 7.3 of Bowerman et al. and Section 7.5 of Kvanli et al.

For known population standard deviation

Most of the problems you will encounter will be for the case where σ is known.

In this case, the required sample size, n, is given by the following equation:

n = (zσ / E)²

where z is the z-score value corresponding to a given level of confidence (recall the summary of these), σ is the population standard deviation, and E is some desired margin of error.
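The sample size calculation can be sketched as follows, assuming the formula n = (zσ/E)², which follows from solving E = z(σ/√n) for n. Rounding up to the next whole number is a standard convention that is assumed here rather than taken from the notes. The function name and the numbers (σ = 15, E = 2) are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin, for known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)  # round up to a whole sample

# Hypothetical case: sigma = 15, desired margin of error E = 2, 95% confidence
print(required_sample_size(sigma=15.0, margin=2.0, confidence=0.95))  # prints 217
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.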

For unknown population standard deviation

If σ is not known, then you can replace it with s as long as the resulting sample size is greater than 30:

n = (zs / E)²

Example 5

For a population that is normally distributed with σ = 12, what sample size is required to achieve a margin of error, E = 3, if the desired confidence level is 95%?


Practice Problems

Practice Problem 1 (OE)

Students in a first-year psychology class are required to maintain an average of 55% to pass. To determine whether this average is being maintained, the instructor took a random sample of 15 students and found the average to be 52% and the standard deviation to be 12%.

a) What is the margin of error for the true average grade for a confidence level of 90%?


b) What is the 90% confidence interval for the true average grade in the above case?


Practice Problem 2 (OE)

If the same instructor wants to redo his survey and ensure a margin of error no larger than 3, what sample size would you recommend he use if he wants to maintain the confidence level of 90%?

