Topic:Introduction to Probability

From SharedExperienceProject

Topic Highlights

(What you will learn)

  • Basics of experiments, events and probabilities
  • Different definitions of probabilities
  • Basics of contingency tables, and the relationship of probabilities to frequency distributions
  • Different types of probabilities, including marginal, joint and conditional
  • Why these are important and how to create and interpret them

Introduction and Motivation

(Why learn it)

Do you remember when we first discussed why statistics are important in business? (Recall the topic Introduction to Business Statistics if not.) We talked about descriptive statistics as the process of representing the real world based on a model and some data we might collect. We then talked about inferential statistics as the process of making decisions based on that representation.

Well, probabilities lie somewhere in between these processes. When using probabilities, we are measuring the chance that an outcome will occur, given the data we have collected and described.

This is critical for a business manager who wants to make the best possible decision: what are the chances of each possible outcome?

Learning Activities

(How the levels of understanding will be gained)

Learning activities for this topic
Type                   Name    Direction
Reading                        Self-directed
In-class worksheet             Instructor-directed
In-class discussion            Instructor-directed
Practice problems              Self-directed
Personal activities            Self-directed

Learning Objectives

(Levels of understanding to be gained)

Learning objectives for this topic
Level of Understanding Objective(s)
Very best
Highly satisfactory
  • Can I solve Practice Problems 4 and 5?
Satisfactory
  • Can I solve Practice Problems 1, 2 and 3?
  • Am I familiar with the equations we used in this topic? Are they on my formula sheet?
  • Can I compute joint and conditional probabilities from a contingency table?
Maybe just enough to pass

Lecture Notes: The Probability Concept

These notes are intended to facilitate a discussion of the basic concept of a probability.

What is probability?

Let's start with some definitions that will allow us to define the concept of probability.

Experiment

An experiment is an activity for which the outcome is uncertain. We carry out experiments to collect data from which we can learn about the variable that is of interest to us. Experiments we have already discussed include the following:

  • Filling in the Who's in the Room? survey we looked at in the topic Variables in Data Sets
  • Sending out a survey to collect data about a target market
  • Rolling two dice to learn about the odds related to the sum of two dice
  • Taking an exam

A repeated experiment is simply an experiment that is carried out more than once:

  • Recall that collecting more data allows us to better represent the parameters of a population using the statistics derived from a sample

Outcome

An outcome is a result of the experiment.

Building on the dice example, possible outcomes include rolling:

  • A 1 and a 1, a 1 and a 2, a 1 and a 3, a 1 and a 4, a 1 and a 5, a 1 and a 6, a 2 and a 1, and so on ...
  • We would write these outcomes as: (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), ...
  • This is an example of a discrete distribution

Building on the example of a student taking an exam:

  • Possible outcomes include any score between 0 and 100.0%
  • This is an example of a continuous distribution

The set of all possible outcomes for an experiment is known as the sample space.
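As a minimal sketch of what a sample space looks like in practice, the following Python snippet lists every outcome for the two-dice experiment. It assumes two ordinary six-sided dice; the variable names are illustrative only.

```python
from itertools import product

# Every ordered pair (first die, second die) for two six-sided dice.
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

print(len(sample_space))   # 36 outcomes in total
print(sample_space[:7])    # [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1)]
```

The outcomes are kept as ordered pairs, so (1,2) and (2,1) count as two different outcomes, just as in the list above.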

Event

When speaking of probabilities, an event consists of one or more possible outcomes of the experiment. Events usually correspond to variables of interest.

For example:

  • A gambler might ask what the probability is of rolling a sum of 4 with the two dice - this is the event, and it occurs with any of the following outcomes: (1,3), (2,2) and (3,1)
  • She might also ask what the probability is of rolling the highest combined score possible - this is the event, and it occurs only with the following outcome: (6,6)

For example:

  • Events of interest for a student taking an exam include passing and failing - passing might be the result of any outcome equal to or above 50.0%, and failing would then be the result of any outcome below 50.0%

Probability

Generally, when speaking of probabilities in business settings, we are seeking the probability of an event occurring. For an event A, we would represent this as P(A).

Building on the definitions given above, we can now define probability as the relative frequency with which an event occurs when the experiment is repeated many times.

We can represent this as:

Image:Probability_equation_1.png

Let's look at two ways of getting a probability: 1) from all possible outcomes, and 2) from observation.

Probability from all possible outcomes

If you know all of the possible outcomes of an experiment, then it is possible to find the probability of an event A as follows:

P(A) = m / n

where n is the number of possible outcomes and m is the number of those outcomes in which the event A occurs. This only works if every outcome is equally likely.
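The counting rule P(A) = m/n can be sketched in a few lines of Python. This is an illustration under the assumption that every outcome is equally likely; the helper name classical_probability is hypothetical, and the two events are the gambler's questions from the Event section.

```python
from fractions import Fraction
from itertools import product

def classical_probability(sample_space, event):
    """Return m/n, assuming every outcome in sample_space is equally likely."""
    outcomes = list(sample_space)
    m = sum(1 for outcome in outcomes if event(outcome))  # outcomes where A occurs
    n = len(outcomes)                                     # all possible outcomes
    return Fraction(m, n)

two_dice = list(product(range(1, 7), repeat=2))

p_sum_is_4 = classical_probability(two_dice, lambda roll: sum(roll) == 4)
p_highest = classical_probability(two_dice, lambda roll: roll == (6, 6))

print(p_sum_is_4)  # 1/12, from the 3 outcomes (1,3), (2,2), (3,1) out of 36
print(p_highest)   # 1/36
```

Using Fraction keeps the answer exact, which makes it easy to compare with a result worked out by hand.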

Let's look at some examples.

Example 1

If you roll a six-sided die, what is the probability of rolling a 3?


Example 2

If you flip two coins:

a) what are the possible outcomes?

b) what is the probability of getting two heads?

c) what is the probability of getting a head and a tail in any order?


Example 3

If you roll two four-sided dice:

a) what is the sample space?

b) what is the probability of getting a sum of 4?


Probability from observations

What if you don't know all possible outcomes (as we did in the examples above)?

It's actually fairly simple. All we can do is observe the variable of interest many times and assume that the observed behavior will apply in the future.

Here:

P(A) = m / n

where n is the number of observations you make, and m is the number of occurrences of the event A.
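To see how an observed relative frequency approaches a probability, here is a small simulation sketch. It assumes we pretend not to know the sample space and simply "observe" many rolls of two dice; the function names and the fixed seed are illustrative choices, not part of the course material.

```python
import random

def estimate_probability(trial, event, n, seed=0):
    """Estimate P(A) as m/n from n simulated observations."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    m = sum(1 for _ in range(n) if event(trial(rng)))
    return m / n

def roll_two_dice(rng):
    """One observation: the faces shown by two six-sided dice."""
    return rng.randint(1, 6), rng.randint(1, 6)

estimate = estimate_probability(roll_two_dice, lambda roll: sum(roll) == 4, n=100_000)
print(estimate)  # should land close to 3/36, which is about 0.083
```

With only a handful of observations the estimate can be far off; it settles down as n grows, which is why collecting more data gives a better guess.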

Example 4

Suppose you conduct a random survey of 1000 teenage girls, in an effort to understand the target market for a fancy new running shoe. If the household incomes of 310 of those girls are greater than $42,000, then what is your best guess of:

a) The probability of the household income of teenage girls' homes across Canada being greater than $42,000?

b) The probability of the same group having a household income of $42,000 or less?


Complement

The above example gives rise to the concept of a complement. The complement of event A is the event that A does not occur. Because A either occurs or does not occur, the probability of A and the probability of its complement must add up to 1.0.

For example, if you survey 200 people and 105 are male, then 95 must be female:

P(male) = 1.0 - P(female)

You can check this:

105/200 = 1.0 - 95/200 which is true.
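The same check can be written as a short Python sketch using the survey counts above. The helper name complement is hypothetical, and exact fractions are used so that rounding cannot spoil the comparison.

```python
from fractions import Fraction

def complement(p):
    """Probability that an event does NOT occur, given the probability p that it does."""
    return 1 - p

p_male = Fraction(105, 200)
p_female = Fraction(95, 200)

print(complement(p_female) == p_male)  # True
print(p_male + p_female)               # 1
```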

Lecture Notes: Different Types of Probabilities

The following notes are intended to facilitate a discussion about the three different types of probabilities: marginal, joint and conditional.

Marginal Probabilities

A marginal probability is the probability of any one single event and can be determined by merely counting, i.e. from a frequency distribution table. Let's have a look at an example.

Example 5

Let's pick up with our marketing manager - the one who is carrying out a survey of teen girls to try and assess the potential market for a fancy new running shoe.

  • So far, she's collected data from 200 respondents
  • She's quite keen to analyze this initial data and learn as much as possible so she can impress her boss at their weekly meeting 
  • This is a description of the data she has (notice her very fine use of frequency distribution tables):
Image:Contingency_table_example_-_raw_data.PNG

a) From these tables, can you compute the probability of a respondent:

  • Being interested in the shoes, P(I)?
  • Not being interested in the shoes, P(N)?
  • Being under 18 years of age, P(U)?
  • Being older than 18 years of age, P(O)?
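As a rough sketch of how marginal probabilities come from simple counting, the snippet below uses the age counts that appear in this topic: 200 respondents in total, of whom 140 are under 18 (see Example 6), which leaves 60 over 18. The interest counts are only shown in the table image, so they are left out here.

```python
from fractions import Fraction

# Frequency distribution for age. U comes from Example 6; O is 200 - 140.
age_counts = {"U": 140, "O": 60}
total = sum(age_counts.values())

# A marginal probability is one class count divided by the total count.
marginals = {age: Fraction(count, total) for age, count in age_counts.items()}

print(marginals["U"])  # 7/10
print(marginals["O"])  # 3/10
```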


b) From the same tables, can you compute the probability of respondents:

  • Being interested in the shoes and being over 18?
  • Being interested in the shoes if they are over 18?


Contingency Tables

We just got a glimpse in part b of Example 5 of the concepts of joint and conditional probabilities. Let's take a deeper look by considering a new kind of table.

The following is an empty contingency table. You should notice that the data from the simple frequency distribution tables is placed at the edges of the table:

Image:Contingency_table_example_-_empty.PNG

Now imagine that our marketing manager had not only counted the total number in each class (I, N, O and U) as she reviewed the cards - imagine that she had also counted the number of respondents under 18 that had indicated that they were interested in the running shoe.

Example 6

If 84 of the 140 respondents under 18 (U) said they were interested (I), then:

a) How many of the respondents under 18 must have said they were not interested (N)?


b) How would you place the values 84 and 56 on the contingency table?


c) Can you fill in the rest of the contingency table?
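The filling-in logic for a two-by-two contingency table can be sketched as follows. Every missing cell is found by subtraction from a row or column total. The totals of 200 respondents, 140 under 18, and 84 interested respondents under 18 come from the text; the total number of interested respondents is only shown in the table image, so the value used here is a hypothetical placeholder and should be replaced with the real one. The function name is also just an illustration.

```python
def fill_contingency_table(total, under_total, interested_total, under_interested):
    """Fill a 2x2 table of counts from the edge totals and one inner cell."""
    over_total = total - under_total
    over_interested = interested_total - under_interested
    table = {
        ("I", "U"): under_interested,
        ("N", "U"): under_total - under_interested,
        ("I", "O"): over_interested,
        ("N", "O"): over_total - over_interested,
    }
    if min(table.values()) < 0:
        raise ValueError("the totals are inconsistent with the given cell")
    return table

# HYPOTHETICAL value, NOT from the source: the real total is in the table image.
HYPOTHETICAL_INTERESTED_TOTAL = 120

table = fill_contingency_table(
    total=200,
    under_total=140,
    interested_total=HYPOTHETICAL_INTERESTED_TOTAL,
    under_interested=84,
)
print(table[("N", "U")])  # 56, which does not depend on the hypothetical total
```

Only the two "over 18" cells depend on the placeholder; the 56 follows directly from 140 - 84.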


Joint and Conditional Probability

Now, with a completed contingency table in hand, you should be able to answer the earlier questions from part b) of Example 5. What is the probability of a respondent:

  • Being interested in the shoes and being over 18?
  • Being interested in the shoes if they are over 18?

Let's try these in the following examples...

Example 7

If you are given the completed contingency table:

a) How many respondents are interested in the shoes and over 18?

b) What is the probability of a respondent being interested in the shoes and over 18?

Image:Contingency_table_example_-_joint_probs.PNG


Example 8

Also using the contingency table:

a) How many respondents are over 18 (O)?

b) How many of those are interested in the shoe (I)?

c) So, what is the probability of a respondent being interested (I) in the shoes if they are over 18 (O)?


We'll see more on the concepts of joint and conditional probabilities in the next topic. For now, you should recognize the difference between the two, and notice the way we write them:

Joint:
  • P( I and O)
  • Usually referred to as the probability of I and O
Conditional:
  • P( I | O )
  • Usually referred to as the probability of I given O
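The difference between the two is easiest to see in the denominator: a joint probability divides by the total number of respondents, while a conditional probability divides only by the number of respondents in the "given" class. The sketch below shows this with a small table of counts. The under-18 counts (84 and 56) come from Example 6; the over-18 counts are hypothetical placeholders, because the real values are only shown in the table image.

```python
from fractions import Fraction

# Counts keyed by (interest, age).
counts = {
    ("I", "U"): 84,
    ("N", "U"): 56,
    ("I", "O"): 36,  # HYPOTHETICAL, not from the source
    ("N", "O"): 24,  # HYPOTHETICAL, not from the source
}
total = sum(counts.values())

def joint(interest, age):
    """P(interest and age): the cell count over ALL respondents."""
    return Fraction(counts[(interest, age)], total)

def conditional(interest, given_age):
    """P(interest | given_age): the cell count over respondents in that age class only."""
    age_total = sum(n for (_, age), n in counts.items() if age == given_age)
    return Fraction(counts[(interest, given_age)], age_total)

print(joint("I", "O"))        # 9/50 with these placeholder counts
print(conditional("I", "O"))  # 3/5 with these placeholder counts
```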

Practice Problems

We've covered the basics. Now build your skills with the following problems. Don't look at the solutions until you've worked each problem through.

Practice Problem 1

In Example 3, which event occurs more often than any other if you are summing the two rolls?


Practice Problem 2

In Example 6, what would you expect the probability to be that a future respondent is under 18, given that they are interested in the shoes?


Practice Problem 3 (updated)

Imagine the company discussed in Example 6 is interested in what secondary sales they could make to people over 18. What is the probability that a potential customer will be over 18 and interested in the shoes?


Practice Problem 4 (updated)

Same company and data as above. If you survey another 1000 people, how many of those over 18 would you expect not to be interested in the shoes?


Practice Problem 5

Same company and data again. If a respondent is not interested, what is the probability that they are not under 18?

Footnote

  1. We will tackle the most difficult problems in a later topic - this one is just to get the basics down.