Topic:Coding Research Surveys in Excel

From SharedExperienceProject

Jump to: navigation, search

Contents

Topic Highlights

(What you will learn)

  • How to use Excel to code the data obtained in a marketing research survey
  • How to count the frequency of each response type

Introduction and Motivation

(Why learn it)

Descriptive marketing research methods most often require the collection and analysis of quantitative data using primary methods such as surveys. After learning how to design and administer a survey instrument, it is critical that you learn how to code and analyze the results. This topic introduces you to how Excel can be used to code surveys. Although most marketing research organizations often prefer to code and analyze survey data using a statistical analysis package like SPSS, Excel can play one of several important roles:

  • It is a useful learning tool for first-time researchers - if you can code data in Excel, you will be able to code it in a package like SPSS
  • It allows the researcher to take a first look at his or her data, e.g. while still in the field
  • Excel is universal and so provides a useful tool for communicating with the client (who may not have access to a package like SPSS)

Under Final Note and Samples (below in this page), you can download the sample Excel file used in this tutorial. You can also work through the tutorial yourself and create it from scratch.

Learning Activities

(How the levels of understanding will be gained)

Learning activities for this topic
Type Name Direction
Reading
Self-directed
Laboratory
  • This topic usually accompanies a laboratory session where the steps in the topic notes can be attempted and feedback can be obtained
  • You can download the sample Excel files below under Final Note and Samples
Instructor-directed
Personal activities
  • e.g. reflection and review
Self-directed

Topic Notes: Entering and Coding the Data

These topic notes provide an introduction to entering and coding survey data in Excel. We'll use a simple sample survey as an example, and to get familiar with the basic steps, we'll start by looking at small sample of the entire data set.

If you need a refresher on variable types, including dependent and independent, and categorical and numerical, we recommend you check out these sites: Variables and Variables in Data Sets.

Sample Survey

Let's assume that you've administered a survey like this one to everyone in your class:

And, to keep things simple for the purposes of this topic, let's only code the data for the following open-ended questions:

  • Question 2: Gender
  • Question 3: Age
  • Question 4: Eye color

Sample Data

Let's further assume that you have the following data for these questions for the first four respondents. The goal of this exercise is to learn how to enter and code this data.

Image:Coding_in_Excel_1.PNG

Setting up the spreadsheet

The first thing you need to do is create a new spreadsheet into which you can enter the data.

1. Open the spreadsheet:

  • From the taskbar, choose Start > All Programs > Microsoft Office > Microsoft Office Excel 2003
  • A window will appear with an empty worksheet

2. Save the new worksheet:

  • Select File > Save As from the menu bar
  • Choose an appropriate location for the file on your computer's hard drive or on your network drive
  • Enter a suitable filename, such as "A Survey of My Classmates.xls"
  • Press the button marked Save

3. Enter the column headings, so your worksheet looks like the one below. You will need to adjust the widths of the columns to make it look like this one. You may also wish to rename the current sheet to "Survey Data", as has been done below.


Entering the data

Next you need to enter the sample data given above. This is done and shown below.

Notice the following:

  • It is common practice to assign an ID# to each respondent's survey. By writing this ID# on the survey itself and in the spreadsheet, you can always go back to the original survey instruments - even if they get out of order in the future.
  • For categorical variables, we write the full name corresponding to the data types, e.g. "male" and "green"


You've entered the data. Now, you're ready to code the categorical variables.

Coding Specification

In order to code the data, you need to specify what numerical values will correspond to each category. This is done using something known as a coding specification, which is nothing more than a table specifying which numerical code you want to have correspond to each category.

For example, the coding specification for gender might be as follows:

Image:Coding_in_Excel_5.PNG

In other words, we want responses of "male" to be represented by the number 1, and responses of "female" to be represented by the number 2.

The same thing might be done as follows for eye color:

Image:Coding_in_Excel_6.PNG

The next step is to put these coding specification tables into your Excel worksheet. This is done in the example below[1].


Coding the Data

Now you're ready to code the data. After setting up a table, the coding is done in two steps: 1) Sorting the coding specification tables, and 2) Doing a table lookup. Let's have a look, starting by setting up the table.

Setting up the table

Start by putting in a new heading row, below which the coded data will be placed[1]. This is done below under the title "Coded Survey Data":


Now you can copy the ID# and Age data to the new table. These are numerical variables and so do not need to be coded:


Finally you're ready to code the categorical data. Let's start with the gender data. As you know from our earlier discussion of the coding specification, the goal in coding the gender data is to assign numerical codes to the categorical values "male" and "female". We do this in two steps.

Step 1: Sorting the coding specification table

Excel requires the coding specification table to be sorted alphabetically. Do this by:

1) Selecting the data in that table
2) Using the menu to select Data > Sort
3) Sorting the data in Ascending order by Value

These steps are shown below for the Gender specification.

You should repeat them for the Eye color specification.


The result is shown below for Gender and Eye color:

Image:Coding_in_Excel_11.PNG


Now you're ready for the final step:

Step 2: Doing a table lookup

In this step, you need to use the Excel LOOKUP function to fill in the data for the Gender cells (B12:B15) and for the Eye color cells (D12:D15).

There are two ways of doing this. The first is to specify the LOOKUP function manually in the cells. For example for the first row in the Gender column (cell B12), enter:

=LOOKUP(B4,F5:F6,G5:G6)

where:

cells F5:F6 are the left side of the Gender coding specification table
cells G5:G6 are the right side

This results in the following:


The second way to do a table lookup is to use the menus. Use the Insert Function button, fx, to locate the LOOKUP function:

Image:Coding_in_Excel_14.PNG

and then fill in the appropriate values as follows.


You should then repeat these steps for each point under Gender and Eye color to obtain the coded values shown below.

You can also use absolute referencing to allow you to copy the contents of the first cell to all the others. To do this for cell B12, for example, highlight the text F5:F6,G5:G6 and then press F4. After doing this, cell B12 would like this:

=LOOKUP(B4,$F$5:$F$6,$G$5:$G$6)

Your final coded values table should look as follows:



Topic Notes: Obtaining Frequency Distributions

Frequency Distributions

Once the data have been coded, it is often necessary to come up with the frequency distribution. In other words, we want to know how many respondents there were for each data type.

This is done using Excel's COUNTIF() function as shown here[1].

The frequency of "male" responses is found as follows:


The frequency of "female" responses is found like this:


You should do the same thing for the Eye color data to obtain the following:

Image:Coding_in_Excel_18.PNG

Relative Frequency Distributions

Next, you are encouraged to compute the relative frequency distribution for your data.

You can do this as follows.

First, compute the total number of points in your frequency distribution by using the Excel SUM() function. This is shown below. Notice that


The result should look like this:

Image:Coding_in_Excel_19a.PNG

Next, you should compute the relative frequencies as follows, where the absolute referencing has been used again:


Once you do the same thing for each question, add a column title, and change the data to percentages, the resulting table should like this:

Image:Coding_in_Excel_21.PNG


Final Note and Samples

You can download the spreadsheet used in this example here:

A Survey of My Classmates - Same Sheet.xls Info_circle.png

It was noted before[1] that this example took some liberties with best practices because all of the tables in the example were included on the same worksheet in order to present them to you as easily as possible.

In practice the following three separate sheets are recommended:

1. The survey data
2. The coding specification
3. The coded data

A sample where this is done can be downloaded here:

A Survey of My Classmates.xls Info_circle.png

References

  1. 1.0 1.1 1.2 1.3 The coding specification tables and the coded data themselves have been placed on the same sheet as the data in order to keep things simple for this example. We will look later at how best practices dictate putting these on separate sheets.
Personal tools