Topic:Coding Research Surveys in Excel
From SharedExperienceProject
Contents |
Topic Highlights
(What you will learn)
- How to use Excel to code the data obtained in a marketing research survey
- How to count the frequency of each response type
Introduction and Motivation
(Why learn it)
Descriptive marketing research methods most often require the collection and analysis of quantitative data using primary methods such as surveys. After learning how to design and administer a survey instrument, it is critical that you learn how to code and analyze the results. This topic introduces you to how Excel can be used to code surveys. Although most marketing research organizations often prefer to code and analyze survey data using a statistical analysis package like SPSS, Excel can play one of several important roles:
- It is a useful learning tool for first-time researchers - if you can code data in Excel, you will be able to code it in a package like SPSS
- It allows the researcher to take a first look at his or her data, e.g. while still in the field
- Excel is universal and so provides a useful tool for communicating with the client (who may not have access to a package like SPSS)
Under Final Note and Samples (below in this page), you can download the sample Excel file used in this tutorial. You can also work through the tutorial yourself and create it from scratch.
Learning Activities
(How the levels of understanding will be gained)
| Type | Name | Direction |
| Reading |
| Self-directed |
| Laboratory |
| Instructor-directed |
| Personal activities |
| Self-directed |
Topic Notes: Entering and Coding the Data
These topic notes provide an introduction to entering and coding survey data in Excel. We'll use a simple sample survey as an example, and to get familiar with the basic steps, we'll start by looking at small sample of the entire data set.
If you need a refresher on variable types, including dependent and independent, and categorical and numerical, we recommend you check out these sites: Variables and Variables in Data Sets.
Sample Survey
Let's assume that you've administered a survey like this one to everyone in your class:
And, to keep things simple for the purposes of this topic, let's only code the data for the following open-ended questions:
- Question 2: Gender
- Question 3: Age
- Question 4: Eye color
Sample Data
Let's further assume that you have the following data for these questions for the first four respondents. The goal of this exercise is to learn how to enter and code this data.
Setting up the spreadsheet
The first thing you need to do is create a new spreadsheet into which you can enter the data.
1. Open the spreadsheet:
- From the taskbar, choose Start > All Programs > Microsoft Office > Microsoft Office Excel 2003
- A window will appear with an empty worksheet
2. Save the new worksheet:
- Select File > Save As from the menu bar
- Choose an appropriate location for the file on your computer's hard drive or on your network drive
- Enter a suitable filename, such as "A Survey of My Classmates.xls"
- Press the button marked Save
3. Enter the column headings, so your worksheet looks like the one below. You will need to adjust the widths of the columns to make it look like this one. You may also wish to rename the current sheet to "Survey Data", as has been done below.
| Setting up the Spreadsheet |
|---|
| |
Entering the data
Next you need to enter the sample data given above. This is done and shown below.
Notice the following:
- It is common practice to assign an ID# to each respondent's survey. By writing this ID# on the survey itself and in the spreadsheet, you can always go back to the original survey instruments - even if they get out of order in the future.
- For categorical variables, we write the full name corresponding to the data types, e.g. "male" and "green"
| Entering the data |
|---|
| |
You've entered the data. Now, you're ready to code the categorical variables.
Coding Specification
In order to code the data, you need to specify what numerical values will correspond to each category. This is done using something known as a coding specification, which is nothing more than a table specifying which numerical code you want to have correspond to each category.
For example, the coding specification for gender might be as follows:
In other words, we want responses of "male" to be represented by the number 1, and responses of "female" to be represented by the number 2.
The same thing might be done as follows for eye color:
The next step is to put these coding specification tables into your Excel worksheet. This is done in the example below[1].
| Entering the data |
|---|
| |
Coding the Data
Now you're ready to code the data. After setting up a table, the coding is done in two steps: 1) Sorting the coding specification tables, and 2) Doing a table lookup. Let's have a look, starting by setting up the table.
Setting up the table
Start by putting in a new heading row, below which the coded data will be placed[1]. This is done below under the title "Coded Survey Data":
| Setting up the coded data table |
|---|
| |
Now you can copy the ID# and Age data to the new table. These are numerical variables and so do not need to be coded:
| Copying the numerical data |
|---|
| |
Finally you're ready to code the categorical data. Let's start with the gender data. As you know from our earlier discussion of the coding specification, the goal in coding the gender data is to assign numerical codes to the categorical values "male" and "female". We do this in two steps.
Step 1: Sorting the coding specification table
Excel requires the coding specification table to be sorted alphabetically. Do this by:
1) Selecting the data in that table
2) Using the menu to select Data > Sort
3) Sorting the data in Ascending order by Value
These steps are shown below for the Gender specification.
You should repeat them for the Eye color specification.
| Sorting the Gender coding specification 1 |
|---|
| |
| Sorting the Gender coding specification 2 |
|---|
| |
The result is shown below for Gender and Eye color:
Now you're ready for the final step:
Step 2: Doing a table lookup
In this step, you need to use the Excel LOOKUP function to fill in the data for the Gender cells (B12:B15) and for the Eye color cells (D12:D15).
There are two ways of doing this. The first is to specify the LOOKUP function manually in the cells. For example for the first row in the Gender column (cell B12), enter:
=LOOKUP(B4,F5:F6,G5:G6)
where:
cells F5:F6 are the left side of the Gender coding specification table
cells G5:G6 are the right side
This results in the following:
| Using the LOOKUP function 1 |
|---|
| |
The second way to do a table lookup is to use the menus. Use the Insert Function button, fx, to locate the LOOKUP function:
and then fill in the appropriate values as follows.
| Using the LOOKUP function 2 |
|---|
| |
You should then repeat these steps for each point under Gender and Eye color to obtain the coded values shown below.
You can also use absolute referencing to allow you to copy the contents of the first cell to all the others. To do this for cell B12, for example, highlight the text F5:F6,G5:G6 and then press F4. After doing this, cell B12 would like this:
=LOOKUP(B4,$F$5:$F$6,$G$5:$G$6)
Your final coded values table should look as follows:
| Coded values |
|---|
| |
Topic Notes: Obtaining Frequency Distributions
Frequency Distributions
Once the data have been coded, it is often necessary to come up with the frequency distribution. In other words, we want to know how many respondents there were for each data type.
This is done using Excel's COUNTIF() function as shown here[1].
The frequency of "male" responses is found as follows:
| Frequency of "male" responses |
|---|
| |
The frequency of "female" responses is found like this:
| Frequency of "female" responses |
|---|
| |
You should do the same thing for the Eye color data to obtain the following:
Relative Frequency Distributions
Next, you are encouraged to compute the relative frequency distribution for your data.
You can do this as follows.
First, compute the total number of points in your frequency distribution by using the Excel SUM() function. This is shown below. Notice that
| Computing the total number |
|---|
| |
The result should look like this:
Next, you should compute the relative frequencies as follows, where the absolute referencing has been used again:
| Relative frequency column |
|---|
| |
Once you do the same thing for each question, add a column title, and change the data to percentages, the resulting table should like this:
Final Note and Samples
You can download the spreadsheet used in this example here:
A Survey of My Classmates - Same Sheet.xls
It was noted before[1] that this example took some liberties with best practices because all of the tables in the example were included on the same worksheet in order to present them to you as easily as possible.
In practice the following three separate sheets are recommended:
1. The survey data
2. The coding specification
3. The coded data
A sample where this is done can be downloaded here:
A Survey of My Classmates.xls

