29 July 2008

Choose Me

We support a number of different question types in our surveys. As a questionnaire designer, you can ask a question that has to be answered with a number, or with a date. Or you can ask for a long free-text answer: EFM Community calls this an "essay question." But the predominant question types are two: Choose One and Choose Many. (Older versions of EFM Feedback called these Single Select and Multiple Select.) These questions can be asked either one-by-one or grouped into tables or matrices. Maybe 80% of the questions asked in deployed surveys are either Choose One or Choose Many.

A Choose One question is usually rendered with radio buttons, though sometimes designers use a dropdown.

What is your favorite color?

Red Orange Yellow Green Blue


A Choose Many (a/k/a Choose All That Apply) question, by contrast, uses checkboxes.

What Metro line(s) do you ride regularly?

Red Orange Yellow Green Blue


These two question types look so similar, and yet there are subtle differences in the way their data is collected and analyzed. I've seen more than one analyst stumble over those differences.

Let's say that you're architecting an application that lets people design and administer surveys. What are the pitfalls? Here I'm focusing on how to model the actual survey responses that are collected, rather than how to model the metadata that constitutes the survey design.

A single database attribute is sufficient to store the respondents' answers to a Choose One. For our example question above, assign the values 1 through 5 to the five color choices. Then you can store the survey respondents' answers in a relational database column typed as an integer of some suitable size, 8 or 16 bits, perhaps. It depends on how many possible choices you want to provide for in your survey software app.
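Here is a minimal sketch of that idea, using Python's built-in sqlite3 module. The table name, column names, and the COLOR_CODES mapping are all made up for illustration; they aren't taken from any real survey product.

```python
import sqlite3

# Hypothetical coding of the five color choices as the integers 1 through 5.
COLOR_CODES = {"Red": 1, "Orange": 2, "Yellow": 3, "Green": 4, "Blue": 5}

conn = sqlite3.connect(":memory:")
# One integer column is enough to hold a Choose One answer.
conn.execute(
    "CREATE TABLE response ("
    "respondent_id INTEGER PRIMARY KEY, favorite_color INTEGER)"
)

# Three made-up respondents.
answers = [(1, "Red"), (2, "Blue"), (3, "Red")]
conn.executemany(
    "INSERT INTO response VALUES (?, ?)",
    [(rid, COLOR_CODES[color]) for rid, color in answers],
)

# Filtering on a choice is a plain equality test on that one column.
red_count = conn.execute(
    "SELECT COUNT(*) FROM response WHERE favorite_color = ?",
    (COLOR_CODES["Red"],),
).fetchone()[0]
print(red_count)  # 2
```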

On the other hand, for a Choose Many, you need as many database attributes as there are possible choices. One way to do it is to use one boolean database column for each choice: five columns in our example above. Or you could pack the answers into a bitfield, although querying and filtering on individual choices would then be a problem. Whatever path you take as the software architect, the key point is that the respondents' answers will be represented differently than for Choose One questions.
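To make the two options concrete, here is a small sketch in plain Python. The function names and the bit ordering (Red in the lowest bit) are my own assumptions, not a description of how any particular product stores its data.

```python
# Choices in a fixed order; the position in this list doubles as the bit position.
LINES = ["Red", "Orange", "Yellow", "Green", "Blue"]

def to_booleans(selected):
    """One boolean per choice, as you would store in five separate columns."""
    return {line: line in selected for line in LINES}

def to_bitfield(selected):
    """Pack the same answer into a single integer, one bit per choice."""
    bits = 0
    for position, line in enumerate(LINES):
        if line in selected:
            bits |= 1 << position
    return bits

def rides(bits, line):
    """Test one choice inside a packed answer; this needs bit masking."""
    return bool(bits & (1 << LINES.index(line)))

answer = {"Red", "Yellow"}
row = to_booleans(answer)     # {'Red': True, 'Orange': False, 'Yellow': True, ...}
packed = to_bitfield(answer)  # Red is bit 0, Yellow is bit 2, so 1 + 4
print(packed)                 # 5
print(rides(packed, "Yellow"), rides(packed, "Blue"))  # True False
```

With separate columns, "who rides the Yellow line?" is a simple test on one column. With the packed form, every such question has to go through a mask like the one in rides().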

Generally, Choose Ones are just easier to deal with: a generic design accommodates them more readily. This fact probably explains why almost all of the free online polls that you see (hosted by outfits like polldaddy) are Choose Ones.

If your survey app allows designers to change a survey's design by adding more choices after the survey has been released to the world, you can see that adding a choice to a Choose One (just another possible value to be stored) is a lot simpler than adding a choice to a Choose Many (maybe a new database column).

What is your favorite color?

Red Orange Yellow Green Blue Indigo Violet

What Metro line(s) do you ride regularly?

Red Orange Yellow Green Blue Silver Purple
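A rough sketch of the difference, again with sqlite3 and a made-up schema (only two of the Metro line columns are shown, to keep it short). Adding Indigo touches nothing but the code list; adding the Silver line means altering the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical response table: one integer column for the Choose One,
# one 0/1 column per choice for the Choose Many.
conn.execute(
    """CREATE TABLE response (
           respondent_id  INTEGER PRIMARY KEY,
           favorite_color INTEGER,
           rides_red      INTEGER NOT NULL DEFAULT 0,
           rides_blue     INTEGER NOT NULL DEFAULT 0
       )"""
)
# Respondent 1 answered before the survey design changed.
conn.execute("INSERT INTO response VALUES (1, 1, 1, 0)")

# Choose One: Indigo is just one more legal value; the schema is untouched.
color_codes = {"Red": 1, "Orange": 2, "Yellow": 3, "Green": 4, "Blue": 5}
color_codes["Indigo"] = 6

# Choose Many: the Silver line needs a new column.
conn.execute("ALTER TABLE response ADD COLUMN rides_silver INTEGER")

# Respondent 2 answered after the change: favorite color Indigo, rides Blue and Silver.
conn.execute(
    "INSERT INTO response VALUES (2, ?, 0, 1, 1)", (color_codes["Indigo"],)
)

# The earlier respondent has NULL for the new column, not 0.
early = conn.execute(
    "SELECT rides_silver FROM response WHERE respondent_id = 1"
).fetchone()[0]
print(early)  # None
```

Note the side effect in this sketch: respondents who answered before the change get a NULL for the new choice, which is not the same thing as seeing the Silver checkbox and leaving it empty. Whether to keep that distinction is a design decision.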


Even trickier is enabling a survey designer to rewrite a Choose One as a Choose Many, or vice versa.

What are your favorite color(s)?

Red Orange Yellow Green Blue

What Metro line do you ride most often?

Red Orange Yellow Green Blue


In this case, you're not simply extending a relational schema by adding columns or possible values, but rather you've got to convert one column type to another.
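As a sketch of what that conversion involves (the function names and the None-for-ambiguous rule are my own assumptions), going from Choose One to Choose Many loses nothing, but the reverse direction has no good answer for a respondent who checked two boxes or none.

```python
CHOICES = ["Red", "Orange", "Yellow", "Green", "Blue"]

def one_to_many(code):
    """Expand a Choose One code (1-5, or None if unanswered) into one flag per choice."""
    return [code == position for position in range(1, len(CHOICES) + 1)]

def many_to_one(flags):
    """Collapse flags into a single code.

    Returns None unless exactly one box was checked, because any other
    answer cannot be represented as a Choose One without guessing.
    """
    checked = [position for position, flag in enumerate(flags, start=1) if flag]
    return checked[0] if len(checked) == 1 else None

print(one_to_many(3))                               # [False, False, True, False, False]
print(many_to_one(one_to_many(3)))                  # 3
print(many_to_one([True, False, True, False, False]))  # None
```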

Reporting and analyzing survey responses for the two question types is likewise different.

For Choose One questions, you can do a frequency analysis that adds up to 100%:

What is your favorite color? (N of respondents=100)

  • 45% Red
  • 12% Orange
  • 10% Yellow
  • 17% Green
  • 16% Blue


And if your corresponding numerical scale makes sense, you can compute means and other statistics. (Granted, this makes more sense for computing something like average customer satisfaction than for computing an average favorite color.) Analysis like this can be presented effectively with a pie chart.
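Here is a small sketch of that analysis in Python. The list of answers is constructed to match the example percentages above, using the 1 through 5 coding; it isn't real survey data.

```python
from collections import Counter

# Stored Choose One answers, built to match the example: 45 Red, 12 Orange, etc.
counts_by_code = {1: 45, 2: 12, 3: 10, 4: 17, 5: 16}
answers = [code for code, count in counts_by_code.items() for _ in range(count)]

n = len(answers)
frequency = Counter(answers)

# Each respondent lands in exactly one bucket, so the shares total 100%.
percentages = {code: 100 * frequency[code] / n for code in sorted(frequency)}
print(percentages)
print(sum(percentages.values()))

# A mean is computable, though only meaningful when the codes form a real scale.
mean_code = sum(answers) / n
print(mean_code)
```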

But for Choose Many questions, the percentages generally don't add up to 100%, because each respondent can check more than one choice (or none at all).

What Metro line(s) do you ride regularly? (N of respondents=100)

  • 45% Red
  • 22% Orange
  • 65% Yellow
  • 22% Green
  • 40% Blue


An unstacked bar chart is probably the best way to present this information graphically. At least you can depend on the longest/tallest bar being no more than 100%. Also notice that you don't have a numerical scale, just booleans, so you can't compute means and standard deviations for Choose Many questions.
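The same point as a sketch in Python, with four made-up respondents. Each percentage is computed against the number of respondents, not the number of checkmarks, which is why each bar tops out at 100% while the total does not.

```python
LINES = ["Red", "Orange", "Yellow", "Green", "Blue"]

# Each respondent's answer is the set of lines they checked (made-up data).
responses = [
    {"Red", "Yellow"},
    {"Yellow"},
    {"Red", "Yellow", "Blue"},
    {"Green"},
]

n = len(responses)
# Share of respondents who checked each line.
percentages = {
    line: 100 * sum(line in answer for answer in responses) / n
    for line in LINES
}
print(percentages)

total = sum(percentages.values())
print(total)                      # more than 100 for this data
print(max(percentages.values()))  # never more than 100
```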

I haven't delved into some of the more fiddly bits of designing a survey app that supports Choose One and Choose Many questions: things like support for "none of the above," "I don't know," "not applicable to me," or "I'd rather not say" choices, or requiring that the respondent check at least one box or radio button, or at most three checkboxes.
