
June 10, 2026

How qualitative interviews before a MaxDiff survey saved us from a very expensive mistake — and why they should be a standard part of how you design a MaxDiff.

InterQ Research · June 2026 · 7 min read

We were deep in one of the highest-priority projects we’d taken on in years. The MaxDiff survey had been through multiple rounds of internal review. The feature list was tight. The design felt polished. We knew this methodology inside and out.

And then we sat down with our first two customers and watched them take it.

By the end of interview number two, it was pretty clear: we had a problem. And had we skipped that qualitative testing step — had we just hit launch on that initial survey — we would have collected a lot of data that didn’t actually reflect what customers believed. We would have been reading confusion, not preference.

This is the story of why qualitative interviews are a non-negotiable step when you’re designing a MaxDiff survey, and why the format itself makes qual testing more critical than with almost any other survey type.

First: What is a MaxDiff survey, and why does it matter?

MaxDiff (Maximum Difference Scaling, sometimes called Best-Worst Scaling) is one of the most powerful tools in the quantitative research toolkit for measuring feature or attribute importance. Unlike a standard rating scale, where respondents tend to rate nearly everything a 4 or 5, MaxDiff forces real trade-offs. Respondents have to choose, and those choices reveal genuine priority differences that traditional Likert scales often fail to surface.

For product teams, go-to-market strategy, and feature prioritization, a well-designed MaxDiff study can be genuinely revelatory. It’s the methodology that tells you not just what customers like, but what they value most when they can’t have everything.

How it works

The MaxDiff format, explained

A MaxDiff survey presents respondents with a series of sets, each containing a subset of the full feature or attribute list. In each set, respondents select the most important item and the least important item.

1. A respondent sees a set of 4–6 features at a time — not the full list.

2. They choose which is most important and which is least important to them.

3. This repeats across many sets, with features rotating in and out each time.

4. The same features reappear across multiple sets, building a robust picture of relative preference through repeated comparison.
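The rotation described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the design algorithm any particular survey platform uses: the function name build_maxdiff_sets, its parameters, and the feature labels are all hypothetical, and the greedy "show the least-seen items next" rule is a simplifying assumption. Real designs usually also balance which items appear together and where they sit on screen.

```python
import random
from collections import Counter


def build_maxdiff_sets(items, set_size=4, appearances=3, seed=0):
    """Return a list of sets in which every item appears `appearances` times.

    Hypothetical helper: a simple greedy rotation, not an optimized design.
    Assumes the entries in `items` are unique.
    """
    if set_size > len(items):
        raise ValueError("set_size cannot exceed the number of items")
    if (len(items) * appearances) % set_size != 0:
        raise ValueError("len(items) * appearances must be divisible by set_size")

    rng = random.Random(seed)
    shown = Counter({item: 0 for item in items})
    n_sets = len(items) * appearances // set_size
    sets = []
    for _ in range(n_sets):
        # Prefer the items shown least often so far; break ties at random.
        order = sorted(items, key=lambda item: (shown[item], rng.random()))
        chosen = order[:set_size]
        rng.shuffle(chosen)  # randomize the order within the set
        for item in chosen:
            shown[item] += 1
        sets.append(chosen)
    return sets


# Placeholder feature labels, purely for illustration.
features = [f"Feature {letter}" for letter in "ABCDEFGH"]
for number, feature_set in enumerate(build_maxdiff_sets(features), start=1):
    print(number, feature_set)
```

With eight features, four per set, and three appearances each, this produces six sets. Walking through the printed sets is a quick way to see the experience from the respondent's seat: the same labels keep coming back in new company.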

The statistical power here is real: because each attribute is seen multiple times in different competitive contexts, the importance score for every item on the list rests on many comparisons rather than a single rating.
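To make the idea of an importance score concrete, here is a rough sketch of the simplest scoring approach: for each item, count how often it was picked as most important, subtract how often it was picked as least important, and divide by how often it was shown. The function name count_scores, the response format, and the sample answers are assumptions made up for this example. Fielded studies typically estimate scores with a statistical model such as hierarchical Bayes or multinomial logit, so treat this as an illustration of the logic rather than a production method.

```python
from collections import defaultdict


def count_scores(responses):
    """Score each item as (times best - times worst) / times shown.

    `responses` is a list of (shown_items, best, worst) tuples, one per set
    a respondent answered. This layout is a hypothetical one for the sketch.
    """
    shown = defaultdict(int)
    best = defaultdict(int)
    worst = defaultdict(int)
    for items, best_pick, worst_pick in responses:
        if best_pick not in items or worst_pick not in items:
            raise ValueError("picks must come from the items shown in the set")
        if best_pick == worst_pick:
            raise ValueError("best and worst picks must differ")
        for item in items:
            shown[item] += 1
        best[best_pick] += 1
        worst[worst_pick] += 1
    # Scores range from -1 (always picked worst) to +1 (always picked best).
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


# Made-up answers from one respondent across three sets.
answers = [
    (["price", "speed", "support", "security"], "security", "price"),
    (["speed", "support", "reporting", "security"], "security", "reporting"),
    (["price", "speed", "reporting", "support"], "speed", "price"),
]
scores = count_scores(answers)
for item, score in sorted(scores.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{item:10s} {score:+.2f}")
```

In this toy data, "security" was picked as best both times it appeared and scores 1.0, while "price" was picked as worst both times and scores -1.0. Note that the arithmetic works the same way whether the picks were considered or careless, which is why the numbers alone cannot tell you if respondents understood the task.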

That rotation mechanic is also the source of MaxDiff’s biggest usability challenge — and the core reason why qualitative pre-testing matters so much.

Why MaxDiff is especially confusing for participants

Here’s what the experience actually feels like from the respondent’s seat: you’re asked to evaluate the same features over and over, in shifting combinations, making forced choices each time. New items appear as you go. Items you just ranked as “least important” might show up again in a new set context — and now you have to reconsider them against a different group of competitors.

For researchers, this makes total sense. We know what the algorithm is doing. We understand that the repeated exposure is the whole point. But for someone who has never seen this format before? It can feel genuinely disorienting — even arbitrary.

“Why do I keep seeing the same things?” “Am I supposed to change my answer?” “I don’t understand what they’re asking.” These are real responses we’ve heard in interview sessions. And when a respondent is confused, they don’t always abandon the survey (though some do). More often, they just answer. They click something. And that something doesn’t reflect their actual preferences — it reflects their attempt to navigate a format they don’t understand.

We had all been working on this project for so long that we couldn’t see how myopic we’d become. What seemed obvious to us was far from obvious to customers.

That’s the silent killer of a MaxDiff dataset. Not the people who drop off — you can see that in your completion rates. It’s the people who push through and give you answers that look like data but are actually noise. And because MaxDiff data looks clean and quantitative by nature, it’s easy to trust it without questioning whether respondents actually understood what they were doing.

What happened when we tested ours

On this particular project, we made the call to test the MaxDiff survey qualitatively with seven customers before launch. We had them take the survey while sharing their screen, and we asked questions as they went — probing in real time on what they were thinking, what felt confusing, and how they were interpreting each set.

It was pretty apparent after just two interviews how confusing our design was.

The issues weren’t ones we could have caught in an internal review. They were rooted in assumptions we’d stopped being able to see — the curse of expertise. We’d been living inside this feature set for months. The terminology felt self-evident to us. The logic of the question framing made complete sense from where we were sitting.

From where customers were sitting, it was a different experience.

So we iterated. After each interview, we took the feedback and refined the design — the instructions, the feature wording, the set size, the framing. Each subsequent interview got more useful as the survey got cleaner. By the time we hit our sixth and seventh sessions, respondents were sailing through. The confusion had been designed out. And what we launched was something we could actually trust.

Why qualitative testing is part of designing a MaxDiff survey, not just testing it

The instinct in quantitative research is to think of qual testing as a “nice to have” — a belt-and-suspenders step you skip when you’re confident in your design or pressed for time. We’d argue that framing gets it exactly backwards, especially for MaxDiff.

A standard rating scale survey has an inherent interpretive safety net. If someone misreads a question, they’ll still likely give you a directionally valid answer, because the format is familiar and forgiving. MaxDiff doesn’t have that safety net. The format is unusual, the repeated-comparison mechanic is counterintuitive, and the forced-choice structure means there’s no neutral ground to retreat to when a respondent is unsure what’s being asked.

In a MaxDiff, confusion doesn’t produce missing data. It produces bad data that looks like good data.

Qualitative interviews before launch are the only reliable way to find that out before it’s too late. And the format of the qual testing matters too — having respondents share their screen and think aloud while taking the survey reveals a class of comprehension problems that focus groups or post-survey debriefs simply don’t surface. You need to see people in the moment of confusion, not after they’ve rationalized their way through it.

The practical case for qual-first survey design

If you’re scoping a MaxDiff project, here’s how we think about building the qual testing step in — not as an add-on, but as part of the methodology itself:

Plan for 5–8 cognitive interview sessions. That’s usually enough to surface the major comprehension issues and see the curve of improvement. You don’t need 20. You need enough to iterate meaningfully.

Use screen sharing with concurrent think-aloud. Ask respondents to narrate their experience as they go. “What are you thinking about right now?” is often more illuminating than any follow-up probe you could design in advance.

Build iteration time into your timeline. The value of qual testing is only realized if you act on what you learn. Schedule time between sessions to refine the survey, ideally at least a day after every two interviews.

Test the instructions as hard as you test the items. MaxDiff instructions are notoriously under-tested. Respondents often don’t read them carefully the first time, and when they do, they frequently misinterpret the repeated-comparison logic. Your instructions may need to work harder than you think.

Watch for “compliance without comprehension.” The most dangerous respondent isn’t the one who says “I’m confused” — it’s the one who says nothing and clicks through anyway. Probe actively. “Can you tell me how you decided on that one?” will reveal a lot.

The bottom line

MaxDiff is a powerful methodology precisely because it surfaces real priority trade-offs. But that power is contingent on respondents actually understanding what they’re being asked to do. When comprehension breaks down (and in our experience, it breaks down more often than researchers expect), what you get isn’t preference data. It’s an artifact of the format.

Qualitative interviews before launch aren’t just best practice when designing a MaxDiff survey. They’re the step that makes the quantitative data worth trusting.

If you’re planning a MaxDiff study and want to build a qualitative testing phase into your design — or if you’re trying to figure out whether your current instrument is ready to field — we’d love to talk through it.

InterQ has been designing and fielding MaxDiff studies with embedded qualitative validation for B2B and consumer clients for over a decade. Let’s make sure your survey is field-ready before it goes out.

