"Rigor" in Quant UX Research

I’ve recently had discussions with other Quant UXRs about “rigor” and what it means in our work. I’ve often heard it raised as a question over the years, and this prompted me to compile some thoughts.
I propose a four-level stack for how I view “rigor” … but first, I’ll start with what is not rigor.
What Is Not Rigor?
Rigor is not a synonym for advanced methods. In fact, I believe that using so-called advanced methods (such as, say, causal modeling, deep learning, multilevel regression) is too often a crutch that people grab when they are confronted with a serious problem apart from analytics. A few of those problems are:
Insecurity. They want to use advanced methods to say “look how smart I am”
Poor data. They hope to rescue something when the data set is a mess
Wanting to learn something. They’d like to stretch the boundaries of their knowledge
I am not saying those problems are bad or wrong to address. Also, I am not saying that every use of an advanced method is wrong. Far from it! Quants should happily use any method that is appropriate. Rather, I am simply observing that complex methods are often chosen for reasons other than statistical suitability. And even when they are suitable, a complex analysis is not inherently more rigorous than a simple one.
Rigor is not statistical significance or p-values. First of all, as a Bayesian, I don’t find the concepts of statistical significance and p-values to be very useful, for reasons similar to those described by Frank Harrell (who TBH is a much more knowledgeable statistician than I am). Setting Bayesian stats aside, even in the frequentist world statistical significance is closely tied to null hypothesis significance testing (NHST) … and in my opinion that’s not what quants should do most of the time (see Cohen’s famous takedown of p-values and NHST).
Instead, I believe quant UXRs should use data to learn and inform decisions. Most of that process cannot be forced into an NHST paradigm, and assigning a p-value to some portion is misleading at best.
In short, statistical significance is a misleading concept. Unfortunately, those who ask about it most often may understand it the least. Again, that’s not 100% of the time; NHST has occasional uses (some forms of A/B and multivariate testing are exceptions).
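As one illustration of what “learning to inform a decision” can look like without NHST, here is a minimal Bayesian sketch (the scenario, counts, and code are my own invention, purely for illustration): compare task-success rates for two design variants with a simple Beta-Binomial posterior, and report the probability that the difference is large enough to matter for the decision at hand.

```python
# Hypothetical sketch: summarize belief about two variants' task-success rates
# instead of reporting a p-value. All counts below are invented.
import numpy as np

rng = np.random.default_rng(7)

# Invented usability-test results: successes out of trials for each variant
success_a, n_a = 42, 60
success_b, n_b = 51, 60

# Beta(1, 1) prior + binomial data -> Beta posterior draws for each success rate
post_a = rng.beta(1 + success_a, 1 + n_a - success_a, size=100_000)
post_b = rng.beta(1 + success_b, 1 + n_b - success_b, size=100_000)

diff = post_b - post_a
print(f"P(B better than A)           : {np.mean(diff > 0):.2f}")
print(f"P(improvement > 5 points)    : {np.mean(diff > 0.05):.2f}")
print(f"95% credible interval for B-A: {np.percentile(diff, [2.5, 97.5]).round(3)}")
```

The outputs map directly onto a decision question (“is the improvement big enough to act on?”) rather than onto a reject/fail-to-reject verdict.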
Rigor is not massive data. I often see studies that attempt to collect massive data, either at a sample level (such as 10000s of survey responses, or 100Ms of online users) or at a response level (such as collecting 100s or 1000s of variables per observation). The problem is that massive data is not necessarily good data, and analysis of it may increase the odds of finding spurious associations and misleading results.
In other words, a bigger load of garbage is still a load of garbage, and “advanced methods” (see above) won’t save it. Getting more data is not more “rigorous,” it is just more data, good or bad.
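To make the spurious-association point concrete, here is a toy simulation of my own (all data are random noise): with hundreds of variables per observation, some of them will correlate with any outcome at levels that look like “findings,” even though no real relationship exists.

```python
# Toy simulation: hundreds of pure-noise predictors vs. a pure-noise outcome.
# None of these variables has any real relationship to y.
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_vars = 2_000, 500

X = rng.normal(size=(n_obs, n_vars))   # 500 meaningless "behavioral metrics"
y = rng.normal(size=n_obs)             # an outcome that is pure noise

# Correlation of each noise variable with the noise outcome
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_vars)])

threshold = 1.96 / np.sqrt(n_obs)      # |r| that looks nominally "significant"
print(f"largest |correlation| found     : {np.abs(corrs).max():.3f}")
print(f"noise variables above threshold : {(np.abs(corrs) > threshold).sum()} of {n_vars}")
```

Roughly 5% of the pure-noise variables will clear the nominal threshold, and the more variables we collect, the more convincing-looking (but meaningless) associations we will find.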
Rigor is not an unwavering protocol. This is a problem I sometimes see when a researcher transitions from academia to industry. One common form is to put every UX research participant through an identical set of instructions, tasks, and data collection methods. This may occur in interviews, usability tests, or focus groups, as well as surveys. The usual motivation is to collect identical, consistent, and “unbiased” data.
The problem with a rigid protocol is this: it maximizes the defensibility of the data against criticism from other researchers (such as journal reviewers), but it does not maximize what we learn from our participants. I vastly prefer to engage with participants and adapt an interview or other protocol in response to what I hear. If that yields “incomparable data,” then it is my problem to determine how to have impact with stakeholders … but however I resolve that, I will have learned more from participants than I would by subjecting them to a tedious, unwavering test. And that maximizes the business value of the research.
In summary, rigorous research does not imply or necessitate:
Advanced methods
A focus on hypothesis testing, “statistical significance,” or p-values
Massive data sets
Unwavering protocols for interviews or other data collection
What Is Rigor?
You may have noticed above that I repeatedly suggested that the point of research is to learn from our users, customers, and research participants and to inform product or other business decisions. To do that rigorously is to learn deeply and to focus relentlessly on making decisions well.
To those ends, I propose the following four areas of research rigor. I believe there is some degree of logical order to these areas, such that each one presupposes and builds on the previous one, yet we could also view them as complementary rather than strictly nested.
Rigor 1. Focus on Learning and Decisions
The first area of rigor is to focus on decisions that we need to make, and how to learn appropriately from customers, users, and others to make those decisions.
This may seem obvious and yet it often goes wrong in industry UX practice. For example, a classic usability test to find usability issues may go wrong if there is no process in place to act on and fix the issues that are found, or if management will simply dismiss them as “by design” or “won’t fix.” Likewise, a customer satisfaction (CSat) tracker is of no use unless there is a process in place to do something with the results.
To focus on decisions doesn’t mean to reduce everything to an A/B test or a specific product decision. A “decision” may mean a product action such as changing a feature but it may also mean to take some other action. For instance, if CSat drops on a tracking survey, an appropriate decision may be to launch a follow-up survey, to do usability testing, or to conduct depth interviews.
The important point is to have a decision process in place before research commences. That helps both to frame the research and to know what to do with it rather than simply hoping for “impact.”
Rigor 2. Quality Data and Methods
Once we know what decision(s) to influence, we need high quality data and appropriate research methods. This area is most similar to traditional conceptions of “rigor” although I would emphasize a few differences. For one thing, as I noted above, it doesn’t imply rigid data collection or advanced methods.
Also, to address a decision point rigorously, we typically need multiple kinds of information. The proverbial “quant-qual sandwich” (alternating quantitative and qualitative projects) is one approach. Perhaps the most common question in this realm is “why?” Suppose customers prefer some potential feature on a survey. That answer is insufficient; we also need to know “why?” We may also wish to know “what else?” and to gain insight into “instead of what?” … for example, whether their preference comes at the expense of brand perception, another product of ours, an improved competitive position, and so forth.
From a methods point of view, this implies two things. First, the methods required may be straightforward. If multiple sets of data clearly agree (or clearly disagree), there is no need for fancy analysis to see it. Second, it implies that simply “getting an answer” is not enough. The point of having a specific decision frame (“Rigor 1” above) is not to exclude broader learning; rather, it is to set a target that we can hit directly and, ideally, surround with complementary insight.
Rigor 3. Effective Stakeholder Engagement
The third aspect of rigor is to ensure that you have both the relevant processes and skills to engage with stakeholders. Too often, I meet Quant UX researchers who hope that “the data speak for themselves.” But data sets don’t speak — that is our job.
In an ideal situation, a research team may have multiple members who divide the tasks of stakeholder engagement, research operations, analytics, and so forth. It is unreasonable to expect a single researcher to be highly competent in all of those areas.
Now, you might wonder, “Wait, why is stakeholder engagement at level 3 instead of being a baseline at level 1?” The reason is simple: I’ve met too many UXRs, marketing researchers, and even academic researchers who are great at stakeholder engagement but who do not have the technical skills to deliver quality research. Without quality research, great stakeholder engagement and storytelling too often end up as random walks or exercises in self-congratulation.
What is effective engagement? I would identify a few aspects. First, it is decision-oriented: the results are both clear and directly informative of the decisions to be made. Second, it pushes back on stakeholders to clarify the questions and says “no” when a project or question is infeasible. Third, it is unafraid to deliver negative results even when they are unpopular, such as, “No, users just don’t want that.” Fourth, it does all of this in a way that stakeholders will hear and value the results, and will want more in the future.
Is all of that a tall order and difficult to accomplish? Yes, and that’s why it is level 3 in my stack of “rigor”.
One other note: too many organizations gaslight researchers into believing that it is a researcher’s fault for not having enough “impact” or poor stakeholder engagement. These relationships require two parties (at least) and failure, poor fit, or less-than-useful results do not arise because of researchers alone. Sometimes the best way to have better stakeholder relations is to change stakeholders (i.e., teams, organizations, jobs, or industries).
As a side note, I’ll call out an implication of areas 1 + 2 + 3 when they are taken together: Rigor is not about the quantity of research, but about how well it addresses crucial problems.
It is better to determine the most important stakeholder questions and address them head-on than to produce an endless stream of learning about users or answers to small-stakes questions. And that leads us to the final area …
Rigor 4. Attention to Higher-Order Strategic Decisions
The final area of “rigor” builds on effective research and stakeholder engagement to deliver broader insight for a business. Unlike areas 1, 2, and 3 above, this is a relatively rare area, but one that is rewarding to both organizations and researchers if (and only if) they are prepared for it.
This involves attention to issues including:
Opportunity costs of both product decisions and research engagement. If we do X, what does that mean for all of the things T, U, V, and W that we did not do? If we spend time researching the hot question of the day, what are we not learning that could be more valuable?
The asymmetric payoff matrix for all decisions and findings. For any given research result, there is some chance we will be wrong; and for any decision there is an expected benefit as well as a potential cost. When we put those together, we can better inform product and business decisions. A full consideration of this area is far beyond the scope here (though a small worked example follows this list), but I will note two extremely common failure points in this area: (a) a presumption that a product, feature, service, or business will happen, paired with a research demand to “validate” it, failing to consider the alternative that the business may be unneeded or impossible; and (b) a presumption that research should occur just because a question is “important.”
Models for research allocation and staffing. With good stakeholder engagement, credibility, and skills in broad research for decision making, we can turn those skills to address questions such as, “What should we be doing with research?”, “What are the things we don’t know?”, and “What is the right mix of research staffing?” The skills needed to answer those questions are exactly the same as those needed to answer any other research question. If you’re thinking, “Wait! That would be a massive research project,” then I’ll point out again that research does not require “statistical significance” or large data sets. Instead, it requires an important decision plus learning to answer it … and even when an answer is imperfect, the expected cost of not answering at all is much greater than that of an approximate but useful answer.
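To make the asymmetric-payoff idea concrete, here is a deliberately tiny, entirely hypothetical expected-value sketch (all numbers are invented): even when users probably want a feature, an asymmetric downside can make shipping without research a losing bet, and an imperfect study can still carry positive expected value.

```python
# Hypothetical payoff sketch; all probabilities and payoffs are invented.
p_want = 0.7             # prior: chance users actually want the feature
gain_if_right = 2.0      # upside (arbitrary units) if we ship and they want it
loss_if_wrong = -10.0    # downside if we ship and they don't (asymmetric)

ev_ship_blind = p_want * gain_if_right + (1 - p_want) * loss_if_wrong
print(f"EV of shipping without research : {ev_ship_blind:+.2f}")   # negative here

# Suppose a modest study calls "want vs. don't want" correctly 80% of the time.
acc = 0.8
p_yes = p_want * acc + (1 - p_want) * (1 - acc)      # P(study says "want")
p_want_given_yes = (p_want * acc) / p_yes            # Bayes' rule
ev_ship_after_yes = (p_want_given_yes * gain_if_right
                     + (1 - p_want_given_yes) * loss_if_wrong)

# Policy: ship only if the study says "want"; otherwise hold (EV = 0).
ev_with_study = p_yes * max(ev_ship_after_yes, 0.0)
print(f"EV of deciding after the study  : {ev_with_study:+.2f}")
print(f"Expected value of the research  : {ev_with_study - max(ev_ship_blind, 0):+.2f}")
```

The same arithmetic also runs in reverse: if the downside were small and the prior strong, the research might add nothing, which is exactly the opportunity-cost point above.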
There is one particular failure mode of “strategic UX research” that I would like to call out: research into product strategy that is untethered to any particular decision or decision process, and that instead merely hopes to find influence by calling itself strategic. Too many projects in “foundational research” fall into this category. They learn a lot about customers but do not have the decision orientation, stakeholder buy-in, quality data, and attention to asymmetric results that are required actually to inform strategy. Sadly, such projects all too often end up with the UX researchers showing up like junior product managers, advocating for some particular direction in hopes they will win a battle for influence.
A common question about Level 4 strategic research is, “How do I get there? How can I sell it?” My take is that it can’t and shouldn’t be pressed as a direction by researchers. Instead, it requires (a) building the skills needed to do it, such as depth in game-theoretic thinking, (b) great stakeholder engagement based on prior, more focused success, and (c) waiting for the right time and set of questions.
Conclusion
I’ll boil this post down to two sentences:
UX Research rigor is not a property of methods or data. Instead, it is about informing effective decisions, under conditions of uncertainty, through appropriate learning from our users and customers.
If I had to emphasize one skill for effective and rigorous research — and really, it is a frame of thought more than a skill as such — it would be to ask “what if?” incessantly. What if … I used 4 small sample methods instead of 1 large project? What if … the answer from customers is “no”? What if … we didn’t do this project? What if ... we shipped this feature or this product and we are wrong?
When that orientation is paired with individual skill in data collection and analysis, you will be well on the way to having truly “rigorous” research!




