It's a Pain to Research SBIRT

Screening, Brief Intervention and Referral to Treatment has a lot of moving parts. That makes measurement tricky.

By now, most people in the addiction field have heard about the two studies published in the Journal of the American Medical Association that show SBIRT doesn’t work.

Wait, what was that? Okay, I just said that to get your attention.

Actually, 2014 studies by Roy-Byrne et al. and Saitz et al. concluded that a brief intervention (or series of brief interventions) about illicit drug use in primary care did not a) decrease patients’ drug use, b) decrease behaviors like unsafe sex or other HIV risk behaviors, c) increase behaviors like treatment-seeking and mutual aid attendance, or d) improve patients’ health.

Reading the studies, and some reactions to them, has gotten me thinking about some of the finer points of researching SBIRT.

Doing research on SBIRT is a complex and time-consuming enterprise. What is SBIRT? What does it mean for it to “work”? These questions aren’t as simple as they seem.

The What

Let’s start with the intervention itself. What is it? At first glance, this may seem straightforward: SBIRT is the intervention. Except that SBIRT encompasses a couple of different evidence-based practices: screening and brief intervention. And it includes a component for which there isn’t much of an evidence base: referral to treatment. There is research supporting treatment itself, but not much on the process of referring to treatment.

Furthermore, there are various approaches to each of these components.

Screening is probably the first thing people think of when they think of SBIRT. There is strong evidence for the validity and reliability of several screening instruments to detect risky substance use. But not all screens look the same. For alcohol alone, screening instruments range from one to ten questions and beyond. Same story for drugs. There is a validated one-item screening questionnaire for drug use, just as there is for alcohol.

Are these screens all created equal in terms of study design? Perhaps not, as we will see when we talk about assessment reactivity below. And the screen is the (relatively) simple part of SBIRT!

Then we get into the brief intervention, the varieties of which can get mind-boggling. What model? How many sessions? How long? How frequent? And on and on.

Finally, SBIRT also encompasses the referral to treatment, easily the least-studied component of the three. Lots of people talk about “warm handoffs,” for example, between general medical settings and specialty addiction treatment. But there isn’t agreement about what that ought to look like.

What It’s Not

Along with the intervention itself, we need to talk about what we are comparing the intervention to. We call this the control condition. This is a sticky issue in SBIRT research for a number of reasons. Let’s go back and think about the screen.

There is a phenomenon called “assessment reactivity,” whereby when you measure a behavior, that behavior changes even if no attempt is made to change it. If you’ve ever used one of those tracking tools to measure your eating or exercise habits, you have been affected by this. Try it. For most people, if you write down what you eat for a week, you will eat less than if you had not been tracking. Similarly, if you are asked reasonably detailed questions about how much you drink over a period of time, you are likely to drink a little less than you would otherwise.

As a result, control groups in brief intervention (BI) studies consistently show decreases in alcohol consumption outcomes.

This Fitbit has assessment reactivity written all over it. Credit: Alper Cugun

Just by comprehensively screening/assessing people in a BI study, we may be accidentally giving both groups an “intervention” that motivates them to reflect on and reduce their drinking.

What we wanted was for one group to get an intervention (the BI) while those in the other get “usual care,” or some minimal form of feedback/attention.

But, because of assessment reactivity, both groups can end up reducing their drinking, and then we don’t see enough of a difference in drinking rates or quantity between the experimental (BI) group and the control (usual care) group. There may be a difference, but it isn’t statistically significant (i.e., it’s small enough that it could plausibly have occurred by chance). This means we have to conclude that the intervention did not appear to be efficacious for reducing drinking.

But is this truly the case, or did we just inadvertently mask the effect of the intervention by causing both groups to reduce their drinking through too much assessment at baseline?
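To make the masking problem concrete, here is a minimal Python sketch. Every number in it is made up for illustration and is not drawn from the JAMA studies or any other trial, and the function name and its parameters are hypothetical. It assumes both groups drink less simply because they were assessed, and that the BI adds a smaller drop on top of that.

```python
import random
import statistics


def simulate_trial(n_per_group, baseline, reactivity_drop, bi_drop, noise_sd, seed=0):
    """Simulate follow-up drinks per week in a two-arm BI trial.

    All inputs are hypothetical illustration values, not real data.
    """
    rng = random.Random(seed)
    # Both arms are assessed at baseline, so both get the reactivity drop.
    control = [rng.gauss(baseline - reactivity_drop, noise_sd)
               for _ in range(n_per_group)]
    # Only the BI arm gets the extra drop assumed to come from the intervention.
    treated = [rng.gauss(baseline - reactivity_drop - bi_drop, noise_sd)
               for _ in range(n_per_group)]

    control_mean = statistics.mean(control)
    treated_mean = statistics.mean(treated)
    # Standard error of the difference between two independent group means.
    std_error = (statistics.variance(control) / n_per_group
                 + statistics.variance(treated) / n_per_group) ** 0.5
    return {
        "control_change": control_mean - baseline,   # change seen in usual care
        "bi_change": treated_mean - baseline,        # change seen in the BI group
        "between_group_diff": treated_mean - control_mean,
        "std_error": std_error,
    }


if __name__ == "__main__":
    # Assumed values: 20 drinks/week at baseline, a 4-drink drop from
    # assessment alone, and 1 additional drink attributable to the BI.
    result = simulate_trial(n_per_group=100, baseline=20.0,
                            reactivity_drop=4.0, bi_drop=1.0, noise_sd=8.0)
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the BI group improves a lot relative to its own baseline, but so does the control group. The trial only "sees" the between-group difference, which is small and can easily sit within a couple of standard errors of zero when individual drinking varies a lot.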

This simultaneously points to the power of the screen and the difficulty of measuring SBIRT’s effectiveness. That is, if you have to measure alcohol use to determine who gets a brief intervention, you have likely already had an effect on that patient’s alcohol use and begun SBIRT. On the other hand, how can you track the outcome of alcohol use in the control condition if you are not measuring it?

Sheesh. It’s enough to cause a researcher to throw up her hands.

The Who

The population of study is critical when interpreting efficacy and especially effectiveness. Why? Despite the notion of SBIRT as a universal public health model, no one has claimed it is effective for everyone.

In thinking about alcohol users, let’s refer to the classic drinker’s pyramid. SBIRT is potentially appropriate for the large majority of people in the middle of the pyramid, those who would call themselves “social drinkers” and who may be drinking a little more than established “low risk limits” but probably do not have current alcohol use disorders.

However, the target population becomes muddier when we talk about other substance use. There is no such thing as a drug use pyramid, to my knowledge, nor are there published “low risk limits” for drug use. And which drugs are we talking about, anyway? So who is The Who when we’re researching SBIRT for drug users?

The How

How the outcome is measured is another big consideration. What does it even mean to say “SBIRT was effective”? What if the person smokes one fewer joint per week? Do we care about such a small improvement? How about three fewer joints? How do you determine the amount that is important?

What if they started going to AA or NA? How about if they entered traditional substance abuse treatment, even though they are using the same amount as they were before? It seems like this should count as a success by some measure, although most efficacy and effectiveness studies focus on amount and frequency of use, since entering treatment is 1) a relatively rare occurrence and 2) based on many factors which may or may not have anything to do with actual treatment need.

Continuing SBIRT Research

Finally, I have to say that I was rather interested in these JAMA studies in part because they are a rare example of “negative findings” being published. The term “negative” means that the findings were not statistically significant: in this example, that the patients receiving SBIRT did not do better (or worse) than those not receiving SBIRT based on the outcomes being measured. And you hardly ever see studies with non-significant results published in journals. (No kidding.) The issue, in part, is that negative findings are often seen as attributable to a weakness in the study, rather than as an indicator that maybe the treatment under study actually is not effective. These recent studies were published because they were very well-designed, but they, like all research, had limitations.

It turns out there is no such thing as the perfect outcome study. Even the most well-funded, well-designed study done by the most well-regarded researchers has limitations. Good journals won’t publish your study if you don’t mention them. Limitations should not be read as reasons to disregard the research. They should be read as just what they are: indicators of the limits to validity or generalizability, and opportunities for further studies.

I hope this sheds some light on what a pain it is to research SBIRT and why we are all indebted to those who publish individual studies and meta-analyses of that research over time. Treating substance use as a public health issue is an important conceptual shift; actually figuring out how to do that is a complex process.

Recommended Resources

Dr. Lindsay recently led a webinar called “Anything Worth Doing is Worth Measuring: How to Evaluate SBIRT Programs.” Video and slides from the webinar are available for on-demand viewing in our Webinar Library.

Dr. Dawn Lindsay is the Director of Evaluation Services at IRETA. In this capacity, she is the evaluator of the National SBIRT ATTC and oversees other research and evaluation activities at IRETA. Prior to joining IRETA, she conducted research on adolescent substance use disorders in the Department of Psychiatry at the University of Pittsburgh. She is a member of the American Psychological Association and American Evaluation Association.
