There has been a recent increase in research evaluating treatment-based subgroups of non-specific low back pain. The aim of these sub-classification schemes is to identify subgroups of patients who will respond preferentially to one treatment as opposed to another. Our article provides accessible guidance on how to interpret this research and determine its implications for clinical practice. We propose that studies evaluating treatment-based subgroups can be interpreted in the context of a three-stage process: (1) hypothesis generation: proposal of clinical features to define subgroups; (2) hypothesis testing: a randomised controlled trial (RCT) to test whether subgroup membership modifies the effect of a treatment; and (3) replication: another RCT to confirm the results of stage 2 and ensure that the findings hold beyond the specific original conditions. At this point, the bulk of research evidence on defining subgroups of patients with low back pain is in the hypothesis generation stage; no classification system is supported by sufficient evidence to recommend its implementation in clinical practice.