Stratified Psychiatry

Stratified Psychiatry: Tomorrow’s Precision Psychiatry?

The development of a biomarker-based ‘precision psychiatry’, or personalized medicine in psychiatry, is considered a crucial step in moving away from the current one-size-fits-all psychiatry, with its mediocre clinical outcomes. However, methodological as well as ethical issues stand in the way of realising such a paradigm change in a timely manner. ‘Stratified psychiatry’, as an intermediate step before precision psychiatry, can expedite this paradigm change by using biomarkers to stratify patients between known-effective, established treatments. Below, this is explained in more detail using examples of stratified psychiatry in depression and ADHD.

Background: Personalized Medicine

Current psychiatry is transitioning from a one-size-fits-all, DSM-IV/5 or ICD-10 dominated psychiatry to a future of precision or personalized psychiatry, informed by Research Domain Criteria (RDoC) or transdiagnostic Hierarchical Taxonomy of Psychopathology (HiTOP) approaches. The RDoC framework was presented almost a decade ago and has become a requirement for obtaining NIMH funding. This has led to an explosion of studies, although no applications of RDoC have been implemented in clinical practice yet. Instead, the way clinicians diagnose and handle psychiatric syndromes and disorders has not changed for decades. Medical decisions are still based on average efficacy and side-effect rates, personal experience and error-prone anamnestic information rather than objective biomarkers and frameworks. Here, we argue that a ‘stratified psychiatry’ approach, which uses biomarkers to stratify patients to existing, on-label, established treatments, is a realistic conceptualization of precision psychiatry. In this approach, biomarkers are employed to identify specific inter-individual properties that make an individual patient more likely to respond to one treatment than to another. We propose that stratified psychiatry is a pivotal, indispensable step towards precision psychiatry. Figure 1 below visualises the distinction between current ‘one-size-fits-all’ psychiatry, stratified psychiatry and precision psychiatry.


Figure 1: An infographic contrasting the ‘diagnostic based one-size-fits-all psychiatry’ that is currently in use (left) with more ‘prognostic’ models such as Stratified Psychiatry (right-top) and Precision Psychiatry (right-bottom). In an ‘ideal operationalization,’ Precision Psychiatry can be conceived as a way of treating mental disorders based on replicable and objective markers (ranging from imaging to genetics and transcriptomics) that form an individual profile for every patient. The treatment interventions are tailored to this profile, thus addressing unique properties of individual patients and maximizing clinical response. Alternatively, Stratified Psychiatry is a way of subgrouping patients with similar biomarker profiles to enhance the probability of clinical response to known and established treatments within a given disorder.

In current clinical practice we often rely on a so-called ‘stepped care’ model, where the selection of treatments is mostly informed by initial efficacy and side effects. In depression, psychotherapy and antidepressants such as SSRIs and SNRIs are considered first-choice treatments, whereas Electroconvulsive Therapy (ECT) and Deep-Brain Stimulation (DBS) are at the other end of the spectrum, mainly because of their side-effect profiles and invasive nature. However, as visualized in figure 2 below, various large studies and meta-analyses have demonstrated, on the group level, a lack of superiority within treatment classes (e.g. different types of psychotherapy, different rTMS protocols and different drug classes of antidepressants), as well as a lack of superiority across treatment classes (e.g. psychotherapy vs. antidepressants (Cuijpers et al., 2021)). If there are no differences between these treatments on the group level, then assigning patients to one of them by flipping a coin cannot be considered unethical or harmful. The question therefore arises: why don’t we use biomarkers to ‘stratify’ patients to one of these treatments, with the aim of enhancing treatment response within sub-groups?

Several years ago, biomarkers that predicted response tended to be generic predictors of non-response, i.e. the more abnormal the biomarker, the worse someone’s response to treatment. Recently, however, data from sufficiently powered large-scale studies such as the International Study to Predict Optimized Treatment in Depression (iSPOT-D) have painted a different picture. iSPOT-D recruited 1008 MDD patients who were randomized to escitalopram, sertraline or venlafaxine; among other biomarkers, electrophysiological features (EEG and ERPs) were assessed before and after 8 weeks of treatment. Several studies from this trial have demonstrated that biomarkers can be 1) sex-specific, with effects often cancelling out at the combined group level when males and females are not analysed separately (Arns et al., 2018, 2015; Dinteren et al., 2015), 2) drug-class specific (e.g. SSRI vs. SNRI (Arns et al., 2015)) and 3) drug-specific (Arns et al., 2016a), where clinical response can even be differentially predicted for two drugs from the same drug class. This new reality has further opened up possibilities for stratified psychiatry as an interim step towards a future of precision psychiatry. In the following we illustrate the concept of stratified psychiatry further, using recent examples of electroencephalography (EEG) biomarkers in the treatment of depression (MDD) and ADHD. Note, however, that other biomarkers such as pharmacogenetics (for review see: Schaik et al., 2020) and MRI (Cohen et al., 2021), and especially the integration of biomarkers across domains and treatments, are exciting avenues to further advance the notion of stratified psychiatry.


Figure 2: Group-level response and remission rates for antidepressant treatments (psychotherapy, rTMS, antidepressants and ketamine) derived from the largest non-industry sponsored effectiveness trials or meta-analyses, demonstrating indistinguishable response and remission rates between pharmacological and non-pharmacological treatments (Cuijpers et al., 2021), as well as a lack of significant differences within modalities (n.s.; non-significant), as demonstrated by Cuijpers et al. (2014) for various psychotherapies, by iSPOT-D for three different and widely prescribed antidepressants (Saveanu et al., 2015) and by Blumberger et al. (2018) for two different forms of rTMS.

Stratified Psychiatry in Depression (MDD)

In depression, various EEG biomarkers have been described (reviewed in more detail elsewhere: Olbrich et al., 2015). Without providing an exhaustive review, we focus here on one line of research to illustrate the treatment stratification concept, and also address some of the concerns raised by Widge and colleagues.

In 2001, Bruder and colleagues reported right-hemispheric alpha dominance for female responders to an SSRI (Bruder et al., 2001), which was replicated and confirmed in the iSPOT-D study. A right-frontal dominance of alpha (Frontal Alpha Asymmetry, FAA) was associated with remission to the SSRIs escitalopram and sertraline, in female depressed patients only (Arns et al., 2015). Interestingly, no differences between remitters and non-remitters were found for males or for venlafaxine (SNRI). A retrospective assignment of female patients with an FAA <0.0 to venlafaxine and of patients with an FAA >0.0 to an SSRI resulted in remission rates of 53% and 60% respectively, which can be considered a clinically meaningful improvement relative to the overall remission rate of 46% after randomization. This corresponds to a Number Needed to Treat (NNT) of approximately 7, an often-underrated measure for comparing treatment decisions in psychiatry (Pinson and Gray, 2003), which is close to the approximated NNT of 5 for remission after psychotherapy in MDD (Pinson and Gray, 2003). In other words, nearly as many patients could benefit from an EEG-based stratified treatment decision, compared to unstratified treatment, as would benefit from psychotherapy compared to clinical management only. In addition to this sex- and drug-class specific EEG biomarker, two further drug-specific EEG biomarkers were reported, namely paroxysmal EEG activity and individual alpha peak frequency (iAPF). The presence of paroxysmal activity (epileptiform discharges) was associated with a 3.2-fold lower likelihood of response to escitalopram and venlafaxine, and with an opposite, non-significant trend for sertraline (Arns et al., 2016a). In addition, a slow iAPF was found in responders to sertraline, with no differences for escitalopram and venlafaxine (Arns et al., 2016a).
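
The retrospective assignment rule and the NNT arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the handling of an FAA of exactly 0.0 is not specified in the text and is left undecided here, and the NNT is computed in the usual way as the reciprocal of the absolute difference in remission rates. On that definition, the NNT of about 7 matches the 60% vs. 46% comparison; the 53% vs. 46% comparison taken on its own would give roughly 14.

```python
from typing import Optional


def assign_by_faa(sex: str, faa: float) -> Optional[str]:
    """Retrospective rule from the text: in females, FAA < 0 -> venlafaxine, FAA > 0 -> an SSRI.

    Returns None where the text reports no FAA effect (males) or states no rule
    (FAA exactly 0.0, an edge case this sketch deliberately leaves undecided).
    """
    if sex != "female" or faa == 0.0:
        return None
    if faa < 0.0:
        return "SNRI (venlafaxine)"
    return "SSRI (escitalopram or sertraline)"


def number_needed_to_treat(rate_new: float, rate_reference: float) -> float:
    """NNT = 1 / absolute difference in remission rate (rates given as proportions)."""
    benefit = rate_new - rate_reference
    if benefit <= 0:
        raise ValueError("NNT is only defined here for a positive absolute benefit")
    return 1.0 / benefit


baseline = 0.46  # overall remission rate after randomization (from the text)
nnt_ssri_arm = number_needed_to_treat(0.60, baseline)  # FAA > 0.0 assigned to an SSRI
nnt_snri_arm = number_needed_to_treat(0.53, baseline)  # FAA < 0.0 assigned to venlafaxine

print(f"NNT at 60% vs. 46%: {nnt_ssri_arm:.1f}")
print(f"NNT at 53% vs. 46%: {nnt_snri_arm:.1f}")
print(assign_by_faa("female", -0.12))
```
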
As explained above, if prescribing one of the three iSPOT-D antidepressants by flipping a coin does not yield better or worse outcomes (also see figure 2 above), then exploiting the observed differential response profiles for these EEG biomarkers could inform the choice of medication. We thus set out to conduct the first prospective trial in which these three biomarkers were used to assign patients to one of three antidepressants. Although this was mainly a feasibility study, a significantly better clinical outcome was found for the EEG-informed group relative to the treatment-as-usual group (Vinne et al., 2021), thereby also providing the first prospective validation of the use of EEG for treatment stratification. Furthermore, to address the concerns raised by Widge et al. (2019), Ip, Olbrich and colleagues (2021) recently performed an independent replication study on a range of EEG biomarkers, in which the FAA findings were replicated. Together, these studies demonstrate carefully conducted out-of-sample validation as well as prospective utility.

Stratified Psychiatry in ADHD

Various EEG biomarkers have been described in the ADHD literature, with the Theta/Beta ratio as the most frequently cited diagnostic biomarker, touted to have received FDA clearance (though see a critical appraisal in Arns et al., 2016b). Nonetheless, this metric has not held up well in meta-analyses (Arns et al., 2013; Saad et al., 2018), which calls its diagnostic use into question. Various studies, however, have investigated the utility of the EEG to predict treatment response. Using a qualitative EEG approach, we reported in 2008 that especially male children and adolescents with ADHD and a slow iAPF responded most poorly to treatment with methylphenidate (MPH) (Arns et al., 2008). Some time later, in the large multicenter iSPOT-A study, which enrolled 336 children and adolescents with ADHD who were all treated with MPH, this initial finding was replicated and refined: slow iAPF was most specifically associated with non-response in male adolescents, with large effect sizes (Arns et al., 2018). However, only recently did the potential clinical relevance of this biomarker become evident, when Krepel and colleagues reported that children and adolescents with ADHD and a slow iAPF actually responded significantly better to neurofeedback than those with a faster iAPF (Krepel et al., 2020). This also implicated the iAPF as a first trans-diagnostic EEG biomarker, with clinical implications for both depression and ADHD.
Inspired by these findings, and noting that most studies had used different methods to calculate iAPF, we set out to optimize the algorithm behind this metric and validate it against a ground-truth scenario of brain maturation in a sample of 4,249 patients. The most biologically plausible permutation was then prospectively validated on MPH (iSPOT-A, N=336) and neurofeedback (Krepel et al., N=136) treatments (Voetterl et al., 2021). Results confirmed that a low iAPF was associated with lower remission rates after MPH but higher remission rates after neurofeedback, with stratification simulations demonstrating 20-27% increased remission rates if patients were stratified to treatment based on iAPF alone. Two subsequent blinded out-of-sample validations, in 1) an MPH trial from Loo and colleagues (Loo et al., 2016) and 2) a neurofeedback trial from the ICAN group (Group et al., 2020), confirmed the predictions and demonstrated meaningful gains in remission when treatment was selected using this iAPF-based Brainmarker-1 (Voetterl et al., 2021).
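
To make the logic of such a stratification simulation concrete, the sketch below compares the expected remission rate under coin-flip allocation with that under biomarker-based allocation. All numbers are hypothetical and chosen purely for illustration; they are not taken from iSPOT-A, Krepel et al. or Voetterl et al. They only follow the direction described above (low iAPF: worse with MPH, better with neurofeedback) and are picked so that both treatments have the same remission rate at the group level.

```python
from typing import Dict, Tuple


def expected_remission(
    share_low: float,
    remission: Dict[Tuple[str, str], float],
    policy: Dict[str, Dict[str, float]],
) -> float:
    """Expected overall remission rate for a given allocation policy.

    remission[(subgroup, treatment)] is the remission probability in that cell;
    policy[subgroup][treatment] is the probability of receiving that treatment.
    """
    shares = {"low_iapf": share_low, "high_iapf": 1.0 - share_low}
    total = 0.0
    for subgroup, share in shares.items():
        for treatment, p_assign in policy[subgroup].items():
            total += share * p_assign * remission[(subgroup, treatment)]
    return total


# HYPOTHETICAL illustrative values, not study results.
remission = {
    ("low_iapf", "MPH"): 0.20,
    ("low_iapf", "neurofeedback"): 0.50,
    ("high_iapf", "MPH"): 0.50,
    ("high_iapf", "neurofeedback"): 0.30,
}
share_low = 0.4  # hypothetical proportion of patients with a low iAPF

coin_flip = {g: {"MPH": 0.5, "neurofeedback": 0.5} for g in ("low_iapf", "high_iapf")}
stratified = {"low_iapf": {"neurofeedback": 1.0}, "high_iapf": {"MPH": 1.0}}

print(f"coin flip:  {expected_remission(share_low, remission, coin_flip):.2f}")   # 0.38
print(f"stratified: {expected_remission(share_low, remission, stratified):.2f}")  # 0.50
```

With these made-up values each treatment alone yields 38% remission, so neither looks superior at the group level, yet allocating by biomarker raises the expected rate to 50%.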

‘Let’s make a deal’, the Monty Hall problem: How partial information can be clinically meaningful.


One notion from the above examples that is often hard to comprehend is that both the presence and the absence of a significant effect for one treatment vs. another can be clinically meaningful. For example, a left-dominant FAA in females was significantly associated with a lower likelihood of remission to an SSRI, whereas no effect of FAA was found for the SNRI venlafaxine. Assuming all groups are sufficiently statistically powered, such a lack of a difference is actually a blessing in disguise, clinically speaking: if a female patient presented with a left-dominant FAA, would you prescribe that patient an SSRI, or would you rather consider the SNRI venlafaxine? Following this logic, the original simulation in MDD as well as the prospective replication yielded a 28-70% relative increase in remission rates (Arns et al., 2015; Vinne et al., 2021).
When triggering ‘life-and-death’ decisions, high sensitivity and specificity are crucial, since the prediction of ‘life’ should really outweigh the ‘risk of death’. In stratified psychiatry, however, the goal is to assign someone to one of many established treatments that have similar response and remission rates on the group level. Thus, even when a wrong prediction is made, no harm is done relative to the ‘one-size-fits-all’ practice. Knowing with reasonable certainty that someone will not respond to a given treatment therefore increases that person’s chances with ‘anything’ except this treatment. Although this does not change the probability of response to the other treatments, the ability to avoid a likely non-response increases the overall chance of choosing an effective treatment within a finite set of treatment options. This again highlights the need for markers for different treatments, and for the corresponding probabilities of response and remission.
This can be further explained by the Monty Hall problem, derived from the game show ‘Let’s make a deal’. In the Monty Hall problem, the contestant has to pick one out of three doors, with the initial probability of picking the Ferrari (i.e. achieving remission) being 33% (representing current one-size-fits-all practice, and actually quite close to true remission rates in MDD treatments as visualized in figure 2). The game show host, who knows where the Ferrari is, then opens one of the other doors to reveal a goat, and asks whether the contestant wishes to change doors. Counterintuitively, the only correct choice is to switch doors, since with the new information provided by the host, the probability of winning rises to 67% after switching (e.g. see: Saenen et al., 2015). Similarly, in stratified psychiatry we leverage any available information to increase remission rates. In our case, stratification biomarkers are the ‘new information’ about one specific treatment brought into the equation, whereby, in the case of three options, the probability of achieving remission increases substantially, to 2/3 relative to the initial 1/3. Although it is not clear whether there actually is an effective treatment for everyone, this story teaches us that it is a good choice to take any available information into account.
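
The 1/3 and 2/3 probabilities of the game itself can be checked with a short Python sketch, first by exact enumeration and then by simulation. It assumes the standard rules (three doors, one prize, and a host who knows where the prize is and always opens a goat door other than the contestant's pick); the function names are illustrative only. It verifies the game-show arithmetic, not the clinical analogy.

```python
import random
from fractions import Fraction
from itertools import product


def exact_win_probability(switch: bool) -> Fraction:
    """Enumerate all nine equally likely (prize door, first pick) combinations."""
    cases = list(product(range(3), repeat=2))
    wins = 0
    for prize, pick in cases:
        # The host always reveals a goat, so switching wins exactly when the
        # first pick was wrong, and staying wins exactly when it was right.
        wins += (pick != prize) if switch else (pick == prize)
    return Fraction(wins, len(cases))


def simulate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the win rate for the stay or switch strategy."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        # Host opens a door that is neither the contestant's pick nor the prize.
        opened = rng.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == prize
    return wins / trials


print("stay:  ", exact_win_probability(switch=False))  # 1/3
print("switch:", exact_win_probability(switch=True))   # 2/3
print("simulated switch win rate:", simulate(switch=True))
```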


Taken together, stratified psychiatry holds the possibility of increasing response and remission rates in psychiatry without the need to develop new treatments, simply by assigning the right, already approved treatment to a given patient, using a biomarker as an addition to the clinician’s toolbox. Taking both the biomarker outcome and other individual variables (e.g. symptom profile or sleep hygiene) into account would allow psychiatry to make even more informed and personalized decisions. Moreover, this would allow psychiatry to overcome the trial-and-error approach of starting a treatment, monitoring response and switching after several weeks if needed (i.e. taking the guesswork out of stepped care). However, although these prospects are promising and by no means restricted to electrophysiological research, several key aspects have to be addressed to make the paradigm shift a reality. Above all, a more standardized assessment of psychiatric symptoms and side effects is needed to compare predictive power across treatment modalities, between studies and across different kinds of biomarkers (electrophysiological, imaging, genetic, etc.). This call for standardization is relevant not only for the psychometric dimensions, but also for the markers themselves and the conditions under which they are assessed. Many biomarkers suffer from a lack of replication, and this can only be tackled when studies rely on standardized operating procedures for the assessment hardware used (e.g. amplifiers), preprocessing steps, algorithms for marker calculation, assessment environment, etc. This also calls for more collaboration and data sharing. One such open-access database, consisting of >1200 EEGs, clinical descriptors and treatment outcome data (TD-BRAIN, van Dijk et al., 2022), is available online at our website.
Another useful model we would like to propose is ‘blinded data sharing’, where researchers who have identified biomarkers prospectively and blindly apply them to biomarker data from other research groups, without knowing the patients’ response status. The predictions are then shared back with the original researchers, who can confirm or reject the predictive power. Such ‘blinded data sharing’ has the advantage that researchers retain ‘control’ over what is done with their data, while it also acts as a true blinded out-of-sample validation, perhaps constituting the biomarker equivalent of what a double-blind placebo-controlled trial is for the evaluation of clinical treatments.
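
The exchange described above could look roughly as follows in code. This is a minimal sketch, assuming a single-threshold biomarker rule fixed in advance; the patient IDs, biomarker values, threshold and function names are all hypothetical. The point it illustrates is the separation of roles: the biomarker group only ever sees outcome-free data, and the data-owning group alone compares the returned predictions with the withheld outcomes.

```python
from typing import Dict


def predict_responders(biomarker_values: Dict[str, float], threshold: float) -> Dict[str, bool]:
    """Biomarker group: apply a frozen rule to outcome-free data (patient ID -> value)."""
    return {pid: value >= threshold for pid, value in biomarker_values.items()}


def score_predictions(predictions: Dict[str, bool], outcomes: Dict[str, bool]) -> Dict[str, float]:
    """Data-owning group: compare returned predictions with the withheld outcomes."""
    if set(predictions) != set(outcomes):
        raise ValueError("prediction and outcome patient IDs must match")
    counts: Dict[str, float] = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for pid, predicted in predictions.items():
        correct = predicted == outcomes[pid]
        key = ("t" if correct else "f") + ("p" if predicted else "n")
        counts[key] += 1
    counts["accuracy"] = (counts["tp"] + counts["tn"]) / len(predictions)
    return counts


# HYPOTHETICAL toy data. Only `shared_values` leaves the data-owning group.
shared_values = {"p01": 9.8, "p02": 8.1, "p03": 10.4, "p04": 7.9, "p05": 9.1}
withheld_outcomes = {"p01": True, "p02": False, "p03": True, "p04": False, "p05": False}

predictions = predict_responders(shared_values, threshold=9.0)  # threshold fixed beforehand
result = score_predictions(predictions, withheld_outcomes)
print(result)
```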

Further reading

Wu, A., Zhang, Y., Jiang, J., Luca, M.V., Fonzo, G.A., Rolle, C.E., Cooper, C., Chin-Fatt, C., Krepel, N., Cornelssen, C.A., Wright, R., Toll, R.T., Trivedi, H.M., Monuszko, K., Caudle, T.L., Sarhadi, K., Jha, M.K., Trombello, J.M., Deckersbach, T., Adams, P., McGrath, P.J., Weissman, M.M., Fava, M., Pizzagalli, D.A., Arns, M., Trivedi, M.H., Etkin, A. An electroencephalographic signature predicts antidepressant response in major depression. Nature Biotechnology.

van der Vinne, N., Vollebregt, M., van Putten, M.J.A.M., Arns, M. (2019). Stability of frontal alpha asymmetry in depressed patients during antidepressant treatment. NeuroImage: Clinical, 24, 102056.

van der Vinne, N., Vollebregt, M.A., Boutros, N.N., Fallahpour, K., van Putten, M.J.A.M., Arns, M. (2019) Normalization of EEG in depression after antidepressant treatment with sertraline? A preliminary report. Journal Of Affective Disorders, 259 (2019) 67-72 doi: 10.1016/j.jad.2019.08.016

Krepel, N., Rush, A. J., Iseger, T. A., Sack, A. T., & Arns, M. (2019). Can psychological features predict antidepressant response to rtms? A discovery-replication approach. Psychological Medicine. doi:

Benschop, L., Baeken, C., Vanderhasselt, M., Van de Steen, F., Van Heeringen, K., Arns, M. (2019). Electroencephalogram Resting State Frequency Power Characteristics of Suicidal Behavior in Female Patients With Major Depressive Disorder. The Journal of Clinical Psychiatry, 80(6).

Arns, M., Vollebregt, M. A., Palmer, D., Spooner, C., Gordon, E., Kohn, M., . . . Buitelaar, J. K. (2018). Electroencephalographic biomarkers as predictors of methylphenidate response in attention-deficit/hyperactivity disorder. European Neuropsychopharmacology. doi:

van der Vinne, N., Vollebregt, M. A., van Putten, M. J., & Arns, M. (2017). Frontal alpha asymmetry as a diagnostic marker in depression: Fact or fiction? A meta-analysis. NeuroImage: Clinical.

Iseger, T. A., Korgaonkar, M. S., Kenemans, J. L., Grieve, S. M., Baeken, C., Fitzgerald, P. B., & Arns, M. (2017). EEG connectivity between the subgenual anterior cingulate and prefrontal cortices in response to antidepressant medication. European Neuropsychopharmacology; doi:10.1016/j.euroneuro.2017.02.002

Arns, M., Gordon, E., & Boutros, N. N. (2015). EEG abnormalities are associated with poorer depressive symptom outcomes with escitalopram and venlafaxine-xr, but not sertraline: Results from the multicenter randomized iSPOT-D study. Clinical EEG and Neuroscience. doi:10.1177/1550059415621435

Olbrich, S., Tränkner, A., Surova, G., Gevirtz, R., Gordon, E., Hegerl, U., & Arns, M. (2016). CNS- and ANS-arousal predict response to antidepressant medication: Findings from the randomized iSPOT-D study. Journal of Psychiatric Research, 73, 108-115. doi:10.1016/j.jpsychires.2015.12.001

Arns, M., Loo, S. K., Sterman, M. B., Heinrich, H., Kuntsi, J., Asherson, P., . . . Brandeis, D. (2016). Editorial perspective: How should child psychologists and psychiatrists interpret FDA device approval? Caveat emptor. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 57(5), 656-8. doi:10.1111/jcpp.12524

Olbrich, S., van Dinteren, R., & Arns, M. (2016). Personalized medicine: Review and perspectives of promising baseline EEG biomarkers in major depressive disorder and attention deficit hyperactivity disorder. Neuropsychobiology, 72(3-4), 229-240. doi:10.1159/000437435

Arns, M., Etkin, A., Hegerl, U., Williams, L.M., DeBattista, C., Palmer, D.M., Fitzgerald, P.B., Harris, A., deBeuss, R. & Gordon, E. (In Press) Frontal and rostral anterior cingulate (rACC) theta EEG in depression: Implications for treatment outcome? European Neuropsychopharmacology.

van Dinteren, R., Arns, M., Kenemans, L., Jongsma, M. L., Kessels, R. P., Fitzgerald, P., . . . Williams, L. M. (2015). Utility of event-related potentials in predicting antidepressant treatment response: An iSPOT-D report. European Neuropsychopharmacology. doi:10.1016/j.euroneuro.2015.07.022

Arns, M., Bruder, G., Hegerl, U., Spooner, C., Palmer, D., Etkin, A., Fallahpour, K., Gatt, J., Hirshberg, L., Gordon, E. (2015). EEG alpha asymmetry as a gender-specific predictor of outcome to acute treatment with different antidepressant medications in the randomized iSPOT-D study. Clinical Neurophysiology 127(1), 509-19.