Monarch - SLE Molecular Decision Support Console

Molecular fingerprint

Z-score calculation (NIST, n.d.)

\(z = \frac{\text{patient value} - \text{training mean}}{\text{training SD}}\)

Strong low: \((-\infty, -1.2]\); Low: \((-1.2, -0.45]\); Reference: \((-0.45, 0.45)\); High: \([0.45, 1.2)\); Strong high: \([1.2, \infty)\)
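As a minimal sketch of how the formula and band edges above fit together, the following Python maps a raw value to a z-score and then to a fingerprint band. The function names are illustrative and are not taken from the app's code; only the formula and the interval edges come from the text above.

```python
def z_score(patient_value, training_mean, training_sd):
    """Standardise a patient value against training-cohort statistics."""
    if training_sd <= 0:
        raise ValueError("training SD must be positive")
    return (patient_value - training_mean) / training_sd


def fingerprint_band(z):
    """Map a z-score to a band, using the interval edges listed above."""
    if z <= -1.2:
        return "Strong low"    # (-inf, -1.2]
    if z <= -0.45:
        return "Low"           # (-1.2, -0.45]
    if z < 0.45:
        return "Reference"     # (-0.45, 0.45)
    if z < 1.2:
        return "High"          # [0.45, 1.2)
    return "Strong high"       # [1.2, inf)


# Example: a value one training SD above the training mean falls in "High".
z = z_score(7.0, 5.0, 2.0)
band = fingerprint_band(z)
```

Note that the edges are closed on the side furthest from zero, so a z-score of exactly 0.45 is read as High and exactly -0.45 as Low.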

Modelling workflow

01 Data collection

Cohorts were chosen per stage so that each of the three models matched a distinct clinical question: diagnosis, next-visit activity, and treatment response.

Diagnosis: GSE72509, retrieved from NCBI GEO (Hung et al., 2015); whole-blood RNA-seq with 99 SLE and 18 control samples.
Progression: GSE65391, retrieved from NCBI GEO, for training (Banchereau et al., 2016); GSE49454, retrieved from NCBI GEO, for external validation (Chiche et al., 2014). Activity was defined as SLEDAI ≥ 6.
Treatment: GSE224705, retrieved from NCBI GEO (NCBI Gene Expression Omnibus, 2023); first-visit SLE samples with response derived from SRI-4 labels after removing healthy controls.

02 Pre-processing and EDA

Expression and metadata were cleaned before modelling. Diagnosis RPKM values were transformed as log2(RPKM + 1); treatment microarray values were filtered for annotated, non-control probes and collapsed to one probe per gene using NCBI Gene, HGNC, and Ensembl annotations (Bedre, 2023; Brown et al., 2015; Seal et al., 2023; Harrison et al., 2024).
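The diagnosis transform named above, log2(RPKM + 1), can be sketched in a few lines of Python. The function name and the rejection of negative inputs are assumptions made for illustration; the app's own pre-processing code is not shown here.

```python
import math


def log2_rpkm(values):
    """Apply log2(RPKM + 1) to a list of non-negative RPKM values."""
    transformed = []
    for value in values:
        if value < 0:
            # Assumption: RPKM is a non-negative abundance measure.
            raise ValueError("RPKM values must be non-negative")
        transformed.append(math.log2(value + 1.0))
    return transformed


# The +1 offset keeps zero-expression genes at 0 instead of -infinity.
example = log2_rpkm([0.0, 1.0, 3.0, 7.0])
```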

Checked sample-level expression distributions and outcome balance.
Derived longitudinal t-to-t+1 labels for progression.
Standardised or imputed model inputs inside training workflows where required (NIST, n.d.).
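One way to read the t-to-t+1 labelling is sketched below: each visit is paired with the same patient's next visit, and the label records whether that next visit meets the SLEDAI ≥ 6 activity definition. The tuple layout, the function name, and the choice to drop each patient's final visit (which has no follow-up) are assumptions for illustration, not a description of the team's code.

```python
from collections import defaultdict

ACTIVE_SLEDAI = 6  # activity threshold stated in the data-collection step


def next_visit_labels(visits):
    """Build (patient_id, visit_index, label) rows from longitudinal visits.

    visits: iterable of (patient_id, visit_index, sledai) tuples (hypothetical layout).
    label is 1 when the patient's NEXT visit has SLEDAI >= ACTIVE_SLEDAI, else 0.
    """
    by_patient = defaultdict(list)
    for patient_id, visit_index, sledai in visits:
        by_patient[patient_id].append((visit_index, sledai))

    rows = []
    for patient_id, series in by_patient.items():
        series.sort()  # order visits in time within each patient
        # Pair visit t with visit t+1; the last visit has no successor and is skipped.
        for (t, _), (_, next_sledai) in zip(series, series[1:]):
            rows.append((patient_id, t, int(next_sledai >= ACTIVE_SLEDAI)))
    return rows


visits = [("A", 1, 2), ("A", 2, 8), ("A", 3, 4), ("B", 1, 10)]
labels = next_visit_labels(visits)
```

In this toy input, patient B contributes no rows because a single visit has no t+1 outcome.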

03 Feature selection

Feature selection was performed before final modelling to reduce high-dimensional expression matrices to interpretable panels and to avoid leakage from external or held-out data.

Diagnosis and treatment features were pre-filtered to highly variable, expressed genes; the final panels then combined RF-Gini ranking, PCA checks, Boruta, and biological curation (Breiman, 2001; Kursa & Rudnicki, 2010; Liaw & Wiener, 2002).
For progression, expression probes were selected from the training cohort only, direct leakage variables were removed, and immune, clinical, treatment, temporal, engineered, and gene features were then combined.
Final progression features were ranked with random-forest importance on training data only.
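The leakage point can be illustrated with a small sketch: the ranking statistic is computed on training rows only, and the resulting feature list is then applied unchanged to held-out rows. Plain variance ranking is used here as a simple stand-in for the highly-variable-gene pre-filter; it does not reproduce the RF-Gini, Boruta, or curation steps, and all names are hypothetical.

```python
from statistics import pvariance


def top_variable_features(train_rows, feature_names, k):
    """Rank features by variance on TRAINING rows only and keep the top k."""
    scored = []
    for j, name in enumerate(feature_names):
        column = [row[j] for row in train_rows]
        scored.append((pvariance(column), name))
    # Highest variance first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def project(rows, feature_names, selected):
    """Keep only the selected columns, in the selected order."""
    indices = [feature_names.index(name) for name in selected]
    return [[row[j] for j in indices] for row in rows]


feature_names = ["G1", "G2", "G3"]
train_rows = [[1, 5, 0], [1, 9, 1], [1, 1, 2]]
held_out_rows = [[7, 8, 9]]

selected = top_variable_features(train_rows, feature_names, k=2)
# Held-out data is only projected; it never influences which features are kept.
held_out_panel = project(held_out_rows, feature_names, selected)
```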

04 Modelling

Candidate models were compared within each stage rather than forcing one algorithm across all tasks. In the deployed app, the frontend sends the uploaded CSV payload to the backend, which reads the selected R model objects and runs inference; probability displays are summarised in Predicted probability bands.

Diagnosis compared limma signature, random forest, and LASSO. Progression compared elastic net, random forest, and GBM. Treatment response compared limma signature, random forest, LASSO, elastic net, and linear SVM. See Validation metrics below.

Limma signature models use linear modelling with empirical-Bayes variance moderation for expression data (Smyth, 2004; Ritchie et al., 2015).
Random forests aggregate many decision trees trained on bootstrap samples and random feature subsets; the R implementation cited here (randomForest) supports both classification and regression (Breiman, 2001; Liaw & Wiener, 2002).
LASSO and elastic net are penalised regression models that shrink coefficients for feature selection, with elastic net combining L1 and L2 penalties (Tibshirani, 1996; Zou & Hastie, 2005; Friedman et al., 2010).
GBM uses gradient boosting, in which trees are added sequentially, each fitted to reduce the remaining loss; linear SVM is a maximum-margin classifier with a linear decision boundary (Friedman, 2001; Cortes & Vapnik, 1995).
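For reference, the combined L1 and L2 penalty described above can be written in the parametrisation of Friedman et al. (2010). Here \(\ell\) is the per-sample loss, \(\lambda \ge 0\) sets the overall penalty strength, and \(\alpha \in [0, 1]\) mixes the two penalties, so \(\alpha = 1\) gives LASSO and \(0 < \alpha < 1\) gives elastic net. The values of \(\lambda\) and \(\alpha\) used for the models in this app are not stated here.

\[\min_{\beta_0,\, \beta} \; \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(y_i,\; \beta_0 + x_i^{\top}\beta\bigr) + \lambda \left[ \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 + \alpha \lVert \beta \rVert_1 \right]\]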

05 Performance evaluation

Evaluation prioritised metrics that are more informative than raw accuracy for imbalanced clinical cohorts. Diagnosis and treatment used stratified 5-fold cross-validation; progression used patient-level cross-validation for tuning and one independent external test on GSE49454 (Fawcett, 2006; Brodersen et al., 2010). See Validation metrics and Metric notes below.
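Patient-level cross-validation means that all samples from one patient stay in the same fold, so repeated visits cannot appear on both sides of a split. The sketch below shows that idea only. The round-robin fold assignment is an assumption chosen for simplicity (no shuffling or outcome stratification), and the function name is hypothetical.

```python
def patient_level_folds(patient_ids, n_folds=5):
    """Return (train_indices, test_indices) pairs with patients kept intact.

    patient_ids: one patient identifier per sample, in sample order.
    """
    unique_patients = sorted(set(patient_ids))
    # Assumption: assign patients to folds in round-robin order.
    fold_of = {patient: i % n_folds for i, patient in enumerate(unique_patients)}

    folds = []
    for fold in range(n_folds):
        test = [i for i, p in enumerate(patient_ids) if fold_of[p] == fold]
        train = [i for i, p in enumerate(patient_ids) if fold_of[p] != fold]
        folds.append((train, test))
    return folds


# Six samples from three patients; patient C has three visits.
patient_ids = ["A", "A", "B", "C", "C", "C"]
folds = patient_level_folds(patient_ids, n_folds=3)
```

A sample-level split of the same data could place one of patient C's visits in training and another in testing, which is the leakage this scheme avoids.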

Validation metrics

Outcome	Model	Validation	AUROC	Macro F1	Balanced acc.	Accuracy	MCC

Limited: < 0.70; Adequate: 0.70-0.84; Strong: ≥ 0.85

Metric notes

AUROC summarises threshold-free ranking of positive cases above negative cases (Fawcett, 2006).
Macro F1 averages class-wise F1 scores so minority and majority classes contribute evenly, while balanced accuracy averages class-wise recall to reduce majority-class bias (Sokolova & Lapalme, 2009; Brodersen et al., 2010).
Accuracy is the overall proportion of correct predictions and can look optimistic in imbalanced cohorts; MCC is a correlation-like summary of the full confusion matrix, where +1 is perfect agreement, 0 is no better than chance, and −1 is total disagreement (Matthews, 1975; Chicco & Jurman, 2020).
The legend bands follow common discrimination heuristics: values below 0.70 are treated as limited, 0.70-0.84 as adequate, and 0.85 or higher as strong. These are visual interpretation bands, not clinical acceptance thresholds (Hosmer et al., 2013).
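The notes above can be made concrete with a small sketch that computes the threshold-based metrics from a binary confusion matrix and labels a score with the legend bands. The function names are illustrative, undefined ratios are set to 0.0 by assumption, and values between 0.84 and 0.85 are treated as Adequate, which the legend leaves unspecified.

```python
import math


def safe_div(numerator, denominator):
    """Return 0.0 when a ratio is undefined (assumption for this sketch)."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, balanced accuracy, macro F1, and MCC from confusion counts."""
    recall_pos = safe_div(tp, tp + fn)
    recall_neg = safe_div(tn, tn + fp)
    precision_pos = safe_div(tp, tp + fp)
    precision_neg = safe_div(tn, tn + fn)
    f1_pos = safe_div(2 * precision_pos * recall_pos, precision_pos + recall_pos)
    f1_neg = safe_div(2 * precision_neg * recall_neg, precision_neg + recall_neg)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "balanced_accuracy": (recall_pos + recall_neg) / 2,
        "macro_f1": (f1_pos + f1_neg) / 2,
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


def discrimination_band(score):
    """Label a score with the visual legend bands (not clinical thresholds)."""
    if score < 0.70:
        return "Limited"
    if score < 0.85:
        return "Adequate"
    return "Strong"


# A classifier that calls every sample positive in a 90/10 cohort.
metrics = binary_metrics(tp=90, fp=10, fn=0, tn=0)
```

In this example accuracy is 0.9 while balanced accuracy is 0.5 and MCC is 0, which is the optimistic-accuracy effect the notes describe for imbalanced cohorts.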

Predicted probability bands

Monarch - Developers

Manna Berry - Development of Progression Model & Assistance with Backend

mber0347@uni.sydney.edu.au Faculty of Engineering J12, The University of Sydney, NSW 2006

Lezhi Lin - Development of App Frontend & Presentation Slides

llin0935@uni.sydney.edu.au School of Mathematics and Statistics F07, The University of Sydney, NSW 2006 Australia

Udit Samant - Development of Diagnosis Model & General App Backend

usam6049@uni.sydney.edu.au School of Computer Science J12, The University of Sydney, NSW 2006 Australia

Hadi Shafat - Interdisciplinary Aspects Research & Assistance with Backend

hsha0153@uni.sydney.edu.au School of Computer Science J12, The University of Sydney, NSW 2006 Australia

Jillian Zhao - Development of Treatment Model, Assistance with Backend & Background Research

yzha0369@uni.sydney.edu.au School of Computer Science J12, The University of Sydney, NSW 2006 Australia

Acknowledgment

We acknowledge the Gadigal of the Eora Nation, the Traditional Custodians of the land on which the University of Sydney stands, and pay our respects to Elders past and present.

This prototype is submitted in partial fulfilment of the assessment requirements for DATA3888 Data Science Capstone at The University of Sydney. Our work also rests on that of the open-source maintainers of R, Bioconductor, and the modelling libraries used here, and we thank the DATA3888 teaching team for project structure, feedback, and course support.

We're extremely grateful to our supervisors, Elyna Lin and Dr. Andy Tran, for the weekly feedback, guidance, and support they gave us through every workshop and consultation. Andy deserves a special shout-out for keeping the whole unit running, for being genuinely lovely to work with, and for all the little things that added up: squeezing in last-minute consultations, replying almost instantly, helping us tidy up the structure and readability of our work, and quietly nudging us toward the further research behind our action and handoff guidance. Above all, we thank both of them for the genuine interest they took in our project.

We acknowledge the original data contributors and study participants behind the public GEO cohorts. Their shared expression and clinical metadata made the modelling, validation, and patient-level demonstrations possible.

We acknowledge the use of AI-assisted tools to support drafting, code iteration, interface refinement, and debugging. All AI-assisted outputs were reviewed, edited, and validated by the team, who remain responsible for the final analysis, design decisions, and implementation.

References

Clinical guidelines & disease assessment

The following paragraphs describe how each model output should be read against current rheumatology practice, and underpin the brief lines shown in the Suggested actions panel and the workflow steps in Clinical handoff. Lupus assessment in routine practice may also draw on the BILAG-2004 index, the 1997 ACR revised classification criteria, and organ-specific guidance such as the KDIGO 2024 lupus nephritis recommendations, all cited below.

Diagnosis. The expression profile is compared to the SLE reference cohort. A low signal does not rule out SLE when clinical suspicion remains; a high signal still requires confirmation against the 2019 EULAR/ACR classification criteria — ANA entry, organ domains, haematology, anti-dsDNA/anti-Sm, complement consumption — and exclusion of infection, drug-induced lupus, and overlap syndromes before labelling or initiating SLE-directed therapy (Aringer et al., 2019).

Progression. Interpret the next-visit activity probability against the SLEDAI-2K trend, anti-dsDNA, C3/C4, urinalysis ± urine protein-to-creatinine ratio (UPCR), and any new organ involvement. A rising probability with falling complement, rising anti-dsDNA, or an active urinary sediment supports a true flare and warrants earlier review; escalation should follow EULAR 2023 (continue hydroxychloroquine, minimise glucocorticoid exposure, add MMF/AZA/belimumab/anifrolumab where indicated). BILAG-2004 may be used in parallel where local practice prefers a domain-based activity index (Gladman et al., 2002; Fanouriakis et al., 2024; Yee et al., 2010).

Treatment response. Read the response probability alongside SLEDAI-2K, serology, and tolerability at the expected response timepoint. A low or off-target signal in a patient who is not at LLDAS/DORIS should prompt review of hydroxychloroquine adherence, glucocorticoid stewardship (aim ≤5 mg/day prednisolone-equivalent), and escalation per EULAR 2023 — adding or switching to belimumab, anifrolumab, MMF, or AZA as clinically indicated. Lupus-nephritis specifics follow the KDIGO 2024 guideline (Fanouriakis et al., 2024; Franklyn et al., 2016; van Vollenhoven et al., 2021; KDIGO, 2024).

Aringer, M., Costenbader, K., Daikh, D., Brinks, R., Mosca, M., Ramsey-Goldman, R., Smolen, J. S., Wofsy, D., Boumpas, D. T., Kamen, D. L., Jayne, D., Cervera, R., Costedoat-Chalumeau, N., Diamond, B., Gladman, D. D., Hahn, B., Hiepe, F., Jacobsen, S., Khanna, D., ... Johnson, S. R. (2019). 2019 European League Against Rheumatism/American College of Rheumatology classification criteria for systemic lupus erythematosus. Arthritis & Rheumatology, 71(9), 1400-1412. https://doi.org/10.1002/art.40930
Fanouriakis, A., Kostopoulou, M., Andersen, J., Aringer, M., Arnaud, L., Bae, S.-C., Boletis, J., Bruce, I. N., Cervera, R., Doria, A., Dörner, T., Furie, R. A., Gladman, D. D., Houssiau, F. A., Inês, L. S., Jayne, D., Kouloumas, M., Kovács, L., Mok, C. C., ... Boumpas, D. T. (2024). EULAR recommendations for the management of systemic lupus erythematosus: 2023 update. Annals of the Rheumatic Diseases, 83(1), 15-29. https://doi.org/10.1136/ard-2023-224762
Franklyn, K., Lau, C. S., Navarra, S. V., Louthrenoo, W., Lateef, A., Hamijoyo, L., Wahono, C. S., Chen, S. L., Jin, O., Morton, S., Hoi, A., Huq, M., Nikpour, M., & Morand, E. F. (2016). Definition and initial validation of a Lupus Low Disease Activity State (LLDAS). Annals of the Rheumatic Diseases, 75(9), 1615-1621. https://doi.org/10.1136/annrheumdis-2015-207726
Gladman, D. D., Ibañez, D., & Urowitz, M. B. (2002). Systemic Lupus Erythematosus Disease Activity Index 2000. The Journal of Rheumatology, 29(2), 288-291. https://www.jrheum.org/content/29/2/288
Kidney Disease: Improving Global Outcomes (KDIGO) Lupus Nephritis Work Group. (2024). KDIGO 2024 Clinical Practice Guideline for the Management of Lupus Nephritis. Kidney International, 105(1S), S1-S69. https://kdigo.org/guidelines/lupus-nephritis/
van Vollenhoven, R. F., Bertsias, G., Doria, A., Isenberg, D., Morand, E., Petri, M. A., Pons-Estel, B. A., Rahman, A., Ugarte-Gil, M. F., Voskuyl, A., Arnaud, L., Bruce, I. N., Cervera, R., Costedoat-Chalumeau, N., Gordon, C., Houssiau, F. A., Mosca, M., Schneider, M., Ward, M. M., ... Aringer, M. (2021). 2021 DORIS definition of remission in SLE: Final recommendations from an international task force. Lupus Science & Medicine, 8(1), e000538. https://doi.org/10.1136/lupus-2021-000538
Yee, C.-S., Cresswell, L., Farewell, V., Rahman, A., Teh, L.-S., Griffiths, B., Bruce, I. N., Ahmad, Y., Prabu, A., Akil, M., McHugh, N., D'Cruz, D., Khamashta, M. A., Maddison, P., Gordon, C., & Isenberg, D. A. (2010). Numerical scoring for the BILAG-2004 index. Rheumatology (Oxford), 49(9), 1665-1669. https://doi.org/10.1093/rheumatology/keq026

SLE cohorts, annotation & preprocessing

Banchereau, R., Hong, S., Cantarel, B., Baldwin, N., Baisch, J., Edens, M., Cepika, A.-M., Acs, P., Turner, J., Anguiano, E., Vinod, P., Kahn, S., Obermoser, G., Blankenship, D., Wakeland, E., Nassi, L., Gotte, A., Punaro, M., Liu, Y.-J., ... Pascual, V. (2016). Personalized immunomonitoring uncovers molecular networks that stratify lupus patients. Cell, 165(3), 551-565. https://doi.org/10.1016/j.cell.2016.03.008
Bedre, R. (2023). RNA-seq expression units: RPKM, FPKM, and TPM. https://www.reneshbedre.com/blog/expression_units.html
Brown, G. R., Hem, V., Katz, K. S., Ovetsky, M., Wallin, C., Ermolaeva, O., Tolstoy, I., Tatusova, T., Pruitt, K. D., Maglott, D. R., & Murphy, T. D. (2015). Gene: A gene-centered information resource at NCBI. Nucleic Acids Research, 43(D1), D36-D42. https://doi.org/10.1093/nar/gku1055
Chiche, L., Jourde-Chiche, N., Whalen, E., Presnell, S., Gersuk, V., Dang, K., Anguiano, E., Quinn, C., Burtey, S., Berland, Y., Kaplanski, G., Harlé, J.-R., Pascual, V., & Chaussabel, D. (2014). Modular transcriptional repertoire analyses of adults with systemic lupus erythematosus reveal distinct type I and type II interferon signatures. Arthritis & Rheumatology, 66(6), 1583-1595. https://doi.org/10.1002/art.38628
Harrison, P. W., Amode, M. R., Austine-Orimoloye, O., Azov, A. G., Barba, M., Barnes, I., Becker, A., Bennett, R., Berry, A., Bhai, J., Bhurji, S. K., Boddu, S., Branco Lins, P. R., Brooks, L., Ramaraju, S. B., Campbell, L. I., Carbajo Martinez, M., Charkhchi, M., Chougule, K., ... Yates, A. D. (2024). Ensembl 2024. Nucleic Acids Research, 52(D1), D891-D899. https://doi.org/10.1093/nar/gkad1049
Hung, T., Pratt, G. A., Sundararaman, B., Townsend, M. J., Chaivorapol, C., Bhangale, T., Graham, R. R., Ortmann, W., Bhangale, T. R., Behrens, T. W., Yeo, G. W., & Chaussabel, D. (2015). The Ro60 autoantigen binds endogenous retroelements and regulates inflammatory gene expression in systemic lupus erythematosus. NCBI Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE72509
NCBI Gene Expression Omnibus. (2023). Whole-blood microarray expression in lupus nephritis: Treatment response by SRI-4. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE224705
National Institute of Standards and Technology. (n.d.). Standardize. Dataplot reference manual. Retrieved May 24, 2026, from https://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/standard.htm
Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., & Smyth, G. K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research, 43(7), e47. https://doi.org/10.1093/nar/gkv007
Seal, R. L., Braschi, B., Gray, K. A., Jones, T. E. M., Tweedie, S., Haim-Vilmovsky, L., & Bruford, E. A. (2023). Genenames.org: The HGNC resources in 2023. Nucleic Acids Research, 51(D1), D1003-D1009. https://doi.org/10.1093/nar/gkac888
Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3(1), 3. https://doi.org/10.2202/1544-6115.1027

Machine learning methods & evaluation

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. In 2010 20th International Conference on Pattern Recognition (pp. 3121-3124). IEEE. https://doi.org/10.1109/ICPR.2010.764
Chicco, D., & Jurman, G. (2020). The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21, 6. https://doi.org/10.1186/s12864-019-6413-7
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297. https://doi.org/10.1007/BF00994018
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861-874. https://doi.org/10.1016/j.patrec.2005.10.010
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232. https://doi.org/10.1214/aos/1013203451
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22. https://doi.org/10.18637/jss.v033.i01
Hosmer, D. W., Jr., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). Wiley. https://doi.org/10.1002/9781118548387
Kursa, M. B., & Rudnicki, W. R. (2010). Feature selection with the Boruta package. Journal of Statistical Software, 36(11), 1-13. https://doi.org/10.18637/jss.v036.i11
Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22. https://journal.r-project.org/articles/RN-2002-022/
Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure, 405(2), 442-451. https://doi.org/10.1016/0005-2795(75)90109-9
Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427-437. https://doi.org/10.1016/j.ipm.2009.03.002
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

Deployment & infrastructure

The console is served as a static frontend backed by an R Plumber API that loads the saved models and returns a prediction per stage (Schloerke & Allen, 2024); the backend is containerised and hosted on Render (Render Inc., 2024).

Render Inc. (2024). Render: Cloud application hosting for developers. Retrieved May 30, 2026, from https://render.com
Schloerke, B., & Allen, J. (2024). plumber: An API generator for R (Version 1.2.2) [Computer software]. https://www.rplumber.io

Design & visualisation

The interface styling draws on the University of Sydney brand guidance for typography and colour. The vision controls and extra labels follow accessibility guidance for colour-blind users by avoiding colour as the only way to convey meaning (The University of Sydney, 2023; W3C Web Accessibility Initiative, n.d.). Mathematical formulae such as the standardisation expression on the Methods tab are typeset with KaTeX (The KaTeX Project, 2024).

Apple Inc. (n.d.). Swift Charts. Apple Developer Documentation. Retrieved May 23, 2026, from https://developer.apple.com/documentation/charts
The KaTeX Project. (2024). KaTeX (Version 0.16.11) [Computer software]. https://katex.org/
The University of Sydney. (2023). Brand guidelines 2023.
W3C Web Accessibility Initiative. (n.d.). Designing for web accessibility: Tips for getting started. Retrieved May 27, 2026, from https://www.w3.org/WAI/tips/designing/