Import patient CSV

Feature coverage
Molecular fingerprint (legend: Strong low, Low, Reference, High, Strong high)
Key drivers (legend: Low, Intermediate, High)

Model outputs

Suggested actions

Clinical handoff

Biomarker explorer

Molecular fingerprint

Modelling workflow

01 Data collection

Three stage-specific cohorts were used so each model matched a distinct clinical question: diagnosis, next-visit activity, and treatment response.

  • Diagnosis: GSE72509, retrieved from NCBI GEO (Hung et al., 2015); whole-blood RNA-seq with 99 SLE and 18 control samples.
  • Progression: GSE65391, retrieved from NCBI GEO, for training (Banchereau et al., 2016); GSE49454, retrieved from NCBI GEO, for external validation (Chiche et al., 2014). Activity was defined as SLEDAI ≥ 6.
  • Treatment: GSE224705, retrieved from NCBI GEO (NCBI Gene Expression Omnibus, 2023); first-visit SLE samples with response derived from SRI-4 labels after removing healthy controls.

02 Pre-processing and EDA

Expression and metadata were cleaned before modelling. Diagnosis RPKM values were transformed as log2(RPKM + 1) (Bedre, 2023). Treatment microarray values were filtered for annotated, non-control probes and collapsed to one probe per gene using NCBI Gene, HGNC, and Ensembl annotations (Brown et al., 2015; Seal et al., 2023; Harrison et al., 2024).
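
As a rough illustration of the two transformations described above, the sketch below applies log2(RPKM + 1) and collapses probes to one per gene. The function names, the toy probe IDs, and the rule of keeping the probe with the highest mean expression are assumptions made for this example; the text does not state which collapse rule the project used, and the project's own pipeline is in R.

```python
import math
from statistics import mean


def log2_rpkm(values):
    """Apply the log2(RPKM + 1) transform to non-negative RPKM values."""
    return [math.log2(v + 1.0) for v in values]


def collapse_probes(probe_rows, probe_to_gene):
    """Keep one probe per gene.

    probe_rows: dict of probe_id -> list of expression values (one per sample)
    probe_to_gene: dict of probe_id -> gene symbol; probes without an
        annotation (including control probes) are simply absent.
    ASSUMPTION: the probe with the highest mean expression represents the gene.
    """
    best = {}
    for probe, values in probe_rows.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # drop unannotated or control probes
        if gene not in best or mean(values) > mean(best[gene]):
            best[gene] = values
    return best


if __name__ == "__main__":
    print(log2_rpkm([0, 1, 3]))
    toy = {"p1": [1, 2], "p2": [3, 4], "p3": [9, 9]}  # hypothetical probes
    print(collapse_probes(toy, {"p1": "IFI27", "p2": "IFI27"}))
```

Adding 1 before taking the logarithm keeps zero-count genes finite (they map to 0) while compressing the long right tail of RPKM values.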

  • Checked sample-level expression distributions and outcome balance.
  • Derived longitudinal t-to-t+1 labels for progression.
  • Standardised or imputed model inputs inside training workflows where required (NIST, n.d.).
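
The t-to-t+1 labelling step can be sketched as follows. This is a minimal illustration, assuming each visit is a (patient, visit number, SLEDAI) record and that consecutive recorded visits of the same patient are paired regardless of the time gap between them; the record layout and field names are hypothetical. The activity threshold of SLEDAI ≥ 6 is the one stated in the data collection step.

```python
ACTIVE_THRESHOLD = 6  # SLEDAI >= 6 is treated as active disease


def next_visit_labels(visits):
    """Pair each visit t with the activity status at visit t+1.

    visits: iterable of (patient_id, visit_number, sledai) tuples.
    Returns one row per consecutive visit pair of the same patient.
    The last visit of each patient has no following visit, so it
    contributes no row.
    """
    by_patient = {}
    for patient_id, visit_number, sledai in visits:
        by_patient.setdefault(patient_id, []).append((visit_number, sledai))

    rows = []
    for patient_id, records in by_patient.items():
        records.sort()  # order visits chronologically by visit number
        for (t, sledai_t), (_, sledai_next) in zip(records, records[1:]):
            rows.append({
                "patient_id": patient_id,
                "visit": t,
                "sledai_now": sledai_t,
                "active_next": int(sledai_next >= ACTIVE_THRESHOLD),
            })
    return rows


if __name__ == "__main__":
    toy = [("A", 1, 4), ("A", 2, 8), ("A", 3, 2), ("B", 1, 10)]
    for row in next_visit_labels(toy):
        print(row)
```

Because predictors come from visit t and the label from visit t+1, a patient with a single visit yields no training rows.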

03 Feature selection

Feature selection was performed before final modelling to reduce high-dimensional expression matrices to interpretable panels and to avoid leakage from external or held-out data.

  • Diagnosis and treatment: pre-filtered to highly variable, expressed genes, then combined RF-Gini ranking, PCA checks, Boruta, and biological curation (Breiman, 2001; Kursa & Rudnicki, 2010; Liaw & Wiener, 2002).
  • Progression: selected expression probes from the training cohort only, removed direct leakage variables, then combined immune, clinical, treatment, temporal, engineered, and gene features.
  • Final progression features were ranked with random-forest importance on training data only.
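
The leakage-avoidance idea above, choosing features from training data only and then applying the same panel unchanged to held-out data, can be illustrated with a simple variance pre-filter. This is a sketch under stated assumptions: the function names and toy gene labels are hypothetical, and population variance stands in for whatever variability criterion the project used. It does not reproduce the RF-Gini or Boruta steps.

```python
from statistics import pvariance


def top_variable_genes(train_matrix, n_keep):
    """Rank genes by variance across TRAINING samples and keep the top n_keep.

    train_matrix: dict of gene -> list of expression values (training only).
    """
    ranked = sorted(train_matrix,
                    key=lambda gene: pvariance(train_matrix[gene]),
                    reverse=True)
    return ranked[:n_keep]


def apply_panel(matrix, panel):
    """Restrict any expression matrix to a previously selected panel."""
    return {gene: matrix[gene] for gene in panel if gene in matrix}


if __name__ == "__main__":
    train = {"G1": [1, 1, 1], "G2": [0, 5, 10], "G3": [2, 3, 4]}
    held_out = {"G1": [9], "G2": [1], "G3": [2]}
    panel = top_variable_genes(train, n_keep=2)  # uses training data only
    print(panel)
    print(apply_panel(held_out, panel))          # held-out data is never ranked
```

The held-out matrix is only ever subset, never used to compute the ranking, so it cannot influence which features are chosen.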

04 Modelling

Candidate models were compared within each stage rather than forcing one algorithm across all tasks. The deployed app reads the selected R model objects and sends the frontend CSV payload to the backend for inference; probability displays are summarised in Predicted probability bands.
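
To make the CSV-to-inference step more concrete, the sketch below parses a patient CSV payload and reports how much of a model's feature panel it covers, roughly what a "Feature coverage" display might summarise. Everything here is hypothetical: the two-column `gene,value` layout, the function names, and the toy panel are assumptions for illustration and not the app's actual payload format or backend, which serves R model objects.

```python
import csv
import io


def parse_patient_csv(text):
    """Parse CSV text with 'gene' and 'value' columns into {gene: float}.

    Rows with a missing or non-numeric value are skipped rather than
    raising, so one bad cell does not reject the whole upload.
    """
    profile = {}
    for row in csv.DictReader(io.StringIO(text)):
        try:
            profile[row["gene"].strip()] = float(row["value"])
        except (KeyError, ValueError, TypeError, AttributeError):
            continue
    return profile


def feature_coverage(profile, panel):
    """Return (fraction of panel genes present, list of missing genes)."""
    missing = [gene for gene in panel if gene not in profile]
    covered = len(panel) - len(missing)
    fraction = covered / len(panel) if panel else 0.0
    return fraction, missing


if __name__ == "__main__":
    payload = "gene,value\nIFI27,5.2\nRSAD2,abc\nMX1,3.0\n"
    profile = parse_patient_csv(payload)
    print(feature_coverage(profile, ["IFI27", "MX1", "RSAD2", "ISG15"]))
```

Checking coverage before inference lets the interface warn when too few panel genes are present for a prediction to be meaningful.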

Diagnosis compared a limma signature, random forest, and LASSO. Progression compared elastic net, random forest, and GBM. Treatment response compared a limma signature, random forest, LASSO, elastic net, and linear SVM (Ritchie et al., 2015; Breiman, 2001; Tibshirani, 1996; Zou & Hastie, 2005; Friedman, 2001; Cortes & Vapnik, 1995). See Validation metrics below.

05 Performance evaluation

Evaluation prioritised metrics that are more informative than raw accuracy for imbalanced clinical cohorts. Diagnosis and treatment used stratified 5-fold cross-validation; progression used patient-level cross-validation for tuning and one independent external test on GSE49454 (Fawcett, 2006; Brodersen et al., 2010). See Validation metrics and Metric notes below.
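
The two fold-assignment schemes mentioned above can be sketched in plain Python. This is an illustrative sketch, not the project's R implementation: the function names, the round-robin assignment, and the seed handling are assumptions. Stratified folds keep the class ratio similar in every fold; patient-level folds keep all visits from one patient in the same fold so repeated measures cannot appear in both training and validation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Return a fold index (0..k-1) per sample with balanced class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so fold sizes differ by at most one.
        for position, index in enumerate(indices):
            folds[index] = position % k
    return folds


def patient_level_folds(patient_ids, k=5, seed=0):
    """Return a fold index per visit; visits of one patient share a fold."""
    rng = random.Random(seed)
    patients = sorted(set(patient_ids))
    rng.shuffle(patients)
    fold_of = {patient: position % k for position, patient in enumerate(patients)}
    return [fold_of[patient] for patient in patient_ids]


if __name__ == "__main__":
    # Class sizes mirror the diagnosis cohort: 99 SLE and 18 controls.
    labels = [1] * 99 + [0] * 18
    folds = stratified_folds(labels, k=5)
    for fold in range(5):
        in_fold = [y for y, f in zip(labels, folds) if f == fold]
        print(fold, in_fold.count(1), in_fold.count(0))
```

With 18 controls and five folds, each validation fold holds only three or four control samples, which is one reason per-fold metrics on the minority class can be unstable.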

Validation metrics

Columns: Outcome | Model | Validation | AUROC | Macro F1 | Balanced acc. | Accuracy | MCC
Legend: Limited < 0.70; Adequate 0.70-0.84; Strong ≥ 0.85
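
The legend cut-offs can be expressed as a small lookup. This is a sketch with a hypothetical function name; it assumes a value between 0.84 and 0.85 (which the rounded legend does not cover) falls in the Adequate band.

```python
def metric_band(value):
    """Map a metric value to the legend band used in the validation table.

    ASSUMPTION: the Adequate band runs up to, but not including, 0.85.
    """
    if value < 0.70:
        return "Limited"
    if value < 0.85:
        return "Adequate"
    return "Strong"


if __name__ == "__main__":
    for v in (0.69, 0.70, 0.84, 0.85):
        print(v, metric_band(v))
```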

Metric notes

  • AUROC is a threshold-free summary of ranking: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (Fawcett, 2006).
  • Macro F1 averages class-wise F1 scores so minority and majority classes contribute evenly, while balanced accuracy averages class-wise recall to reduce majority-class bias (Sokolova & Lapalme, 2009; Brodersen et al., 2010).
  • Accuracy is the overall proportion of correct predictions and can look optimistic in imbalanced cohorts; MCC is a correlation-like summary of the full confusion matrix, where +1 is perfect agreement, 0 is no better than chance, and -1 is total disagreement (Matthews, 1975; Chicco & Jurman, 2020).
  • The legend bands follow common discrimination heuristics: values below 0.70 are treated as limited, 0.70-0.84 as adequate, and 0.85 or higher as strong. These are visual interpretation bands, not clinical acceptance thresholds (Hosmer et al., 2013).
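
The metric definitions above can be checked with a short worked example. The sketch below is a plain-Python illustration for binary labels coded 0/1; the function names are hypothetical, and an undefined ratio (zero denominator) is reported as 0.0 by convention, which is an assumption of this sketch. The toy case predicts "positive" for every sample in a 90:10 cohort to show how accuracy can look good while balanced accuracy and MCC do not.

```python
import math


def _safe_div(numerator, denominator):
    """Return 0.0 when a ratio is undefined (a convention, not a rule)."""
    return numerator / denominator if denominator else 0.0


def summary_metrics(y_true, y_pred):
    """Accuracy, balanced accuracy, macro F1 and MCC for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    recall_pos = _safe_div(tp, tp + fn)
    recall_neg = _safe_div(tn, tn + fp)
    f1_pos = _safe_div(2 * tp, 2 * tp + fp + fn)
    f1_neg = _safe_div(2 * tn, 2 * tn + fn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, len(pairs)),
        "balanced_accuracy": (recall_pos + recall_neg) / 2,
        "macro_f1": (f1_pos + f1_neg) / 2,
        "mcc": _safe_div(tp * tn - fp * fn, mcc_den),
    }


def auroc(y_true, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1] * 90 + [0] * 10
    always_positive = [1] * 100
    print(summary_metrics(y_true, always_positive))
    print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In the toy cohort the always-positive rule reaches 0.9 accuracy, yet balanced accuracy is 0.5 and MCC is 0, matching the caution about accuracy in imbalanced cohorts.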

Predicted probability bands

Biomed 09 - Team Members

Manna Berry: Development of Progression Model & Assistance with Backend
mber0347@uni.sydney.edu.au, Faculty of Engineering J12, The University of Sydney, NSW 2006
Lezhi Lin: Development of App Frontend & Presentation Slides
llin0935@uni.sydney.edu.au, School of Mathematics and Statistics F07, The University of Sydney, NSW 2006 Australia
Udit Samant: Development of Diagnosis Model & General App Backend
usam6049@uni.sydney.edu.au, School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Hadi Shafat: Interdisciplinary Aspects Research & Assistance with Backend
hsha0153@uni.sydney.edu.au, School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Minh Hieu Tran: Assistance with Initial Data Analysis
mtra0191@uni.sydney.edu.au, School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Jillian Zhao: Development of Treatment Model & Assistance with Backend & Background Research
yzha0369@uni.sydney.edu.au, School of Computer Science J12, The University of Sydney, NSW 2006 Australia

Acknowledgment

We acknowledge the Gadigal of the Eora Nation, the Traditional Custodians of the land on which the University of Sydney stands, and pay our respects to Elders past and present.

This prototype is submitted in partial fulfilment of the assessment requirements for DATA3888 Data Science Capstone at The University of Sydney. Our work builds on that of the open-source maintainers across R, Bioconductor, and the modelling libraries used here, and we thank the DATA3888 teaching team for project structure, feedback, and course support.

We are extremely grateful to our supervisors, Dr. Andy Tran and Elyna Lin, for their guidance, thoughtful feedback, and steady support in workshops and consultations throughout the project.

We acknowledge the original data contributors and study participants behind the public GEO cohorts. Their shared expression and clinical metadata made the modelling, validation, and patient-level demonstrations possible.

We acknowledge the use of AI-assisted tools to support drafting, code iteration, interface refinement, and debugging. All AI-assisted outputs were reviewed, edited, and validated by the team, who remain responsible for the final analysis, design decisions, and implementation.

References

Clinical guidelines & disease assessment

The clinical-decision-support language in this prototype is anchored to the citations in this section. The following paragraphs describe how each model output should be read against current rheumatology practice, and underpin the brief lines shown in the Suggested actions panel and the workflow steps in Clinical handoff. Lupus assessment in routine practice may also draw on the BILAG-2004 index, the 1997 ACR revised classification criteria, and organ-specific guidance such as the KDIGO 2024 lupus nephritis recommendations; BILAG-2004 and KDIGO 2024 are cited below.

Diagnosis. The expression profile is compared to the SLE reference cohort. A low signal does not rule out SLE when clinical suspicion remains; a high signal still requires confirmation against the 2019 EULAR/ACR classification criteria (ANA entry, organ domains, haematology, anti-dsDNA/anti-Sm, complement consumption) and exclusion of infection, drug-induced lupus, and overlap syndromes before labelling or initiating SLE-directed therapy (Aringer et al., 2019).

Progression. Interpret the next-visit activity probability against the SLEDAI-2K trend, anti-dsDNA, C3/C4, urinalysis ± urine protein-to-creatinine ratio (UPCR), and any new organ involvement. A rising probability with falling complement, rising anti-dsDNA, or an active urinary sediment supports a true flare and warrants earlier review; escalation should follow EULAR 2023 (continue hydroxychloroquine, minimise glucocorticoid exposure, add MMF/AZA/belimumab/anifrolumab where indicated). BILAG-2004 may be used in parallel where local practice prefers a domain-based activity index (Gladman et al., 2002; Fanouriakis et al., 2024; Yee et al., 2010).

Treatment response. Read the response probability alongside SLEDAI-2K, serology, and tolerability at the expected response timepoint. A low or off-target signal in a patient who is not at LLDAS/DORIS should prompt review of hydroxychloroquine adherence, glucocorticoid stewardship (aim ≤5 mg/day prednisolone-equivalent), and escalation per EULAR 2023, adding or switching to belimumab, anifrolumab, MMF, or AZA as clinically indicated. Lupus-nephritis specifics follow the KDIGO 2024 guideline (Fanouriakis et al., 2024; Franklyn et al., 2016; van Vollenhoven et al., 2021; KDIGO, 2024).

  1. Aringer, M., Costenbader, K., Daikh, D., Brinks, R., Mosca, M., Ramsey-Goldman, R., Smolen, J. S., Wofsy, D., Boumpas, D. T., Kamen, D. L., Jayne, D., Cervera, R., Costedoat-Chalumeau, N., Diamond, B., Gladman, D. D., Hahn, B., Hiepe, F., Jacobsen, S., Khanna, D., ... Johnson, S. R. (2019). 2019 European League Against Rheumatism/American College of Rheumatology classification criteria for systemic lupus erythematosus. Arthritis & Rheumatology, 71(9), 1400-1412. https://doi.org/10.1002/art.40930
  2. Fanouriakis, A., Kostopoulou, M., Andersen, J., Aringer, M., Arnaud, L., Bae, S.-C., Boletis, J., Bruce, I. N., Cervera, R., Doria, A., Dörner, T., Furie, R. A., Gladman, D. D., Houssiau, F. A., Inês, L. S., Jayne, D., Kouloumas, M., Kovács, L., Mok, C. C., ... Boumpas, D. T. (2024). EULAR recommendations for the management of systemic lupus erythematosus: 2023 update. Annals of the Rheumatic Diseases, 83(1), 15-29. https://doi.org/10.1136/ard-2023-224762
  3. Franklyn, K., Lau, C. S., Navarra, S. V., Louthrenoo, W., Lateef, A., Hamijoyo, L., Wahono, C. S., Chen, S. L., Jin, O., Morton, S., Hoi, A., Huq, M., Nikpour, M., & Morand, E. F. (2016). Definition and initial validation of a Lupus Low Disease Activity State (LLDAS). Annals of the Rheumatic Diseases, 75(9), 1615-1621. https://doi.org/10.1136/annrheumdis-2015-207726
  4. Gladman, D. D., Ibañez, D., & Urowitz, M. B. (2002). Systemic Lupus Erythematosus Disease Activity Index 2000. The Journal of Rheumatology, 29(2), 288-291. https://www.jrheum.org/content/29/2/288
  5. Kidney Disease: Improving Global Outcomes (KDIGO) Lupus Nephritis Work Group. (2024). KDIGO 2024 Clinical Practice Guideline for the Management of Lupus Nephritis. Kidney International, 105(1S), S1-S69. https://kdigo.org/guidelines/lupus-nephritis/
  6. van Vollenhoven, R. F., Bertsias, G., Doria, A., Isenberg, D., Morand, E., Petri, M. A., Pons-Estel, B. A., Rahman, A., Ugarte-Gil, M. F., Voskuyl, A., Arnaud, L., Bruce, I. N., Cervera, R., Costedoat-Chalumeau, N., Gordon, C., Houssiau, F. A., Mosca, M., Schneider, M., Ward, M. M., ... Aringer, M. (2021). 2021 DORIS definition of remission in SLE: Final recommendations from an international task force. Lupus Science & Medicine, 8(1), e000538. https://doi.org/10.1136/lupus-2021-000538
  7. Yee, C.-S., Cresswell, L., Farewell, V., Rahman, A., Teh, L.-S., Griffiths, B., Bruce, I. N., Ahmad, Y., Prabu, A., Akil, M., McHugh, N., D'Cruz, D., Khamashta, M. A., Maddison, P., Gordon, C., & Isenberg, D. A. (2010). Numerical scoring for the BILAG-2004 index. Rheumatology (Oxford), 49(9), 1665-1669. https://doi.org/10.1093/rheumatology/keq026

SLE cohorts, annotation & preprocessing

  1. Banchereau, R., Hong, S., Cantarel, B., Baldwin, N., Baisch, J., Edens, M., Cepika, A.-M., Acs, P., Turner, J., Anguiano, E., Vinod, P., Kahn, S., Obermoser, G., Blankenship, D., Wakeland, E., Nassi, L., Gotte, A., Punaro, M., Liu, Y.-J., ... Pascual, V. (2016). Personalized immunomonitoring uncovers molecular networks that stratify lupus patients. Cell, 165(3), 551-565. https://doi.org/10.1016/j.cell.2016.03.008
  2. Bedre, R. (2023). RNA-seq expression units: RPKM, FPKM, and TPM. https://www.reneshbedre.com/blog/expression_units.html
  3. Brown, G. R., Hem, V., Katz, K. S., Ovetsky, M., Wallin, C., Ermolaeva, O., Tolstoy, I., Tatusova, T., Pruitt, K. D., Maglott, D. R., & Murphy, T. D. (2015). Gene: A gene-centered information resource at NCBI. Nucleic Acids Research, 43(D1), D36-D42. https://doi.org/10.1093/nar/gku1055
  4. Chiche, L., Jourde-Chiche, N., Whalen, E., Presnell, S., Gersuk, V., Dang, K., Anguiano, E., Quinn, C., Burtey, S., Berland, Y., Kaplanski, G., Harlé, J.-R., Pascual, V., & Chaussabel, D. (2014). Modular transcriptional repertoire analyses of adults with systemic lupus erythematosus reveal distinct type I and type II interferon signatures. Arthritis & Rheumatology, 66(6), 1583-1595. https://doi.org/10.1002/art.38628
  5. Harrison, P. W., Amode, M. R., Austine-Orimoloye, O., Azov, A. G., Barba, M., Barnes, I., Becker, A., Bennett, R., Berry, A., Bhai, J., Bhurji, S. K., Boddu, S., Branco Lins, P. R., Brooks, L., Ramaraju, S. B., Campbell, L. I., Carbajo Martinez, M., Charkhchi, M., Chougule, K., ... Yates, A. D. (2024). Ensembl 2024. Nucleic Acids Research, 52(D1), D891-D899. https://doi.org/10.1093/nar/gkad1049
  6. Hung, T., Pratt, G. A., Sundararaman, B., Townsend, M. J., Chaivorapol, C., Bhangale, T., Graham, R. R., Ortmann, W., Bhangale, T. R., Behrens, T. W., Yeo, G. W., & Chaussabel, D. (2015). The Ro60 autoantigen binds endogenous retroelements and regulates inflammatory gene expression in systemic lupus erythematosus. NCBI Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE72509
  7. NCBI Gene Expression Omnibus. (2023). Whole-blood microarray expression in lupus nephritis: Treatment response by SRI-4. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE224705
  8. National Institute of Standards and Technology. (n.d.). Standardize. Dataplot reference manual. Retrieved May 24, 2026, from https://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/standard.htm
  9. Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., & Smyth, G. K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research, 43(7), e47. https://doi.org/10.1093/nar/gkv007
  10. Seal, R. L., Braschi, B., Gray, K. A., Jones, T. E. M., Tweedie, S., Haim-Vilmovsky, L., & Bruford, E. A. (2023). Genenames.org: The HGNC resources in 2023. Nucleic Acids Research, 51(D1), D1003-D1009. https://doi.org/10.1093/nar/gkac888
  11. Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3(1), 3. https://doi.org/10.2202/1544-6115.1027

Machine learning methods & evaluation

  1. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
  2. Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. In 2010 20th International Conference on Pattern Recognition (pp. 3121-3124). IEEE. https://doi.org/10.1109/ICPR.2010.764
  3. Chicco, D., & Jurman, G. (2020). The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21, 6. https://doi.org/10.1186/s12864-019-6413-7
  4. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297. https://doi.org/10.1007/BF00994018
  5. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861-874. https://doi.org/10.1016/j.patrec.2005.10.010
  6. Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232. https://doi.org/10.1214/aos/1013203451
  7. Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22. https://doi.org/10.18637/jss.v033.i01
  8. Hosmer, D. W., Jr., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). Wiley. https://doi.org/10.1002/9781118548387
  9. Kursa, M. B., & Rudnicki, W. R. (2010). Feature selection with the Boruta package. Journal of Statistical Software, 36(11), 1-13. https://doi.org/10.18637/jss.v036.i11
  10. Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22. https://journal.r-project.org/articles/RN-2002-022/
  11. Matthews, B. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure, 405(2), 442-451. https://doi.org/10.1016/0005-2795(75)90109-9
  12. Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427-437. https://doi.org/10.1016/j.ipm.2009.03.002
  13. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  14. Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

Design & visualisation

The interface styling draws on the University of Sydney brand guidance for typography, colour, and institutional tone. The vision controls and redundant labels follow accessibility guidance for colour-blind users by avoiding colour as the only way to convey meaning (The University of Sydney, 2023; W3C Web Accessibility Initiative, n.d.).

  1. Apple Inc. (n.d.). Swift Charts. Apple Developer Documentation. Retrieved May 23, 2026, from https://developer.apple.com/documentation/charts
  2. The University of Sydney. (2023). Brand guidelines 2023.
  3. W3C Web Accessibility Initiative. (n.d.). Designing for web accessibility: Tips for getting started. Retrieved May 27, 2026, from https://www.w3.org/WAI/tips/designing/