Sjögren's disease (SD) is an autoimmune disease that is often diagnosed late. We used claims data from Optum's de-identified Clinformatics® Data Mart Database to predict SD, identify factors that could support earlier diagnosis of SD, and characterize SD patients. Demographics, comorbidities, medications, and serological test data were used to fit LASSO, random forest, and XGBoost models. Latent class analysis (LCA) was then used to model a subset of important predictors. We identified 5,632 SD cases and 56,320 non-SD controls. Joint pain, comorbid autoimmune diseases, abnormal serological tests, and immunosuppressant medication were strongly predictive of SD. The LASSO model had an AUC of 0.80 (0.78-0.81), and the other models performed similarly. LCA identified four groups of SD cases, characterized by abnormal serological tests and low to moderate disease burden. The largest class, representing 66.0% of the sample, consisted of patients diagnosed with SD with little recorded evidence as to why they were diagnosed. Our prediction models show potential for predicting SD from routinely collected healthcare data, but they must account for imbalanced data and other limitations.
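To make the reported discrimination metric concrete, the following is a minimal sketch of how an AUC and an accompanying interval could be computed on a case-control sample with the same 1:10 ratio. It is illustrative only: the scores are synthetic, the names `auc` and `bootstrap_ci` are hypothetical helpers, and the percentile bootstrap is an assumption, since the abstract does not say how the interval around 0.80 was derived.

```python
import random

def auc(scores, labels):
    """Rank-based AUC: chance that a random case outscores a random control.
    Tied scores receive their average rank, so a tie counts as 0.5."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def bootstrap_ci(scores, labels, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        lab = [labels[k] for k in idx]
        if 0 < sum(lab) < n:  # a resample needs both cases and controls
            stats.append(auc([scores[k] for k in idx], lab))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic risk scores (NOT study data): 200 cases and 2,000 controls,
# mirroring the 1:10 case-to-control ratio described in the abstract.
rng = random.Random(42)
labels = [1] * 200 + [0] * 2000
scores = [rng.gauss(1.2, 1.0) for _ in range(200)] + \
         [rng.gauss(0.0, 1.0) for _ in range(2000)]

point = auc(scores, labels)
lo, hi = bootstrap_ci(scores, labels)
print(f"AUC {point:.2f} ({lo:.2f}-{hi:.2f})")
```

Because the AUC depends only on how cases rank relative to controls, it is not affected by the case-to-control ratio itself. Threshold-dependent measures such as positive predictive value are affected, which is one reason imbalance still has to be handled when a model like this is applied.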