
Biostatistics

We will apply state-of-the-art statistical methods to build prognostic models for mortality and infection in alcohol-related hepatitis, using the rich biomarker data collected in this project.
Since many of the biomarkers are likely to be highly correlated, advanced statistical modelling will be required to identify which subsets have the highest prognostic value. “Sparse regression” is a technique that allows exploration of a large number of candidate predictors, while accounting for their correlations, in order to identify a ‘sparse’ subset of the most important predictors. In particular, we will apply Bayesian sparse regression, which allows prior information on biomarker associations with mortality and infection to be incorporated from previous studies, such as the STOPAH trial. Throughout, we will collaborate closely with the metabolomics workstrand to incorporate metabolomic biomarkers identified through the metabonome-wide association study.
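As a minimal sketch of the idea, and not the model we will fit, the Python below implements the lasso by coordinate descent. The lasso estimate is the posterior mode of a linear model with independent Laplace priors on the coefficients, so a per-predictor penalty can act as a crude stand-in for prior information: a biomarker supported by earlier studies can be given a weaker penalty. The continuous outcome, the function names `soft_threshold` and `lasso_map`, the toy data and the penalty values are all illustrative assumptions. A full Bayesian analysis would use a likelihood suited to mortality and infection outcomes and would report posterior uncertainty rather than a single point estimate.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t; anything within [-t, t] becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_map(X, y, penalties, n_sweeps=200):
    """Coordinate descent for the penalised least-squares problem

        minimise over b:  (1 / 2n) * sum_i (y_i - x_i . b)^2 + sum_j penalties[j] * |b_j|

    X is a list of rows, y a list of outcomes, penalties one value per column.
    Assumes no column of X is entirely zero.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X.beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of column j with the residual that ignores predictor j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, penalties[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Toy, hypothetical data: two orthogonal predictors, the second only weakly related to y.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [2.0, 0.1, -2.0, -0.1]

flat = lasso_map(X, y, penalties=[0.1, 0.1])       # same penalty on both predictors
informed = lasso_map(X, y, penalties=[0.1, 0.01])  # weaker penalty on the second predictor
```

With equal penalties the weak second predictor is shrunk to exactly zero; lowering its penalty, as prior evidence might justify, lets it remain in the model.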
To validate and assess our models, we will take advantage of the multi-centre design of MIMAH. We will exclude random subsets of centres from model building so that they can be used as truly independent testing data. We will also validate any applicable sub-models using data from the STOPAH trial, and will seek out independent cohorts in which the prognostic models could be further validated. When assessing our models against existing scoring systems, we will focus particularly on their ability to identify high-risk patients, who might benefit from treatment or liver transplantation, as well as low-risk patients, who could be discharged early. To this end, we plan to use decision-theoretic frameworks such as “Net Benefit”, which explicitly reflect clinical consequences in the calculation of predictive error, and are increasingly used in the evaluation of prognostic models.
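The two validation ideas above can be sketched in a few lines of Python. The first function holds out whole centres rather than individual patients, so that no centre contributes to both model building and testing. The second computes net benefit at a chosen risk threshold using the standard definition, true positives per patient minus false positives per patient weighted by the odds of the threshold. The function names, the 25% hold-out fraction, the centre labels and the example risks are hypothetical and serve only as illustration.

```python
import random


def split_by_centre(centres, holdout_fraction=0.25, seed=0):
    """Randomly hold out whole centres; return (train_indices, test_indices)."""
    unique = sorted(set(centres))
    rng = random.Random(seed)
    n_holdout = max(1, round(len(unique) * holdout_fraction))
    held_out = set(rng.sample(unique, n_holdout))
    train = [i for i, c in enumerate(centres) if c not in held_out]
    test = [i for i, c in enumerate(centres) if c in held_out]
    return train, test


def net_benefit(y_true, risk, threshold):
    """Net benefit of acting on every patient whose predicted risk >= threshold.

    NB = TP / n - (FP / n) * threshold / (1 - threshold), with 0 < threshold < 1.
    """
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical example: eight patients from four centres.
centres = ["A", "A", "B", "B", "C", "C", "D", "D"]
train_idx, test_idx = split_by_centre(centres)

# Hypothetical outcomes (1 = event) and predicted risks for four test patients.
y_test = [1, 1, 0, 0]
risk_test = [0.9, 0.6, 0.4, 0.1]
nb_model = net_benefit(y_test, risk_test, threshold=0.2)
nb_treat_all = net_benefit(y_test, [1.0] * len(y_test), threshold=0.2)
```

Comparing `nb_model` with `nb_treat_all` (and with zero, the net benefit of treating no one) across a range of clinically plausible thresholds is what a decision curve displays.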