Training courses

Statistical Methods for Risk Prediction and Prognostic Models (Birmingham, online, 4th - 6th March 2025, book here)

Prognosis Research in Healthcare Summer School (Keele, online, 17th - 19th July 2024)

Statistical Methods for Meta-Analysis of Individual Participant Data (Birmingham, online, 2nd - 4th October 2024)

Systematic Reviews of Prognosis Studies (Utrecht)

Systematic Reviews of Prognosis Studies - free 60 mins interactive module (Cochrane Prognosis Methods Group)

Clinical Prediction Models (Maastricht)

Regression Modelling Strategies (online)

Statistical Methods for Risk Prediction and Prognostic Models

University of Birmingham

4th - 6th March 2025 (online course)

Book here

This online course provides a thorough foundation of statistical methods for developing and validating prognostic models in clinical research. The course is delivered over 3 days and focuses on model development (day 1), internal validation (day 2), and external validation and novel topics (day 3). Our focus is on multivariable models for individualised prediction of future outcomes (prognosis), although many of the concepts described also apply to models for predicting existing disease (diagnosis).

Computer practicals in either R or Stata are included on all three days (two per day), and participants can choose whether to focus on logistic regression examples (for binary outcomes) or Cox / flexible parametric survival examples (for time-to-event outcomes), to tailor the practicals to their own purpose. All code is already written and so participants can focus more on their understanding of methods and interpretation of results.

The course is aimed at individuals who want to learn how to develop and validate risk prediction and prognostic models, specifically for binary or time-to-event clinical outcomes (though continuous outcomes are also covered). We recommend that participants have a good understanding of key statistical principles and measures (such as effect estimates, confidence intervals and p-values); the ability to apply and interpret regression models is essential. A background in statistics, epidemiology or data science would also be advantageous.

Day 1 begins with an overview of the rationale and phases of prediction model research. It then outlines model specification, focusing on logistic regression for binary outcomes and Cox regression or flexible parametric survival models for time-to-event outcomes. Model development topics are then covered, including: identifying candidate predictors, handling of missing data, modelling continuous predictors using fractional polynomials or restricted cubic splines for non-linear functions, and variable selection procedures.

Day 2 focuses on how models are overfitted to the data from which they were derived, and thus often do not generalise to other datasets. Internal validation strategies are outlined to identify and adjust for overfitting. In particular, cross-validation and bootstrapping are covered to estimate the optimism and shrink the model coefficients accordingly; related approaches such as LASSO and elastic net are also discussed. Statistical measures of model performance are introduced for discrimination (such as the C-statistic and D-statistic) and calibration (calibration-in-the-large, calibration plots, calibration slope, calibration curve). Building on this, we then discuss sample size considerations for model development and validation, and new software to implement sample size calculations.
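As a rough illustration of the discrimination and calibration measures named above, the following Python sketch computes a C-statistic and an observed/expected (O/E) ratio, a simple summary related to calibration-in-the-large, for a binary outcome. The function names and the toy data are illustrative assumptions only and are not taken from the course material (the course practicals themselves use R or Stata).

```python
def c_statistic(outcomes, risks):
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions count as half concordant."""
    event_risks = [r for y, r in zip(outcomes, risks) if y == 1]
    nonevent_risks = [r for y, r in zip(outcomes, risks) if y == 0]
    if not event_risks or not nonevent_risks:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for re in event_risks:
        for rn in nonevent_risks:
            if re > rn:
                concordant += 1.0
            elif re == rn:
                concordant += 0.5
    return concordant / (len(event_risks) * len(nonevent_risks))


def observed_expected_ratio(outcomes, risks):
    """Observed number of events divided by the sum of predicted risks.
    A value near 1 suggests good agreement on average."""
    return sum(outcomes) / sum(risks)


# Hypothetical data: observed outcomes (1 = event) and predicted risks.
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]

c_stat = c_statistic(y, p)
oe_ratio = observed_expected_ratio(y, p)
print(f"C-statistic: {c_stat:.3f}")
print(f"O/E ratio:   {oe_ratio:.3f}")
```

The pairwise loop is quadratic in the sample size, which is fine for a small example; in a real analysis an established, tested implementation would normally be preferred.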

Day 3 focuses on the need for model performance to be evaluated in new data to assess its generalisability, namely external validation. A framework for different types of external validation studies is provided, and the potential importance of model updating strategies (such as re-calibration techniques) is considered. Novel topics are then covered, including: the use of pseudo-values to allow calibration curves in a survival model setting; the development and validation of models using large datasets (e.g. from e-health records) or multiple studies; the use of meta-analysis methods for summarising the performance of models across multiple studies or clusters; the role of net benefit and decision curve analysis to understand the potential role of a model for clinical decision making; and practical guidance about different ways in which prediction and prognostic models can be presented.
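To give a flavour of the net benefit idea mentioned above, here is a minimal Python sketch using the standard definition: true positives per person minus false positives per person, weighted by the odds of the risk threshold. The function name, the toy data and the chosen threshold are illustrative assumptions, not course material.

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of treating everyone whose predicted risk is at or
    above `threshold` (which must lie strictly between 0 and 1)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    true_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * odds


# Hypothetical data: observed outcomes (1 = event) and predicted risks.
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]
threshold = 0.35

nb_model = net_benefit(y, p, threshold)
# "Treat all" strategy: assign everyone a risk of 1 so all exceed the threshold.
nb_treat_all = net_benefit(y, [1.0] * len(y), threshold)

print(f"Net benefit (model):     {nb_model:.4f}")
print(f"Net benefit (treat all): {nb_treat_all:.4f}")
```

A decision curve repeats this calculation over a range of thresholds and compares the model against the "treat all" and "treat none" (net benefit of zero) strategies.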

The course will be run online over three days using a combination of recorded lecture videos, computer practical exercises in Stata and R for participants to work through, and live question and answer sessions following each lecture/session. There will also be opportunities to meet with the faculty to ask specific questions related to personal research queries and problems.

Ideally participants should undertake the course live (9am to 5pm UK time), but all course material (e.g. lecture videos and computer practicals) will be made available about 5 days in advance and for 2 weeks afterwards, to provide plenty of time and flexibility for participants to work through the material in their own time.

The following prices are available:

Student - £499

Academic - £599

Industry - £699

Faculty includes Kym Snell (lead), Lucinda Archer, Joie Ensor, Richard Riley, Laura Bonnett, Gary Collins and Paula Dhiman.

Registration now open for 4th - 6th March 2025

Please email Kym Snell for any course enquiries or to join the waiting list for future dates.

Prognosis research in healthcare: concepts, methods and impact

International Summer School, Keele University

(Online Course, 2025)

This 3-day online summer school is designed to introduce the key components and uses of prognosis research to health professionals and researchers, including:

a framework of four different prognosis research questions: overall prognosis, prognostic factors, prognostic models, and stratified medicine
key principles of study design and methods
interpretation of statistical results about prognosis
the use of prognosis research evidence at multiple stages on the translational pathway toward improving patient outcome
the limitations of current prognosis research, and how the field can be improved

The course consists of lectures from a core faculty of epidemiologists, statisticians and clinical researchers, alongside group work and discussion sessions. Please note that no computer practicals are included; the focus is instead on the interpretation of statistical concepts and the results of analyses. Basic knowledge of epidemiology and statistics is assumed. The course is founded on the prognosis research framework introduced by the PROGRESS partnership, described in a series of four articles published in BMJ/PLoS Medicine in February 2013.

Faculty includes James Bailey (lead), Richard Riley, Danielle van der Windt, Joie Ensor, Lucinda Archer and Kym Snell

Dates for 2025 will be announced in due course

Statistical Methods for Meta-Analysis of Individual Participant Data

University of Birmingham

(online, 2025)

This three-day online course provides a detailed foundation of the methods and principles for meta-analysis when Individual Participant Data (IPD) are available from multiple related studies. The course considers continuous, binary and time-to-event outcomes, and covers a variety of modelling options, including fixed effect and random effects.

Days 1 and 2 mainly focus on the synthesis of IPD from randomised trials of interventions, where the aim is to summarise a treatment effect or to examine treatment-covariate interactions. We outline how to use either a two-stage framework (day 1) or a one-stage framework (day 2) for the meta-analysis, and compare their pros and cons. Day 3 focuses on novel extensions including multivariate and network meta-analysis of IPD to incorporate correlated and indirect evidence (e.g. from multiple outcomes or multiple treatment comparisons). Special topics will also be covered, including: (i) IPD meta-analysis to identify prognostic/risk factors; (ii) IPD meta-analysis of test accuracy studies; (iii) estimating the power of a planned IPD meta-analysis; and (iv) dealing with unavailable IPD. The course consists of a mixture of pre-recorded lectures, followed by practical sessions and live Q&A sessions to reinforce the underlying statistical concepts. Participants can choose either Stata or R for the practicals. The key messages are illustrated with real examples throughout the course.
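As a rough sketch of the second stage of a two-stage approach, the Python example below pools study-specific treatment effect estimates using inverse-variance weights, under both a fixed effect model and a random effects model with the DerSimonian-Laird estimate of between-study variance. It assumes the first stage (analysing the IPD within each study) has already produced an estimate and variance per study; the numbers and function names are hypothetical and are not taken from the course material.

```python
import math


def pool_fixed(estimates, variances):
    """Inverse-variance weighted average and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * t for w, t in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def dersimonian_laird_tau2(estimates, variances):
    """DerSimonian-Laird estimate of between-study variance, truncated at 0."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled, _ = pool_fixed(estimates, variances)
    q = sum(w * (t - pooled) ** 2 for w, t in zip(weights, estimates))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(estimates) - 1)) / c)


def pool_random(estimates, variances):
    """Random effects pooling: add tau^2 to each study variance, then
    apply inverse-variance weighting as before."""
    tau2 = dersimonian_laird_tau2(estimates, variances)
    pooled, se = pool_fixed(estimates, [v + tau2 for v in variances])
    return pooled, se, tau2


# Hypothetical first-stage results: log hazard ratios and their variances.
theta = [-0.30, -0.10, -0.20]
var = [0.04, 0.02, 0.04]

fixed_est, fixed_se = pool_fixed(theta, var)
random_est, random_se, tau2 = pool_random(theta, var)
print(f"Fixed effect:   {fixed_est:.3f} (SE {fixed_se:.3f})")
print(f"Random effects: {random_est:.3f} (SE {random_se:.3f}), tau^2 = {tau2:.3f}")
```

In this toy example the heterogeneity estimate is truncated at zero, so the two models give the same pooled result; with more variable study estimates the random effects standard error would be wider.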

The course is aimed at individuals who want to learn how to plan and undertake an IPD meta-analysis. We recommend that participants have a background in statistics, as the course assumes a good understanding of core statistical principles and topics, such as regression methods (e.g. linear, logistic and Cox), parameter estimation and interpreting software output. Familiarity with traditional aggregate data (non-IPD) meta-analysis methods is advantageous, though not essential. We also recommend that participants are familiar with Stata or R, although the practicals will not require individuals to write their own code. Participants must have access to their own copy of R or Stata for undertaking the practicals.

The course will be run online over three days using a combination of recorded lecture videos, computer practical exercises in Stata and R for participants to work through, and live question and answer sessions following each lecture/session. All code is already written and so participants can focus more on their understanding of methods and interpretation of results. There will also be opportunities to meet with the faculty to ask specific questions related to personal research queries and problems.

Ideally participants should undertake the course live (9am to 5pm UK time), but all course material (e.g. lecture videos and computer practicals) will be made available about 5 days in advance and for 2 weeks afterwards, to provide plenty of time and flexibility for participants to work through the material in their own time.

The following prices are available:

Student - £499

Academic - £599

Industry - £699

Dates for 2025 will be announced in due course

For enquiries please contact Prof Richard Riley.

Systematic Reviews of Prognosis Studies

(to be announced for 2025)

These courses explain how to define your review question, how to search the literature, how to critically assess the methodological quality of primary prognosis studies, and which statistical methods to use for their synthesis.

At the end of the courses, you'll be able to:

Explain the rationale for performing a systematic review of prognostic studies
List the key steps of a systematic review of prognostic studies
Formulate a focused review question addressing a prognostic problem
Systematically search the literature
Critically appraise the evidence from primary prognostic studies
Describe the difficulties of meta-analysis of prognostic research
Perform meta-analyses of the performance of prognostic models
Perform meta-analyses of the added value of specific prognostic factors

For more info, click here

Free interactive 60 min module

Also, the Cochrane Prognosis Methods Group have launched an online 60-minute interactive module on Systematic review of prognosis studies that can be followed for free via the Cochrane Training website. The module is intended for researchers and clinicians with an interest in reading a systematic review of prognosis studies, or as an introductory course for researchers planning to perform one.

Regression Modeling Strategies

4-day online course, 2025

Details here

Do you need

A statistical modeling tune-up or to learn about modern flexible methods for developing and validating predictive models?
To understand the advantages and disadvantages of machine learning relative to statistical models?
To know the importance of causal inference in formulating models?

The only full Regression Modeling Strategies 4-day course offered this year covers predictive models, validation, missing data, preserving information, measuring predictive accuracy, model specification, avoiding overfitting, the art of data analysis, comprehensive case studies, and more.

For more details of the course, click here

The RMS 4-day short course will be held as a virtual course in May 2024. This will be a very interactive live web course using Zoom, with registration fees that are significantly lower than those for the traditional yearly in-person course.

TAUGHT BY: Frank E. Harrell, Jr., Ph.D., Professor, Department of Biostatistics, Vanderbilt University School of Medicine, Instructor; and Drew G. Levy, Ph.D., GoodScience Inc., Guest Instructor and Moderator

TARGET AUDIENCE: Statisticians and other quantitative researchers who want to learn some general predictive model development strategies, including approaches to missing data imputation, data reduction, model specification and variable selection, model validation, relaxing linearity assumptions, and how to choose between machine learning and statistical models.

PRE-REQUISITES: Good working knowledge of ordinary multiple regression models. Some individuals will want to take the free Biostatistics for Biomedical Research course in preparation (especially sessions on regression).

Required pre-course study materials are here.
