Subtype and Stage Inference, or SuStaIn, is an algorithm for discovery of data-driven groups or "subtypes" in chronic disorders. This repository is the Python implementation of SuStaIn, with the option to describe the subtype progression patterns using either the event-based model or the piecewise linear z-score model.
If you use pySuStaIn, please cite the core papers: the SuStaIn algorithm paper and the pySuStaIn software paper (both listed under Methods below).
Please also cite the corresponding progression pattern model you use:
- The piecewise linear z-score model (i.e. ZscoreSustain)
- The event-based model (i.e. MixtureSustain), with Gaussian mixture modelling or kernel density estimation.
Thanks a lot for supporting this project.
pip install git+https://github.com/ucl-pond/pySuStaIn
In the main pySuStaIn directory (where you see setup.py, README.txt, LICENSE.txt and all subfolders), run:
pip install .
This will install everything listed in requirements.txt, including the awkde package (used for mixture modelling). During the installation of awkde, an error may appear, but then the installation should continue and be successful. Note that you need pip version 18.1+ for this installation to work.
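As a quick sanity check before installing, the pip requirement can be tested from Python. This is a minimal sketch, not part of pySuStaIn: the helper names parse_version and pip_is_recent_enough are hypothetical, and the only fact taken from this README is the 18.1 minimum.

```python
import re

# Minimum pip version stated in this README.
MIN_PIP = (18, 1)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints.

    Hypothetical helper for illustration: "21.0.1" -> (21, 0, 1), "20.3b1" -> (20, 3).
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def pip_is_recent_enough(pip_version, minimum=MIN_PIP):
    """Return True if pip_version is at least the minimum (compared component-wise)."""
    return parse_version(pip_version) >= minimum


if __name__ == "__main__":
    try:
        import pip  # pip exposes its version as pip.__version__
    except ImportError:
        print("pip is not importable from this interpreter")
    else:
        ok = pip_is_recent_enough(pip.__version__)
        print(f"pip {pip.__version__}: {'OK' if ok else 'please upgrade pip'}")
```

If the check fails, upgrading pip (for example with python -m pip install --upgrade pip) before retrying the installation is the usual remedy.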
If the above install breaks, you may have some interfering packages installed. One way around this would be to create a new Anaconda environment that uses Python 3.7, then activate it and repeat the installation steps above. To do this, download and install Anaconda, then run:
conda create --name sustain_env python=3.7
conda activate sustain_env
These commands create and activate an environment named sustain_env.
- Python >= 3.7
- NumPy >= 1.18
- SciPy
- Matplotlib
- Scikit-learn for cross-validation
- kde_ebm for mixture modelling (KDE and GMM included)
- pathos for parallelization
- awkde for KDE mixture modelling
- Added parallelized startpoints
sustainType can be set to:
- mixture_GMM: SuStaIn with an event-based model progression pattern, with Gaussian mixture modelling of normal/abnormal.
- mixture_KDE: SuStaIn with an event-based model progression pattern, with Kernel Density Estimation (KDE) mixture modelling of normal/abnormal.
- zscore: SuStaIn with a piecewise linear z-score model progression pattern.
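The three options above can be validated up front in a script, so that a typo in sustainType fails early with a readable message. The sketch below is illustrative only: describe_sustain_type and SUSTAIN_TYPES are hypothetical names, not part of the pySuStaIn API, and only the three option strings come from this README.

```python
# The three sustainType values documented in this README, with short descriptions.
SUSTAIN_TYPES = {
    "mixture_GMM": "event-based model with Gaussian mixture modelling of normal/abnormal",
    "mixture_KDE": "event-based model with KDE mixture modelling of normal/abnormal",
    "zscore": "piecewise linear z-score model",
}


def describe_sustain_type(sustain_type):
    """Return a description of sustain_type, or raise ValueError if it is not recognised.

    Hypothetical helper for illustration; matching is case-sensitive.
    """
    try:
        return SUSTAIN_TYPES[sustain_type]
    except KeyError:
        valid = ", ".join(sorted(SUSTAIN_TYPES))
        raise ValueError(
            f"unknown sustainType {sustain_type!r}; expected one of: {valid}"
        ) from None


if __name__ == "__main__":
    for name in SUSTAIN_TYPES:
        print(f"{name}: {describe_sustain_type(name)}")
```

A check like this only guards the option string; how each option is then run is shown in simrun.py.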
See simrun.py for examples of how to run these different implementations.
See the Jupyter notebook in the notebooks folder for a tutorial on how to use SuStaIn with simulated data.
Methods:
- The SuStaIn algorithm: Young et al. 2018
- The pySuStaIn software paper: Aksman, Wijeratne et al. 2021
- The event-based model: Fonteijn et al. 2012 (with Gaussian mixture modelling: Young et al. 2014; or non-parametric kernel density estimation: Firth et al. 2020)
- The piecewise linear z-score model: Young et al. 2018
Applications:
- Multiple sclerosis (predicting treatment response): Eshaghi et al. 2021. The trained model is available here.
- Tau PET data in Alzheimer's disease: Vogel et al. 2021
- COPD: Young and Bragman et al. 2020
This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No. 666992. Application of SuStaIn to multiple sclerosis was supported by the International Progressive MS Alliance (IPMSA, award reference number PA-1603-08175).
(The authors) have also persuaded me that (SuStaIn is) as clever as e.g. Heiko Braak's brain, (and) can infer longitudinal trajectories based on cross-sectional observations.
- Anonymous reviewer