IDEARS - Integrated Disease Explanation and Associations Risk Scoring

Applies to the UKB datasets: UKB dementia, AD and PD classification, and SHAP

Overview

This is the codebase for IDEARs - Integrated Disease Explanation and Associations Risk Scoring. Its overall architecture is shown below.

[Figure: IDEARs architecture diagram]

How to Run

To ease configuration, please install Anaconda and set the project up in a virtual environment.

  1. Install Anaconda:

https://www.anaconda.com/products/individual

  2. Create the environment:

conda env create -f .\conda-env.yml

  3. Activate the environment:

conda activate conda-env

Then run startlocal_woDocker.bat on Windows, or startlocal_woDocker.sh on Linux.
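The platform choice above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `launcher_for` and `run_launcher` are hypothetical, and it assumes the two launcher scripts sit in the repository root and that `cmd` (Windows) or `bash` (Linux) is available on the PATH.

```python
import platform
import subprocess
from pathlib import Path

# Launcher scripts named in this README, keyed by platform.system() value.
LAUNCHERS = {
    "Windows": "startlocal_woDocker.bat",
    "Linux": "startlocal_woDocker.sh",
}


def launcher_for(system_name: str) -> str:
    """Return the launcher script name for the given operating system."""
    try:
        return LAUNCHERS[system_name]
    except KeyError:
        # The README only documents Windows and Linux.
        raise ValueError(f"No launcher documented for {system_name!r}") from None


def run_launcher(repo_root: str = ".") -> int:
    """Run the launcher for the current OS and return its exit code.

    Assumption: the scripts live directly under repo_root.
    """
    system_name = platform.system()
    script = Path(repo_root) / launcher_for(system_name)
    if system_name == "Windows":
        command = ["cmd", "/c", str(script)]
    else:
        command = ["bash", str(script)]
    return subprocess.run(command, check=False).returncode
```

Calling `run_launcher()` from the repository root, inside the activated conda environment, would start the same script you would otherwise run by hand.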

Codebase Structure

  • data_gen.py is used to perform ETL on the data and to create the model datasets
  • data_proc.py is used for extra data processing including the creation of normalised datasets
  • ml.py is used to run the models, including logistic regression and XGBoost, and for model interpretability using SHAP
  • analysis.py is used to create charts and perform extra statistical tests, including paired t-tests
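For readers unfamiliar with the paired t-test mentioned for analysis.py, here is a minimal sketch of the statistic using only the Python standard library. It is an illustration of the technique, not the code in analysis.py: the function name `paired_t_statistic` is hypothetical and the scores in the usage lines are invented placeholder values, not results from this project.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Each element of sample_a is paired with the element at the same
    position in sample_b (e.g. two models scored on the same folds).
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("Paired samples must have the same length")
    if len(sample_a) < 2:
        raise ValueError("At least two pairs are required")

    # The test is a one-sample t-test on the per-pair differences.
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("All differences are identical; t is undefined")

    n = len(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented placeholder scores, for illustration only.
scores_model_a = [0.81, 0.79, 0.84, 0.80, 0.83]
scores_model_b = [0.78, 0.77, 0.80, 0.79, 0.80]
t_value, dof = paired_t_statistic(scores_model_a, scores_model_b)
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```

This sketch stops at the t statistic; if SciPy is installed, `scipy.stats.ttest_rel` computes the same statistic together with a p-value.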

The Jupyter notebooks used for AD are:

  • AD_ml_part_1.ipynb
  • Master_ml.ipynb

Overview

Import modules etc.

Directory Tree and Explanations

This folder shows the implementation of the IDEARs platform.

Enquiries

Michael Allwright - [email protected]