PhyloDeep

PhyloDeep is a Python library for parameter estimation and model selection from phylogenetic trees, based on deep learning.

Article

Voznica J, Zhukova A, Boskova V, Saulnier E, Lemoine F, Moslonka-Lefebvre M, Gascuel O. Deep learning from phylogenies to uncover the transmission dynamics of epidemics. Nat Commun 13, 3896 (2022)

Installation

Together with the phylodeep package code (in the folder phylodeep), we provide:

all data shown in the preprint (in the folder data_publication)
the simulators used to train the deep learners (in the folder simulators)
the tree analyzed in the study as a showcase application (in the folder test_tree_HIV_Zurich, for description and original reference see below).

The data are extensive (including 50,000 testing trees), so we do not recommend copying the whole repository.

Installing the package can take up to several minutes, including downloading dependencies. The run time should be a couple of seconds. The package was tested on Linux (Ubuntu 18.04), Windows 10 and MacOS.

PhyloDeep is available for Python>=3.8 on pip (see the installation instructions below).

Alternatively, it can be used via docker or apptainer: evolbioinfo/phylodeep.

Installation with Python3 on Windows

For Windows users, we recommend installing phylodeep via the Cygwin environment. First install Python>=3.8 and pip3 from the Cygwin packages. Then install phylodeep:

pip3 install phylodeep

Installation with Python3 on all other platforms

You can install phylodeep for Python (version 3.8 or higher) with or without conda, following the procedures described below:

Installing with conda

Once you have conda installed, create an environment for phylodeep with Python>=3.8 (here we name it phyloenv):

conda create --name phyloenv python=3.8

Then activate it:

conda activate phyloenv

Then install phylodeep in it:

pip install phylodeep

Installing without conda

Make sure that Python>=3.8 and pip3 are installed, then install phylodeep:

pip3 install phylodeep
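Before running pip, it can help to confirm that the interpreter meets the Python>=3.8 requirement stated above. The following is a minimal sketch using only the standard library; the helper name python_is_supported is hypothetical and not part of phylodeep.

```python
import sys

# Minimum version stated in the installation notes above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None):
    """Return True if the given (major, minor, ...) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "too old for phylodeep"
    print(f"Python {current}: {status}")
```

Run it with the same interpreter that pip3 belongs to, since several Python versions can coexist on one machine.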

Usage

If you installed phylodeep with conda, do not forget to activate the corresponding environment (e.g. phyloenv) before using PhyloDeep:

conda activate phyloenv

We recommend performing the a priori model adequacy check first, to assess whether the input data resemble the simulations on which the neural networks were trained.

Example data

Here, we use an HIV tree reconstructed from 200 sequences, published in "Phylodynamics on local sexual contact networks" by Rasmussen et al. [PLoS Comput. Biol. 2017], which you can find at PairTree GitHub and in test_tree_HIV_Zurich/Zurich.trees.

Python

from phylodeep import BD, BDEI, BDSS, FULL
from phylodeep.checkdeep import checkdeep
from phylodeep.modeldeep import modeldeep
from phylodeep.paramdeep import paramdeep


path_to_tree = './Zurich.trees'

# set presumed sampling probability
sampling_proba = 0.25

# a priori check for models BD, BDEI, BDSS
checkdeep(path_to_tree, model=BD, outputfile_png='BD_a_priori_check.png')
checkdeep(path_to_tree, model=BDEI, outputfile_png='BDEI_a_priori_check.png')
checkdeep(path_to_tree, model=BDSS, outputfile_png='BDSS_a_priori_check.png')


# model selection
model_BDEI_vs_BD_vs_BDSS = modeldeep(path_to_tree, sampling_proba, vector_representation=FULL)

# the selected model is BDSS

# parameter inference
param_BDSS = paramdeep(path_to_tree, sampling_proba, model=BDSS, vector_representation=FULL,
                       ci_computation=True)

# for the interpretation of results, please see below

Command line

# we use here a tree of 200 tips

# a priori model adequacy check: highly recommended
checkdeep -t ./Zurich.trees -m BD -o BD_model_adequacy.png
checkdeep -t ./Zurich.trees -m BDEI -o BDEI_model_adequacy.png
checkdeep -t ./Zurich.trees -m BDSS -o BDSS_model_adequacy.png

# model selection
modeldeep -t ./Zurich.trees -p 0.25 -v CNN_FULL_TREE -o model_selection.csv

# parameter inference
paramdeep -t ./Zurich.trees -p 0.25 -m BDSS -v FFNN_SUMSTATS -o HIV_Zurich_BDSS_FFNN.csv
paramdeep -t ./Zurich.trees -p 0.25 -m BDSS -v CNN_FULL_TREE -o HIV_Zurich_BDSS_CNN_CI.csv -c
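The three adequacy checks above differ only in the model name, so they can be generated in a loop when scripting. The sketch below only builds and prints the command lines, using the -t, -m and -o flags shown above; the helper name adequacy_check_commands is hypothetical, and actually running the commands (for example with subprocess.run) assumes that checkdeep is on your PATH.

```python
import shlex

# Models covered by the a priori adequacy check in the examples above.
MODELS = ("BD", "BDEI", "BDSS")


def adequacy_check_commands(tree_path, models=MODELS):
    """Build one checkdeep argument list per model (-t tree, -m model, -o png)."""
    return [
        ["checkdeep", "-t", tree_path, "-m", model, "-o", f"{model}_model_adequacy.png"]
        for model in models
    ]


for argv in adequacy_check_commands("./Zurich.trees"):
    # Print the command; pass argv to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```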

Apptainer

Once apptainer is installed, run the following command:

apptainer run docker://evolbioinfo/phylodeep

This will launch a terminal session within the container, in which you can run PhyloDeep following the instructions for the command line above.

Example of output and interpretations

The a priori model adequacy check results in the following figures:

BD model adequacy test

BDEI model adequacy test

BDSS model adequacy test

For all three models (BD, BDEI and BDSS), the HIV tree datapoint (represented by a red star) lies well inside the data cloud of simulations, where warm colors correspond to a high density of simulations. Both the simulations and the HIV tree datapoint were represented as summary statistics before PCA was applied. All three models thus pass the model adequacy check.

We then apply model selection using the full tree representation and obtain the following result:

Model	Probability BDEI	Probability BD	Probability BDSS
Predicted probability	0.00	0.00	1.00

The BDSS probability is by far the highest, so the BDSS model is confidently selected.
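Picking the model from such a table amounts to taking the entry with the highest predicted probability. The sketch below does this on the values copied from the table above; it works on a plain dictionary and makes no assumption about the type of object returned by modeldeep. The helper name select_model is hypothetical.

```python
# Probabilities copied from the model selection table above.
model_probabilities = {"BDEI": 0.00, "BD": 0.00, "BDSS": 1.00}


def select_model(probabilities):
    """Return the (model, probability) pair with the highest probability."""
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


best_model, best_probability = select_model(model_probabilities)
print(f"Selected model: {best_model} (probability {best_probability:.2f})")
```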

Finally, under the selected model BDSS, we predict parameter values together with 95% CIs:

	R naught	Infectious period	X transmission	Superspreading fraction
predicted value	1.69	9.78	9.34	0.079
CI 2.5%	1.40	8.12	6.65	0.050
CI 97.5%	2.08	12.26	10	0.133

The point estimates for the parameters that are not time related (R naught, X transmission and Superspreading fraction) are well inside the parameter ranges of the simulations and thus seem valid (R naught between 1 and 5, X transmission between 3 and 10, Superspreading fraction between 0.05 and 0.20).
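The range comparison described above can be written down explicitly. The sketch below uses the values from the results table and the simulation ranges quoted in the paragraph. The extra flag showing whether a CI bound reaches a range boundary is an illustrative addition, not something PhyloDeep reports, and the helper name check_against_ranges is hypothetical.

```python
# Simulation ranges quoted in the paragraph above, as (lower, upper) bounds.
SIMULATION_RANGES = {
    "R naught": (1.0, 5.0),
    "X transmission": (3.0, 10.0),
    "Superspreading fraction": (0.05, 0.20),
}

# Values copied from the results table above: (2.5% CI, point estimate, 97.5% CI).
estimates = {
    "R naught": (1.40, 1.69, 2.08),
    "X transmission": (6.65, 9.34, 10.0),
    "Superspreading fraction": (0.050, 0.079, 0.133),
}


def check_against_ranges(estimates, ranges):
    """Report, per parameter, whether the point estimate is strictly inside the
    simulation range and whether either CI bound reaches a range boundary."""
    report = {}
    for name, (ci_low, value, ci_high) in estimates.items():
        lower, upper = ranges[name]
        report[name] = {
            "inside": lower < value < upper,
            "ci_at_boundary": ci_low <= lower or ci_high >= upper,
        }
    return report


for name, result in check_against_ranges(estimates, SIMULATION_RANGES).items():
    print(name, result)
```

In this example all three point estimates fall inside their ranges, while the upper CI bound of X transmission and the lower CI bound of Superspreading fraction coincide with the limits of the simulated ranges.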

The time-related parameters (the infectious period and, for the BDEI model, the incubation period) are in the same units as the branch lengths of the input tree, here years (9.78 years). The parameter space covered for time-related parameters is large because all input trees are rescaled internally, so the method should apply to any tree.
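Because the time-related estimates inherit the unit of the tree's branch lengths, reporting them in another unit is a plain multiplication. The sketch below is illustrative only: the helper name convert_duration is hypothetical, and taking one year as 365.25 days is an assumption of this example.

```python
# Assumption for this sketch: one year is taken as 365.25 days.
DAYS_PER_UNIT = {"days": 1.0, "weeks": 7.0, "years": 365.25}


def convert_duration(value, from_unit, to_unit):
    """Convert a duration between the units listed in DAYS_PER_UNIT."""
    return value * DAYS_PER_UNIT[from_unit] / DAYS_PER_UNIT[to_unit]


# Infectious period from the results table above, in years (the tree's branch unit).
infectious_period_years = 9.78
print(f"{convert_duration(infectious_period_years, 'years', 'days'):.0f} days")
```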
