Data files for "In Silico Functional Analysis of the Human, Chimpanzee, and Gorilla MHC-A Repertoires" (Kutler Dodd and Keșmir).
- FASTA files for all viruses used in the analysis are found in the Viral_sequences folder; those used in the SIV-specific analysis are found in the SIV_sequences folder.
- CSV files listing the virus dataset after progressive reduction at global alignment similarity thresholds are found in the Reduced_virus_lists folder. For example, viruses_reduced_95.csv contains the viruses that remain when viruses with >95% sequence similarity are grouped and each group is collapsed to one randomly selected virus.
- The human proteome-derived random peptides used to compute percentile rank scores are found in the random_peptides.peptide file.
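
The reduction behind the Reduced_virus_lists files can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that pairwise percent similarities are already available and that groups are formed by single linkage (any chain of pairs above the threshold ends up in one group); this description does not state how groups were actually built. The function name `collapse_similar`, the virus names, and the similarity values are hypothetical.

```python
import random
from itertools import combinations


def collapse_similar(names, similarity, threshold, seed=0):
    """Keep one randomly chosen member from each group of similar sequences.

    names      -- list of sequence identifiers
    similarity -- dict mapping frozenset({a, b}) to percent similarity
                  (hypothetical input; missing pairs count as 0)
    threshold  -- pairs with similarity strictly above this value are grouped
    seed       -- seed for the random choice, so the result is reproducible
    """
    # Union-find structure: every name starts as its own group.
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumption: single-linkage grouping of all pairs above the threshold.
    for a, b in combinations(names, 2):
        if similarity.get(frozenset((a, b)), 0.0) > threshold:
            parent[find(a)] = find(b)

    # Collect the members of each group.
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)

    # Randomly select one representative per group.
    rng = random.Random(seed)
    return sorted(rng.choice(members) for members in groups.values())


# Toy input (made-up values, not taken from the dataset).
names = ["virus_A", "virus_B", "virus_C", "virus_D"]
similarity = {
    frozenset(("virus_A", "virus_B")): 97.0,
    frozenset(("virus_B", "virus_C")): 96.0,
    frozenset(("virus_A", "virus_C")): 90.0,
}
kept = collapse_similar(names, similarity, threshold=95.0)
print(kept)
```

In this toy input, virus_A, virus_B, and virus_C fall into one group under single linkage, so two viruses remain: virus_D and one randomly chosen member of that group. A stricter grouping rule (for example, requiring every pair in a group to exceed the threshold) would give different groups.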