The Bioproject IDs from NCBI would let us group samples by their parent project, which is useful. With the current fields it may be possible to infer project structure for a subset of samples (e.g. EMP500), but I don't think there is a general solution.
The good news is that the Bioproject XML file is currently only 1.8 GB:
https://ftp.ncbi.nlm.nih.gov/bioproject/
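A file of that size can be read in a streaming fashion rather than loaded whole. Below is a minimal sketch using the standard library's `xml.etree.ElementTree.iterparse`. The element names `Package` and `ArchiveID`, the attribute names, and the sample accessions are assumptions for illustration only and should be checked against the real file and its schema.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample; the element and attribute names are assumptions,
# not verified against the real Bioproject XML.
SAMPLE_XML = b"""<PackageSet>
  <Package><Project><Project><ProjectID>
    <ArchiveID accession="PRJNA0000001" id="1"/>
  </ProjectID></Project></Project></Package>
  <Package><Project><Project><ProjectID>
    <ArchiveID accession="PRJNA0000002" id="2"/>
  </ProjectID></Project></Project></Package>
</PackageSet>"""


def iter_project_ids(source, record_tag="Package", id_tag="ArchiveID"):
    """Yield the attributes of the first `id_tag` element in each record.

    Records are processed one at a time and dropped afterwards, so memory
    use stays roughly constant regardless of file size.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            archive = elem.find(f".//{id_tag}")
            if archive is not None:
                yield dict(archive.attrib)  # copy before the record is dropped
            root.clear()  # release records that were already processed


project_ids = list(iter_project_ids(io.BytesIO(SAMPLE_XML)))
for attrs in project_ids:
    print(attrs["accession"], attrs["id"])
```

`iter_project_ids` also accepts a filename, so the same function could be pointed at a local copy of the downloaded XML.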
In theory, a few fields from the TSV summary file could solve this. Some of them overlap with fields (and hopefully values) in the Biosample XML:
https://ftp.ncbi.nlm.nih.gov/bioproject/summary.txt
Organism Name, TaxID, Project Accession, Project ID, Project Type, Project Data Type
(fields in bold are new contributions from Bioproject xml)
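As a rough sketch of how the summary file could be used, the snippet below indexes rows by `Project Accession` so that a sample's Bioproject accession can be looked up. It assumes the file is tab-separated with a header row that uses the column names listed above; the sample rows are made up, and the real file may contain additional columns.

```python
import csv
import io

# Made-up rows; assumes a tab-separated file whose header row uses the
# column names listed above.
SAMPLE_TSV = (
    "Organism Name\tTaxID\tProject Accession\tProject ID\t"
    "Project Type\tProject Data Type\n"
    "soil metagenome\t410658\tPRJNA0000001\t1\tPrimary submission\tMetagenome\n"
    "marine metagenome\t408172\tPRJNA0000002\t2\tPrimary submission\tMetagenome\n"
)


def load_project_summary(handle):
    """Return a dict mapping each Project Accession to its full row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Project Accession"]: row for row in reader}


projects = load_project_summary(io.StringIO(SAMPLE_TSV))

# Look up the parent project of a sample by its Bioproject accession.
row = projects.get("PRJNA0000002")
if row is not None:
    print(row["Organism Name"], row["Project Type"], row["Project Data Type"])
```

Keying on the accession rather than the numeric `Project ID` is a design choice here, on the assumption that the accession is what the Biosample records carry.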
They also provide .xsd schemas for the Bioproject data; not sure whether that's useful:
https://ftp.ncbi.nlm.nih.gov/bioproject/Schema.v.1.2/